Pyparsing using Forward to parse recursive expressions

I am trying to parse an expression recursively. I have followed several tutorials and it seems to me that Forward () is the class I need. However, something seemingly is just causing me problems.

Here is the code I wrote

from pyparsing import *
exp = Forward()
integer = Word(nums)
exp << (integer | (exp + '+' + exp))
input = "1+1"
print exp.parseString(input)

      

I want it to return ['1','+','1']

, but only returns['1']

Help is greatly appreciated.

+3


source to share


1 answer


There are several questions here. In ascending order of importance:

  • parseString will not throw an exception if there is additional text after the parsed content. Useexp.parseString(input, parseAll=True)

  • '|' it is MatchFirst, not MatchLongest. Since your whole is first, it will be matched first. Then the parser itself does not work for "+". If you want the longest match, use the "^" operator.
  • The Biggie: Once you convert to '^' (or reorder expressions to put exp + exp

    first, in front of an integer), you find yourself blowing up the maximum recursion depth. This is because this analyzer has a left recursive definition exp

    . That is, in order to analyze exp

    , it must analyze exp

    , for which it must analyze exp

    , etc. In general, many published BNFs use recursion to describe such a repeating structure, but pyparsing doesn't do the necessary lookahead / backtrack for this to work. Try it exp <<= integer + ZeroOrMore('+' + integer) | '(' + exp + ')'

    - this expression is not left-recursive, since exp

    you must pass the opening parenthesis before parsing the nested ones .

EDIT: Sorry, I was too quick in my previous sentence, here is the correct way to parse a recursive expression:

from pyparsing import *

exp = Forward()
LPAR, RPAR = map(Suppress, "()")
integer = Word(nums)
term = integer | Group(LPAR + exp + RPAR)
exp << term + ZeroOrMore('+' + term)

input = "(1+1) + 1"
print(exp.parseString(input))

      

prints



[['1', '+', '1'], '+', '1']

      

If you follow through the code, you'll see recursion: exp

defined with term

, and term

defined with exp

. An example fourFn.py

is close to this style; since writing this, I've added a method infixNotation

in pyparsing that will allow you to write:

exp = infixNotation(integer, [
                    ('+', 2, opAssoc.LEFT),
                    ])

      

infixNotation

performs internal recursive definitions, implicitly defines an expression, '(' + exp + ')'

and simplifies the implementation of an operator system with operation precedence.

0


source







All Articles