Pyparsing using Forward to parse recursive expressions
I am trying to parse an expression recursively. I have followed several tutorials and it seems to me that Forward () is the class I need. However, something seemingly is just causing me problems.
Here is the code I wrote
from pyparsing import *
exp = Forward()
integer = Word(nums)
exp << (integer | (exp + '+' + exp))
input = "1+1"
print exp.parseString(input)
I want it to return ['1','+','1']
, but only returns['1']
Help is greatly appreciated.
There are several questions here. In ascending order of importance:
- parseString will not throw an exception if there is additional text after the parsed content. Use
exp.parseString(input, parseAll=True)
- '|' it is MatchFirst, not MatchLongest. Since your whole is first, it will be matched first. Then the parser itself does not work for "+". If you want the longest match, use the "^" operator.
- The Biggie: Once you convert to '^' (or reorder expressions to put
exp + exp
first, in front of an integer), you find yourself blowing up the maximum recursion depth. This is because this analyzer has a left recursive definitionexp
. That is, in order to analyzeexp
, it must analyzeexp
, for which it must analyzeexp
, etc. In general, many published BNFs use recursion to describe such a repeating structure, but pyparsing doesn't do the necessary lookahead / backtrack for this to work. Try itexp <<= integer + ZeroOrMore('+' + integer) | '(' + exp + ')'
- this expression is not left-recursive, sinceexp
you must pass the opening parenthesis before parsing the nested ones .
EDIT: Sorry, I was too quick in my previous sentence, here is the correct way to parse a recursive expression:
from pyparsing import *
exp = Forward()
LPAR, RPAR = map(Suppress, "()")
integer = Word(nums)
term = integer | Group(LPAR + exp + RPAR)
exp << term + ZeroOrMore('+' + term)
input = "(1+1) + 1"
print(exp.parseString(input))
prints
[['1', '+', '1'], '+', '1']
If you follow through the code, you'll see recursion: exp
defined with term
, and term
defined with exp
. An example fourFn.py
is close to this style; since writing this, I've added a method infixNotation
in pyparsing that will allow you to write:
exp = infixNotation(integer, [
('+', 2, opAssoc.LEFT),
])
infixNotation
performs internal recursive definitions, implicitly defines an expression, '(' + exp + ')'
and simplifies the implementation of an operator system with operation precedence.