How to make keywords recognizable in simpleparse?
I am trying to create a parser using simpleparse. I have defined the grammar as follows:
<w> := [ \n]*
statement_list := statement,(w,statement)?
statement := "MOVE",w,word,w,"TO",w,(word,w)+
word := [A-Za-z],[A-Za-z0-9]*,([-]+,[A-Za-z0-9]+)*
Now if I try to parse the line
MOVE ABC-DEF TO ABC
MOVE DDD TO XXX
The second statement is being interpreted as additional parameters of the first, which is obviously not what I want. I was able to get this working using pyparsing like this:
word = Word(alphas,alphanums+'-')
statement = "MOVE"+word+"TO"+word
statement_list = OneOrMore(statement.setResultsName('statement',True))
Is there a way to make this work in simpleparse as well?
EDIT: explanation below
I am not trying to achieve a linear grammar. This is what I would like to be able to parse:
Simple case
MOVE AA TO BB
More complex case
MOVE AA TO BB
CC DD
EE FF
Several of the above statements
MOVE AA TO BB
CC
MOVE CC TO EE
MOVE EE TO FF
GG
HH IIJJK
Currently, the grammar is ambiguous. As written, the parser cannot tell whether "MOVE A TO B MOVE C TO D" is two statements or one statement with some badly named targets.
You have two options:
- You explicitly make your word rule not match a reserved word. That is, you specifically disallow matching MOVE or TO. This is equivalent to saying "MOVE is not a valid parameter name." It also makes "MOVE TL TO TM TN TO" an error.
- You change your grammar so you can tell where a statement ends. You can add commas: "MOVE AA TO BB, CC MOVE TM TO TN, TO, TP". You can add semicolons or blank lines at the end of statements. You can require MOVE to be the least-indented token, as in Python.
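To make the first option concrete, here is a minimal sketch using only Python's standard `re` module rather than simpleparse itself. It keeps the shape of the question's word rule but adds a negative lookahead so that a word can never be exactly MOVE or TO. The names `KEYWORDS`, `WORD_BODY`, `NOT_KEYWORD`, `STATEMENT` and `parse_statements` are made up for this illustration.

```python
import re

KEYWORDS = ("MOVE", "TO")

# Same shape as the question's `word` rule:
# a letter, then alphanumerics, then optional hyphen-joined groups.
WORD_BODY = r"[A-Za-z][A-Za-z0-9]*(?:-+[A-Za-z0-9]+)*"

# Negative lookahead: refuse to start a word where a complete keyword sits.
# The inner lookahead keeps longer names such as TOTAL or MOVE-IT legal.
NOT_KEYWORD = r"(?!(?:%s)(?![A-Za-z0-9-]))" % "|".join(KEYWORDS)
WORD = NOT_KEYWORD + WORD_BODY

# statement := "MOVE", word, "TO", word+   (whitespace between tokens)
STATEMENT = re.compile(
    r"\s*MOVE\s+(?P<source>%s)\s+TO\s+(?P<targets>%s(?:\s+%s)*)"
    % (WORD, WORD, WORD)
)


def parse_statements(text):
    """Return a list of (source, [targets]); raise ValueError on leftover input."""
    pos = 0
    result = []
    while True:
        match = STATEMENT.match(text, pos)
        if match is None:
            break
        result.append((match.group("source"), match.group("targets").split()))
        pos = match.end()
    leftover = text[pos:].strip()
    if leftover:
        raise ValueError("cannot parse from offset %d: %r" % (pos, leftover))
    return result


if __name__ == "__main__":
    print(parse_statements("MOVE ABC-DEF TO ABC\nMOVE DDD TO XXX"))
```

Because the target list stops as soon as the next token is a keyword, the second MOVE starts a new statement instead of being swallowed as a parameter, and a dangling TO is reported as an error. simpleparse's EBNF has lookahead (`?`) and negation (`-`) prefixes that should let you put the same restriction directly into the `word` production, but check the simpleparse grammar documentation for the exact form; I have not verified it here.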