Antlr4 matches whole input line or bust

I'm new to Antlr4 and have spent days puzzling over behavior that I just don't understand. I have the following combined grammar and expect it to fail and report an error, but it doesn't:

grammar MWE;
parse: cell EOF;
cell: WORD;
WORD: ('a'..'z')+;

      

If I feed it the input

a4

I expect parsing to fail, because I want the grammar to match the entire input string, not just part of it, as denoted by EOF. Instead, no error is reported (I am listening for errors with an error manager that implements the IAntlrErrorListener interface), and I get the following parse tree:

(parse (cell a) <EOF>)

      

Why is this?



1 answer


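When the lexer reaches input that does not match any lexer rule, its error recovery mechanism is to drop the offending character and continue from the next one. In your case, the lexer is discarding the character 4, so your parser sees the equivalent of this input:

a

The lexer does report the dropped character as a token recognition error, but only to error listeners registered on the lexer itself. If your listener is attached only to the parser, it never sees that error.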



The solution is to instruct the lexer to create a token for the unmatched character instead of dropping it, and to pass that token to the parser, where an error will be reported. In the grammar, the rule takes the following form and must always be the last rule, so that it only applies to a character that no other rule accepts. If you have multiple lexer modes, a rule of this form should appear as the last rule in the default mode as well as the last rule in each additional mode.

ErrChar
  : .
  ;
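
Putting the pieces together, a minimal sketch of the question's grammar with the catch-all rule appended might look like the following. The rule name ErrChar is arbitrary, and the behavior described afterwards assumes ANTLR 4's default error handling; the exact wording of the diagnostic may vary by runtime and version.

```antlr
grammar MWE;

parse: cell EOF;
cell: WORD;

WORD: ('a'..'z')+;

// Catch-all: keep this as the last lexer rule so that it only matches
// a single character that no earlier rule accepts.
ErrChar
  : .
  ;
```

With this grammar, the input a4 should lex as WORD, ErrChar, EOF. Because parse expects EOF immediately after cell, the parser should reject the ErrChar token and notify the error listeners attached to the parser (typically as extraneous input) instead of silently accepting the input.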

      
