Omit remaining input in Happy (parser generator for Haskell)
According to Pascal's grammar, a program ends with a dot. If anything follows that dot, Free Pascal (FPC / Lazarus) ignores the remaining characters.
I need similar behavior. I am using a custom monadic tokenizer and it is lazy, so I just want Happy not to invoke a continuation when the main rule succeeds.
Essentially, I would like it to be like this:
Program : Header Decls Body '.' SKIP_THE_REMAINING_INPUT { ... }
It is important that no tokenization occurs at all after this final dot has been parsed, because tokenizing the remaining input can cause errors.
So I found a solution.
Happy documents a feature called partial parsing, although I found it by reading the git log of the original repository. It allows the parser to discard the remaining input. It is declared with a directive other than %name:
%name parser {- normal parser -}
%partial parser {- partial parser -}
But the way it works doesn't match my second requirement: it shouldn't force the lazy tokenizer to consume the input further. Instead, it requires exactly one more token to verify that there is nothing more to parse.
Suppose ! is not a valid character, so the tokenizer cannot lex it, and consider the following inputs:
(1) begin end. valid_token!!!
(2) begin end.!
Parsing (1) succeeds because Happy checks valid_token and stops there, but parsing (2) fails because one more token is needed (and the tokenizer cannot produce one).
There seems to be no way to change this behavior, so my workaround is to represent a lexical error as a special token that does not appear anywhere in the grammar. When the tokenizer encounters ! (or any other invalid character), it emits this special error token instead of failing. This should also help with recovering from lexical errors.
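To make the workaround concrete, here is a minimal sketch of such a tokenizer. It is not my actual code: the Token type, tokenize, keyword and seenByPartialParser are all hypothetical names, and for brevity it produces a lazy list rather than running in a monad. The point is only that an invalid character becomes an ordinary TokError value, so the one token of lookahead that the partial parser asks for can always be supplied.

```haskell
import Data.Char (isAlpha, isAlphaNum, isSpace)

-- Hypothetical token type. TokError is the special token that
-- no rule in the grammar mentions.
data Token
  = TokBegin
  | TokEnd
  | TokDot
  | TokIdent String
  | TokError Char   -- a lexical error carried as data instead of raised
  | TokEOF
  deriving (Eq, Show)

-- Lazy tokenizer that never fails: invalid characters become TokError.
tokenize :: String -> [Token]
tokenize [] = [TokEOF]
tokenize (c:cs)
  | isSpace c = tokenize cs
  | c == '.'  = TokDot : tokenize cs
  | isAlpha c =
      let (rest, cs') = span isIdentChar cs
      in keyword (c : rest) : tokenize cs'
  | otherwise = TokError c : tokenize cs
  where
    isIdentChar x = isAlphaNum x || x == '_'

-- Only the two keywords used in the example inputs are recognised.
keyword :: String -> Token
keyword "begin" = TokBegin
keyword "end"   = TokEnd
keyword s       = TokIdent s

-- Simulates what a partial parser forces in this toy setting:
-- every token up to the first TokDot, plus one token of lookahead.
seenByPartialParser :: [Token] -> [Token]
seenByPartialParser ts =
  let (before, rest) = break (== TokDot) ts
  in before ++ take 2 rest
```

With this sketch, input (2) yields TokBegin, TokEnd, TokDot and then TokError '!' as the lookahead token, so the tokenizer itself no longer fails. Because the list is lazy, the two further ! characters in input (1) are never examined once valid_token has been produced.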