How to rewrite grammar to eliminate shift-shortening conflict (in Haskell Happy parser)
I am trying to define a grammar for methods (java like) using Happy LALR Parser Generator
1. MD ::= some_prefix { list(VD) list(S) }
2. VD ::= T I
3. S ::= I = E | I [ E ] = E | etc...
4. T ::= I | byte | int | etc...
5. E ::= INT | E + E | etc...
Here
- MD: Method Declaration
- VD: Variable declaration
- S: Statement
- T: Enter
- I: Identifier
- E: expression
All other tokens are terminals.
Inside the method, variable declarations are executed at the top, and after that, the declarations.
As you can see, VD can start with I, as there can be variable declarations of type class, where type is an identifier (I). The statement can also be started from i due to variable assignments and the variable name is I The problem is that VD and S can start with I. So in the first release it causes a shift / decrease conflict.
Is there a way to rewrite the grammar or any other parser generator tricks to solve this problem?
I have specified associativity and precedence for operators. I only mentioned a minimal set of information to explain the problem. Let me know if you need more information.
UPDATE:
Below is the grammar file
{
module Issue where
import Lexer
}
%name parser
%tokentype { Token }
%error { parseError }
%token
--===========================================================================
';' { TokenSemi }
id { TokenId $$ }
'{' { TokenLBrace }
'}' { TokenRBrace }
public { TokenPublickw }
'(' { TokenLParen }
')' { TokenRParen }
int { TokenInt $$ }
plus { TokenPlus }
inttype { TokenIntkw }
'=' { TokenAssign }
--===========================================================================
--===========================================================================
-- Precedence and associativity. Reference:
-- http://introcs.cs.princeton.edu/java/11precedence/
%right '='
%left plus
--===========================================================================
%%
MethodDecl :
public id '(' ')' '{' list(VarDecl) list(Statement) '}'
{ MethodDecl $2 (VarList $6) (BlockStatement $7) }
DataType :
inttype
{ DataType TypeInt }
| id
{ DataType (TypeClass $1) }
VarDecl :
DataType id ';'
{ VarDecl $1 $2 }
Statement :
id '=' Expression ';'
{ Assignment $1 $3 }
Expression :
int
{ IntLiteral $1 }
| Expression plus Expression
{ PlusExp $1 $3 }
--============================================================================
list1(p) :
p
{ [$1] }
| p list1(p)
{ $1 : $2 }
list(p) :
list1(p)
{ $1 }
| -- epsilon
{ [] }
--============================================================================
{
data AST = Goal AST [AST]
| BlockStatement [AST]
| IntLiteral Int
| PlusExp AST AST
| MethodDecl String AST AST
| DataType MJType
| Identifier String
| VarList [AST]
| VarDecl AST String
| Assignment String AST
deriving Show
data MJType = TypeInt
| TypeUnknown
| TypeClass String
deriving (Show,Eq)
parseError :: [Token] -> a
parseError (t:ts) = error ("Parser Error: " ++ (show t))
}
.info generated by batch Happy with details of conflicts and shift-shift states.
-----------------------------------------------------------------------------
Info file generated by Happy Version 1.19.4 from issue.y
-----------------------------------------------------------------------------
state 7 contains 1 shift/reduce conflicts.
state 9 contains 1 shift/reduce conflicts.
-----------------------------------------------------------------------------
Grammar
-----------------------------------------------------------------------------
%start_parser -> MethodDecl (0)
MethodDecl -> public id '(' ')' '{' list(VarDecl) list(Statement) '}' (1)
DataType -> inttype (2)
DataType -> id (3)
VarDecl -> DataType id ';' (4)
Statement -> id '=' Expression ';' (5)
Expression -> int (6)
Expression -> Expression plus Expression (7)
list(Statement) -> list1(Statement) (8)
list(Statement) -> (9)
list(VarDecl) -> list1(VarDecl) (10)
list(VarDecl) -> (11)
list1(Statement) -> Statement (12)
list1(Statement) -> Statement list1(Statement) (13)
list1(VarDecl) -> VarDecl (14)
list1(VarDecl) -> VarDecl list1(VarDecl) (15)
-----------------------------------------------------------------------------
Terminals
-----------------------------------------------------------------------------
';' { TokenSemi }
id { TokenId $$ }
'{' { TokenLBrace }
'}' { TokenRBrace }
public { TokenPublickw }
'(' { TokenLParen }
')' { TokenRParen }
int { TokenInt $$ }
plus { TokenPlus }
inttype { TokenIntkw }
'=' { TokenAssign }
-----------------------------------------------------------------------------
Non-terminals
-----------------------------------------------------------------------------
%start_parser rule 0
MethodDecl rule 1
DataType rules 2, 3
VarDecl rule 4
Statement rule 5
Expression rules 6, 7
list(Statement) rules 8, 9
list(VarDecl) rules 10, 11
list1(Statement) rules 12, 13
list1(VarDecl) rules 14, 15
-----------------------------------------------------------------------------
States
-----------------------------------------------------------------------------
State 0
public shift, and enter state 2
MethodDecl goto state 3
State 1
public shift, and enter state 2
State 2
MethodDecl -> public . id '(' ')' '{' list(VarDecl) list(Statement) '}' (rule 1)
id shift, and enter state 4
State 3
%start_parser -> MethodDecl . (rule 0)
%eof accept
State 4
MethodDecl -> public id . '(' ')' '{' list(VarDecl) list(Statement) '}' (rule 1)
'(' shift, and enter state 5
State 5
MethodDecl -> public id '(' . ')' '{' list(VarDecl) list(Statement) '}' (rule 1)
')' shift, and enter state 6
State 6
MethodDecl -> public id '(' ')' . '{' list(VarDecl) list(Statement) '}' (rule 1)
'{' shift, and enter state 7
State 7
MethodDecl -> public id '(' ')' '{' . list(VarDecl) list(Statement) '}' (rule 1)
id shift, and enter state 12
(reduce using rule 11)
'}' reduce using rule 11
inttype shift, and enter state 13
DataType goto state 8
VarDecl goto state 9
list(VarDecl) goto state 10
list1(VarDecl) goto state 11
State 8
VarDecl -> DataType . id ';' (rule 4)
id shift, and enter state 19
State 9
list1(VarDecl) -> VarDecl . (rule 14)
list1(VarDecl) -> VarDecl . list1(VarDecl) (rule 15)
id shift, and enter state 12
(reduce using rule 14)
'}' reduce using rule 14
inttype shift, and enter state 13
DataType goto state 8
VarDecl goto state 9
list1(VarDecl) goto state 18
State 10
MethodDecl -> public id '(' ')' '{' list(VarDecl) . list(Statement) '}' (rule 1)
id shift, and enter state 17
'}' reduce using rule 9
Statement goto state 14
list(Statement)goto state 15
list1(Statement)goto state 16
State 11
list(VarDecl) -> list1(VarDecl) . (rule 10)
id reduce using rule 10
'}' reduce using rule 10
State 12
DataType -> id . (rule 3)
id reduce using rule 3
State 13
DataType -> inttype . (rule 2)
id reduce using rule 2
State 14
list1(Statement) -> Statement . (rule 12)
list1(Statement) -> Statement . list1(Statement) (rule 13)
id shift, and enter state 17
'}' reduce using rule 12
Statement goto state 14
list1(Statement)goto state 23
State 15
MethodDecl -> public id '(' ')' '{' list(VarDecl) list(Statement) . '}' (rule 1)
'}' shift, and enter state 22
State 16
list(Statement) -> list1(Statement) . (rule 8)
'}' reduce using rule 8
State 17
Statement -> id . '=' Expression ';' (rule 5)
'=' shift, and enter state 21
State 18
list1(VarDecl) -> VarDecl list1(VarDecl) . (rule 15)
id reduce using rule 15
'}' reduce using rule 15
State 19
VarDecl -> DataType id . ';' (rule 4)
';' shift, and enter state 20
State 20
VarDecl -> DataType id ';' . (rule 4)
id reduce using rule 4
'}' reduce using rule 4
inttype reduce using rule 4
State 21
Statement -> id '=' . Expression ';' (rule 5)
int shift, and enter state 25
Expression goto state 24
State 22
MethodDecl -> public id '(' ')' '{' list(VarDecl) list(Statement) '}' . (rule 1)
%eof reduce using rule 1
State 23
list1(Statement) -> Statement list1(Statement) . (rule 13)
'}' reduce using rule 13
State 24
Statement -> id '=' Expression . ';' (rule 5)
Expression -> Expression . plus Expression (rule 7)
';' shift, and enter state 26
plus shift, and enter state 27
State 25
Expression -> int . (rule 6)
';' reduce using rule 6
plus reduce using rule 6
State 26
Statement -> id '=' Expression ';' . (rule 5)
id reduce using rule 5
'}' reduce using rule 5
State 27
Expression -> Expression plus . Expression (rule 7)
int shift, and enter state 25
Expression goto state 28
State 28
Expression -> Expression . plus Expression (rule 7)
Expression -> Expression plus Expression . (rule 7)
';' reduce using rule 7
plus reduce using rule 7
-----------------------------------------------------------------------------
Grammar Totals
-----------------------------------------------------------------------------
Number of rules: 16
Number of terminals: 11
Number of non-terminals: 10
Number of states: 29
source to share
I found a solution. Instead of using two different lists
list(VarDecl) list(Statement)
use one list
ordered_lists(VarDecl,Statements)
Below is the definition ordered_lists
.
--A list of p followed by a list of q, where each list can be an empty list
ordered_lists(p,q) :
ordered_lists1(p,q)
{ $1 }
| -- epsilon
{ ([], []) }
--This list should contain atleast one of p or q. A list of p followed by a
--list of q, where at most one list can be an empty list
ordered_lists1(p,q) :
p
{ ([$1], []) }
| q
{ ([], [$1]) }
| p ordered_lists1(p,q)
{ ($1:fst($2), snd($2)) }
| q list1(q)
{ ([], $1 : $2) }
The definition is list1
available in the question description. Please feel free to post a better answer.
source to share
At first glance, it seems like the conflict with decreasing shift might be an expression of grammar in 5.
Check out the grammar 2.3 Using Preprints in the Happy Manual. You can also use the classic approach:
E => E + T | T
T => T * F | F
F => INT | ( E )
source to share