I have a small custom scripting language, and I am trying to update it to allow boolean expressions such as a > 2
and a > 2 and (b < 3 or c > 5)
. It's the parenthetical expressions that I am having trouble with here.
Here is a (edited since the original post based on the answer from @Bart Kiers) full grammar that exhibits the problem. This is a pared-down version of my actual grammar, but the problem occurs here too.
grammar test;
options {
language = 'JavaScript';
output = AST;
}
statement
: value_assignment_statement
EOF
;
value_assignment_statement
: IDENT
'='
expression
;
value_expression
: value_list_expression
| IDENT
;
value_list_expression
: value_enumerated_list
;
value_enumerated_list : '{' unary+ '}'
;
term
: LPAREN expression RPAREN
| INTEGER
| value_expression
;
unary : ( '+' | '-' )* term
;
mult : unary ( ('*' | '/') unary)*
;
expression : mult ( ('+' | '-') mult )*
;
boolean
: boolean_expression
EOF
;
boolean_expression
: boolean_or_expression
;
boolean_or_expression
: boolean_and_expression (OR boolean_and_expression)*
;
boolean_and_expression
: boolean_rel_expression (AND boolean_rel_expression)*
;
boolean_rel_expression
: boolean_neg_expression relational_operator boolean_neg_expression
;
boolean_neg_expression
: (NOT)? atom
;
atom
: LPAREN boolean_expression RPAREN
//| expression
;
relational_operator : '=' | '>' | '<';
LPAREN : '(';
RPAREN : ')';
AND : 'and';
OR : 'or';
NOT : 'not';
IDENT : LETTER LETTER+;
INTEGER : DIGIT+;
WS : (' ' | '\n' | '\r' | '\t')+ { $channel = HIDDEN; };
fragment DIGIT : '0'..'9';
fragment LETTER : ('a'..'z' | 'A'..'Z');
My attempt to accommodate parenthetical boolean expressions such as a > 2 or (b < 3)
is in the commented-out line in the atom
rule. When I uncomment this line and include it in the grammar, ANTLR gives me this error:
[fatal] rule atom has non-LL(*) decision due to recursive rule invocations reachable from alts 1,2. Resolve by left-factoring or using syntactic predicates or using backtrack=true option.
I would like to address this by removing the recursion, but I can't seem to make the transition from the Wikipedia description on how to remove left recursion to my own stuff.
In using this grammar, I want sometimes to use statement
as a root with input such as abc = 2 + 3
, which assigns a value to a variable named abc. Other times I want to use the grammar to evaluate an expression with boolean
as the root with input such as abc > 3 and (xyz < 5 or xyz > 10)
. When I tried to use @Bart's answer as a model, it worked fine until I tried to merge the parts of the grammar used by statement
with the parts used by boolean
. They should both be able to use an expression
, but that's where I'm stuck with this left recursion error.
So, how can I both handle parentheses and avoid the left recursion problem?
Boolean expressions are just the same as the additive- and multiplicative expression, and should therefore not be separated from them. Here's how to account for all types of expressions:
which will parse the example input:
into the following parse tree: