Note: this is a more detailed version of Recursive Descent precedence parsing missing prefix expression
I'm building a simple language parser, and having an issue with lower precedence prefix expressions. Here's an example grammar:
E = E8
E8 = E7 'OR' E8 | E7
E7 = E6 'XOR' E7 | E6
E6 = E5 'AND' E6 | E5
E5 = 'NOT' E5 | E4
E4 = E3 '==' E4 | E3 '!=' E4 | E3
E3 = E2 '<' E3 | E2 '>' E3 | E2
E2 = E1 '+' E2 | E1 '-' E2 | E1 '*' E2 | E1 '+' E2 | E1
E1 = '(' E ')' | 'true' | 'false' | '0'..'9'
However, this grammar doesn't work correctly for the NOT, if it's used as the RHS of a higher precedence infix operator, i.e.:
true == NOT false
This is due to the == operator requiring E3
on the RHS, which cannot be a 'NOT' operation.
I'm unsure the correct way to express this grammar? Is it still possible using this simplistic recursive descent approach, or will I need to move to a more featured algorithm (shunting yard or precedence climbing).
Here are some examples that would need to parse correctly:
- input
true == 1 < 2
, output==(true, <(1, 2))
- input
1 < 2 == true
, output==(<(1, 2), true)
- input
NOT true == false
, outputNOT(==(true, false))
- input
true == NOT false
, output==(true, NOT(false))
** doesn't work - input
true < NOT false
, output<(true, NOT(false))
** doesn't work
I have attempted to alter the levels E4
, E3
, and E2
to use E5
on the RHS of the infix expression, as suggested in Recursive Descent precedence parsing missing prefix expression (i.e. E3 '==' E5
, E3 '<' E5
, etc). However this breaks the precedence between these levels, i.e. true == 1 < 2
would be incorrectly parsed as
<(==(true, 1), 2)`.
When sticking to the way your language is defined, you cannot have
as a valid term in your language. Because then
would be ambigous: the parse tree could be either
or
Note that
is a valid term in your language. A probably more intuitive definition of your language would be to put the
NOT
-level fromE5
down toE2
. Thenare both valid and
NOT
binds withfalse
. And the alternative meaning of the second expression would be expressed asIf these options still do not satisfy you, you have to change/extend the tool. E.g. the yacc/bison parser allows to explicitely define operator precedences; see e.g. here