I have a priority problem in my grammar, and I have run out of ideas for fixing it.
I'm using Lark.
Here is the problem, simplified as much as I can:
from lark import Lark
parser = Lark(r"""
start: set | set_mul
set_mul: [nb] set
set: [nb] "foo"
nb: INT "x"
%import common.INT
%import common.WS
%ignore WS
""", start='start')
input = "3xfoo"
p = parser.parse(input)
print(p.pretty())
The output is:
start
  set_mul
    set
      nb 3
But what I want is:
start
  set_mul
    nb 3
    set
I tried adding priorities to my rules, but that did not work.
Do you have any idea what I would need to change to make it work?
Thanks
A simple solution might be to rewrite your grammar to remove the ambiguity, so that only set_mul can carry an nb prefix:
from lark import Lark
parser = Lark(r"""
start: set | set_mul
set_mul: nb | nb set | nb nb_set
set: "foo"
nb_set: nb set
nb: INT "x"
%import common.INT
%import common.WS
%ignore WS
""", start='start')
This way, each of the following inputs has only one possible interpretation:
input = "3xfoo"
p = parser.parse(input)
print(p.pretty())
input = "3x4xfoo"
p = parser.parse(input)
print(p.pretty())
Result:
start
  set_mul
    nb 3
    set

start
  set_mul
    nb 3
    nb_set
      nb 4
      set
This is not a full answer, but I hope it gets you part of the way. Your problem is that your grammar is ambiguous, and the example you use hits that ambiguity head-on. Lark chooses to disambiguate for you, and you get the result you see.
You can stop Lark from disambiguating by adding ambiguity='explicit':
import lark
parser = lark.Lark(r"""
start: set | set_mul
set_mul: [nb] set
set: [nb] "foo"
nb: INT "x"
%import common.INT
%import common.WS
%ignore WS
""", start='start',ambiguity='explicit')
input = "3xfoo"
p = parser.parse(input)
print(p.pretty())
and you get this output, which includes the tree you want:
_ambig
  start
    set
      nb 3
  start
    set_mul
      set
        nb 3
  start
    set_mul
      nb 3
      set
How can you encourage Lark to disambiguate to your preferred output? Good question.