I have the following tokens, plus many more that I am leaving out to keep the question short:
tokens = (
    'COMMA',
    'OP',
    'FUNC1',
    'FUNC2'
)
def t_OP(t):
    r'&|-|\||,'
    return t
def t_FUNC1(t):
    r'FUNC1'
    return t
def t_FUNC2(t):
    r'FUNC2'
    return t
Other methods:
def FUNC1(param):
    return {'a','b','c','d'}
def FUNC2(param,expression_result):
    return {'a','b','c','d'}
My grammar rules in yacc are below (there are a few more, but these are the important ones):
'expression : expression OP expression'
'expression : LPAREN expression RPAREN'
'expression : FUNC1 LPAREN PARAM RPAREN'
'expression : FUNC2 LPAREN PARAM COMMA expression RPAREN'
'expression : SET_ITEM'
These are the methods in my yacc.py that relate to the issue:
def p_expr_op_expr(p):
    'expression : expression OP expression'
    if p[2] == '|' or p[2] == ',':
        p[0] = p[1] | p[3]
    elif p[2] == '&':
        p[0] = p[1] & p[3]
    elif p[2] == '-':
        p[0] = p[1] - p[3]
def p_expr_func1(p):
    'expression : FUNC1 LPAREN PARAM RPAREN'
    Param = p[3]
    # FUNC1 is the helper listed under "Other methods"
    Result = FUNC1(Param)
    p[0] = Result
def p_expr_func2(p):
    'expression : FUNC2 LPAREN PARAM COMMA expression RPAREN'
    Param = p[3]
    expression_result = p[5]
    # FUNC2 is the helper listed under "Other methods"
    Result = FUNC2(Param,expression_result)
    p[0] = Result
def p_expr_set_item(p):
    'expression : SET_ITEM'
    p[0] = {p[1]}
The issue is this. If I give the following input expression to this grammar:
FUNC1("foo"),bar
it works properly and gives me the union of the set returned by FUNC1("foo") and the set {bar}, i.e. {a,b,c,d} | {bar}.
But the following input expression gives a syntax error at ',' and at ')' (my parentheses are defined as tokens, in case anyone suspects they are missing):
FUNC2("foo", FUNC1("foo"),bar)
As I understand it, this expression should match the production rule 'expression : FUNC2 LPAREN PARAM COMMA expression RPAREN',
so everything after the first comma should be treated as an expression, which in turn should match 'expression : expression OP expression'
and perform the union when the comma is encountered as an operator.
If the comma could not be used as an operator, I would expect FUNC1("foo"),bar to fail
as well, yet that input works.
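A quick way to see which token type each comma receives is to imitate the lexer with the standard library. The sketch below is only a stand-in, not my real lexer: the OP, FUNC1 and FUNC2 patterns are copied from the rules above, while the COMMA, LPAREN, RPAREN, PARAM and SET_ITEM patterns and the helper name token_types are assumptions made for illustration. It tries the patterns in the listed order, which as far as I understand mirrors how PLY orders rules defined as functions.

```python
import re

# Stand-in token table. OP, FUNC1 and FUNC2 are copied from the question;
# every other pattern is an assumption, since those rules are not shown.
TOKEN_SPEC = [
    ("OP", r'&|-|\||,'),
    ("FUNC1", r'FUNC1'),
    ("FUNC2", r'FUNC2'),
    ("COMMA", r','),                          # assumed definition
    ("LPAREN", r'\('),                        # assumed definition
    ("RPAREN", r'\)'),                        # assumed definition
    ("PARAM", r'"[^"]*"'),                    # assumed definition
    ("SET_ITEM", r'[A-Za-z_][A-Za-z0-9_]*'),  # assumed definition
    ("SKIP", r'\s+'),                         # whitespace, dropped below
]

# One master pattern; alternatives are tried left to right, so OP is
# tested before COMMA for every ',' in the input.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def token_types(text):
    """Return the token type of every non-whitespace token in text."""
    types = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"illegal character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            types.append(match.lastgroup)
        pos = match.end()
    return types


for source in ('FUNC1("foo"),bar', 'FUNC2("foo", FUNC1("foo"),bar)'):
    print(source, "->", token_types(source))
```

With these assumed patterns, every comma in both inputs is reported as OP and none as COMMA, so a rule that expects a COMMA token would never receive one. Whether the real lexer behaves the same way depends on how and where its COMMA rule is defined.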
I know I can fix this by removing ',' from t_OP(t) and adding one more production rule, 'expression : expression COMMA expression'.
The method for this rule would look like this:
def p_expr_comma_expr(p):
    'expression : expression COMMA expression'
    p[0] = p[1] | p[3]
I'm reluctant to include this rule because it introduces 4 shift/reduce conflicts.
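For context, conflicts between binary-operator rules like these are normally settled in PLY with a module-level precedence table in the parser file, listed from lowest to highest precedence. The table below is a hypothetical sketch, not something from my code: placing COMMA on the same level as OP, and making both left-associative, are assumptions. The two set expressions after it only illustrate why the grouping matters for '-': left grouping and the right grouping produced by always shifting give different results.

```python
# Hypothetical precedence table for yacc.py (an assumption, not existing code).
# Each entry is (associativity, token, token, ...), lowest precedence first.
precedence = (
    ('left', 'COMMA', 'OP'),
)

# Why associativity matters for set difference:
left_grouped = ({'a', 'b'} - {'b'}) - {'b'}    # grouping chosen by 'left'
right_grouped = {'a', 'b'} - ({'b'} - {'b'})   # grouping produced by always shifting

print(sorted(left_grouped))
print(sorted(right_grouped))
```

I have not verified that this removes all four conflicts in my full grammar, since the remaining rules are not shown here.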
I really want to understand why the parse succeeds in one case but not in the other, and what the right way is to treat ',' as an operator.
Thanks