“FOLLOW_set_in_”… is undefined in generated parser

2019-04-25 06:27发布

问题:

I have written a grammar for vaguely Java-like DSL. While there are still some issues with it (it doesn't recognize all the inputs as I would want it to), what concerns me most is that the generated C code is not compilable.

I use AntlrWorks 1.5 with Antlr 3.5 (Antlr 4 apparently does not support C target).

The problem is with expression rules. I have rules prio14Expression to prio0Expression which handle operator precedence. To problem is at priority 2, which evaluates prefix and postfix operators:

...
prio3Expression: prio2Expression (('*' | '/' | '%') prio2Expression)*;

prio2Expression: ('++' | '--' | '!' | '+' | '-')* prio1Expression ('++' | '--')*;  

prio1Expression:
    prio0Expression (
        ('.' prio0Expression) |
        ('(' (expression (',' expression)*)? ')') |
        ('[' expression (',' expression)* ']')
    )*;

prio0Expression: 
    /*('(') => */('(' expression ')') |
    IDENTIFIER |
    //collectionLiteral |
    coordinateLiteral |
    'true' |
    'false' |
    NUMBER |
    STRING 
    ;
...

Expression is a label for prio14Expression. You can see the full grammar here.

The code generation itself is successful (without any errors or serious warnings). It generates following code:

CONSTRUCTEX();
EXCEPTION->type         = ANTLR3_MISMATCHED_SET_EXCEPTION;
EXCEPTION->name         = (void *)ANTLR3_MISMATCHED_SET_NAME;
EXCEPTION->expectingSet = &FOLLOW_set_in_prio2Expression962;

RECOVERFROMMISMATCHEDSET(&FOLLOW_set_in_prio2Expression962);
goto ruleprio2ExpressionEx;

Which does not build with error "Error 5 error C2065: 'FOLLOW_set_in_prio2Expression962' : undeclared identifier".

Did I do something wrong in the grammar? No other rules cause this error and if I somewhat reformulate the rule concerned, the generated code is valid (but then the grammar doesn't do what I want it to). What can I do to fix this issue?

Thanks for any help.

回答1:

I encountered same problem.

I think it happens if parser rule has a part of simple OR-ed token like this:

problem_case: problematic_rule;
problematic_rule: 'A' | 'B' ;

This doesn't happen if it is lexer rule.

workaround1: As_lexer_rule;
As_lexer_rule: 'A' | 'B' ;

Or, if it is complicated rule (not simple OR-ed token).

workaround2: make_it_complicated_needlessly;
make_it_complicated_needlessly: 'A' | 'B' | {false}? NeverUsedRule;
NeverUsedRule: /* don't care*/ ;

( I used semantic predicate "{false}?" for this modification. I believe it doesn't change the grammar of target language.)



回答2:

it seems to be an old post, but yet, maybe it's still useful for someone (as it was for me).

I encountered the same problem with the C runtime of antlr 3.5.

another easy workaround, that does not change the grammar:

problem_case: problematic_rule;
problematic_rule: a='A' | b='B' ;