ANTLR String interpolation

Posted 2020-03-30 02:52

Question:

I'm trying to write an ANTLR grammar that parses string interpolation expressions such as:

my.greeting = "hello ${your.name}"

The error I get is:

line 1:31 token recognition error at: 'e'
line 1:34 no viable alternative at input '<EOF>'

MyParser.g4:

parser grammar MyParser;

options { tokenVocab=MyLexer; }

program: variable EQ expression EOF;
expression: (string | variable);
variable: (VAR DOT)? VAR;
string: (STRING_SEGMENT_END expression)* STRING_END;

MyLexer.g4:

lexer grammar MyLexer;

START_STR: '"' -> more, pushMode(STRING_MODE) ;
VAR: (UPPERCASE|LOWERCASE) ANY_CHAR*;
EQ: '=';
DOT: '.';

WHITE_SPACE: (SPACE | NEW_LINE | TAB)+ -> skip;

fragment DIGIT: '0'..'9';
fragment LOWERCASE: 'a'..'z';
fragment UPPERCASE: 'A'..'Z';
fragment ANY_CHAR: LOWERCASE | UPPERCASE | DIGIT;
fragment NEW_LINE: '\n' | '\r' | '\r\n';
fragment SPACE: ' ';
fragment TAB: '\t';

mode INTERPOLATION_MODE;

STRING_SEGMENT_START: '}' -> more, popMode;

mode STRING_MODE;

STRING_END: '"' -> popMode;
STRING_SEGMENT_END: '${' -> pushMode(INTERPOLATION_MODE);
TEXT : ~["$]+ -> more ;

Expressions like the following work fine:

my.greeting = "hello"
my.greeting = "hello ${} world"

Any ideas what I might be doing wrong?

Answer 1:

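Instead of duplicating the rules under new token names, which forces the parser rule to accept both sets of tokens: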

mode INTERPOLATION_MODE;

STRING_SEGMENT_START: '}' -> more, popMode;
I_VAR: (UPPERCASE|LOWERCASE) ANY_CHAR*;
I_DOT: '.';

...

variable: ((VAR|I_VAR) (DOT|I_DOT))? (VAR|I_VAR);

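you could try remapping the mode-specific rules to the existing token types with the type command, which leaves the parser rule unchanged: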

mode INTERPOLATION_MODE;

STRING_SEGMENT_START: '}' -> more, popMode;
I_VAR: (UPPERCASE|LOWERCASE) ANY_CHAR* -> type(VAR);
I_DOT: '.' -> type(DOT);

...

variable: (VAR DOT)? VAR;
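
For reference, here is a sketch of how the complete lexer might look with this change applied. It is the lexer from the question plus the two remapped rules in INTERPOLATION_MODE. The I_WHITE_SPACE rule is my own addition, on the assumption that whitespace inside ${...} should be ignored; it is not part of the original grammar. I have not run this against every input, so treat it as a starting point.

```antlr
lexer grammar MyLexer;

// Default mode: an opening quote starts a string and switches to STRING_MODE.
START_STR: '"' -> more, pushMode(STRING_MODE);
VAR: (UPPERCASE | LOWERCASE) ANY_CHAR*;
EQ: '=';
DOT: '.';

WHITE_SPACE: (SPACE | NEW_LINE | TAB)+ -> skip;

fragment DIGIT: '0'..'9';
fragment LOWERCASE: 'a'..'z';
fragment UPPERCASE: 'A'..'Z';
fragment ANY_CHAR: LOWERCASE | UPPERCASE | DIGIT;
fragment NEW_LINE: '\n' | '\r' | '\r\n';
fragment SPACE: ' ';
fragment TAB: '\t';

// Inside ${ ... }: only rules declared in this mode can match,
// so the variable and dot rules are repeated here.
mode INTERPOLATION_MODE;

STRING_SEGMENT_START: '}' -> more, popMode;
// Emit the same token types as the default mode so the parser needs no changes.
I_VAR: (UPPERCASE | LOWERCASE) ANY_CHAR* -> type(VAR);
I_DOT: '.' -> type(DOT);
// Assumption (not in the original): skip whitespace inside the interpolation.
I_WHITE_SPACE: (SPACE | NEW_LINE | TAB)+ -> skip;

// Inside a string literal.
mode STRING_MODE;

STRING_END: '"' -> popMode;
STRING_SEGMENT_END: '${' -> pushMode(INTERPOLATION_MODE);
TEXT: ~["$]+ -> more;
```

With this lexer, "hello ${your.name}" should come out as STRING_SEGMENT_END, VAR, DOT, VAR, STRING_END, which matches the string rule in MyParser.g4 as written in the question. A string nested inside ${...} would still not be recognized, because INTERPOLATION_MODE has no rule for the opening quote.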


Answer 2:

OK, I've worked out (inspired by this) that a lexer only matches the rules of the mode it is currently in, so I need to define the default-mode lexer rules again in INTERPOLATION_MODE:

MyLexer.g4:

...
mode INTERPOLATION_MODE;

STRING_SEGMENT_START: '}' -> more, popMode;
I_VAR: (UPPERCASE|LOWERCASE) ANY_CHAR*;
I_DOT: '.';

...

MyParser.g4:

...
variable: ((VAR|I_VAR) (DOT|I_DOT))? (VAR|I_VAR);
...

This seems like overkill, though, so I'm still holding out for someone with a better answer.



Answer 3:

String interpolation is also implemented in the existing C# and PHP grammars in the official ANTLR grammars repository.



Tags: antlr antlr4