I have a grammar file BoardFile.g4 that has (relevant parts only):
grammar Board;
//Tokens
GADGET : 'squareBumper' | 'circleBumper' | 'triangleBumper' | 'leftFlipper' | 'rightFlipper' | 'absorber' | 'portal' ;
NAME : [A-Za-z_][A-Za-z_0-9]* ;
INT : [0-9]+ ;
FLOAT : '-'?[0-9]+('.'[0-9]+)? ;
COMMENT : '#' ~( '\r' | '\n' )*;
WHITESPACE : [ \t\r\n]+ -> skip ;
KEY : [a-z] | [0-9] | 'shift' | 'ctrl' | 'alt' | 'meta' | 'space' | 'left' | 'right' | 'up' | 'down' | 'minus' | 'equals' | 'backspace' | 'openbracket' | 'closebracket' | 'backslash' | 'semicolon' | 'quote' | 'enter' | 'comma' | 'period' | 'slash' ;
KEYPRESS : 'keyup' | 'keydown' ;
//Rules
file : define+ EOF ;
define : board | ball | gadget | fire | COMMENT | key ;
board : 'board' 'name' '=' name ('gravity' '=' gravity)? ('friction1' '=' friction1)? ('friction2' '=' friction2)? ;
ball : 'ball' 'name' '=' name 'x' '=' xfloat 'y' '=' yfloat 'xVelocity' '=' xvel 'yVelocity' '=' yvel ;
gadget : gadgettype 'name' '=' name 'x' '=' xint 'y' '=' yint ('width' '=' width 'height' '=' height)? ('orientation' '=' orientation)? ('otherBoard' '=' name 'otherPortal' '=' name)? ;
fire : 'fire' 'trigger' '=' trigger 'action' '=' action ;
key : keytype 'key' '=' KEY 'action' '=' name ;
name : NAME ;
gadgettype : GADGET ;
keytype : KEYPRESS ;
gravity : FLOAT ;
friction1 : FLOAT ;
friction2 : FLOAT ;
trigger : NAME ;
action : NAME ;
yfloat : FLOAT ;
xfloat : FLOAT ;
yint : INT ;
xint : INT ;
xvel : FLOAT ;
yvel : FLOAT ;
orientation : INT ;
width : INT ;
height : INT ;
This generates the lexer and parser fine. However, when I use it against the following file, it gives the following error:
line 12:0 extraneous input 'keyup' expecting {<EOF>, KEYPRESS}
File to Parse:
board name=keysBoard gravity=5.0 friction1=0.0 friction2=0.0
# define a ball
ball name=Ball x=0.5 y=0.5 xVelocity=2.5 yVelocity=2.5
# add some flippers
leftFlipper name=FlipL1 x=16 y=2 orientation=0
leftFlipper name=FlipL2 x=16 y=9 orientation=0
# add keys. lots of keys.
keyup key=space action=apple
keydown key=a action=ball
keyup key=backslash action=cat
keydown key=period action=dog
I went through other questions about this error in SO but none helped me. I cannot figure out what's going wrong. Why am I getting this error?
The string "keyup" is being tokenized as a NAME token: that is the problem.
You must realize that the lexer operates independently from the parser. If the parser is trying to match a KEYPRESS token, the lexer does not "listen" to it, but just constructs tokens following two rules: it prefers the lexer rule that matches the most characters, and when several rules match the same number of characters, it picks the one that is defined first in the grammar.
Taking these rules into account, and the order of your rules: a NAME token will be created instead of most of the KEY alternatives and all of the KEYPRESS alternatives, because NAME matches the same text and is defined earlier.
And since INT matches one or more digits and is defined before KEY, which also has a single-digit alternative, it is clear that the lexer will never produce a KEY or KEYPRESS token.
If you move the NAME and INT rules below the KEY and KEYPRESS rules, then most of the tokens will be constructed as you expect, is my guess.
EDIT
A possible solution would look like:
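A minimal sketch of such lexer rules, written from the description in this answer rather than taken from a tested grammar. It assumes the reordering suggested above (KEYPRESS and KEY before NAME), and that GADGET, FLOAT, COMMENT and WHITESPACE stay exactly as in the question:

```antlr
// Keyword-like tokens come first, so they win ties against NAME.
KEYPRESS : 'keyup' | 'keydown' ;

// Same alternatives as in the question, minus the [0-9] alternative.
KEY : [a-z] | 'shift' | 'ctrl' | 'alt' | 'meta' | 'space' | 'left' | 'right'
    | 'up' | 'down' | 'minus' | 'equals' | 'backspace' | 'openbracket'
    | 'closebracket' | 'backslash' | 'semicolon' | 'quote' | 'enter'
    | 'comma' | 'period' | 'slash' ;

// A single digit gets its own token; it is defined before INT so that
// it wins the tie when the input is exactly one digit.
SINGLE_DIGIT : [0-9] ;
INT : [0-9]+ ;

// NAME is defined after the keyword-like rules, so it only matches
// text that none of them match at the same length.
NAME : [A-Za-z_][A-Za-z_0-9]* ;
```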
I.e. I removed the [0-9] alternative from KEY and introduced a SINGLE_DIGIT rule (which is placed before the INT rule!).
Now create some extra parser rules:
and change all occurrences of INT inside parser rules to integer (don't call your rule int: it is a reserved word) and change all KEY to key (the grammar in the question already has a parser rule called key, so give the new rule a different name there).
And you might also want to do something similar for NAME and the [a-z] alternative in KEY (i.e. a single lowercase char would now never be tokenized as a NAME, always as a KEY).