-->

How to resolve conflict between two choices starti

Posted 2019-08-14 17:28

Question:

I'm trying to write a compiler for a specific message format. Simplified, my problem is:

< WORD : (<LETTER>){2,5}>
< ANOTHER_WORD : (<LETTER>|<DIGIT>){1,5}>
< SPECIAL_WORD : "START">

void grammar():
{
}
{ 
 <WORD><ANOTHER_WORD>
| <SPECIAL_WORD><ANOTHER_WORD>
}

Here my special word is always matched as a WORD, which is logical of course, but since the conflict is at the beginning of the production I don't know how to resolve it. Some help would be appreciated.

Answer 1:

Put the rule for START first. Like most lexical scanner generators, JavaCC selects the longest possible token match, and then, if two or more patterns match that same longest string, it chooses the one that appears first in the grammar file.

As a result, your ANOTHER_WORD rule will only match if WORD doesn't, so it will only match words of length 1 or words that contain a digit.
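
A minimal sketch of that reordering might look like the following. The question does not show how LETTER and DIGIT are defined, so the two private (`#`) definitions here are assumptions for illustration only.

```
TOKEN :
{
  // Listed first, so it wins the tie when "START" also matches WORD.
  < SPECIAL_WORD : "START" >
| < WORD : (<LETTER>){2,5} >
  // Chosen only when no earlier rule matches the same longest string.
| < ANOTHER_WORD : (<LETTER> | <DIGIT>){1,5} >
| < #LETTER : ["a"-"z", "A"-"Z"] >   // assumed definition
| < #DIGIT : ["0"-"9"] >             // assumed definition
}
```

With this order, the input START should be tokenized as SPECIAL_WORD, while an input such as HELLO is still a WORD, because SPECIAL_WORD does not match it and WORD precedes ANOTHER_WORD.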

It appears that you expect the parser state to affect how lexical tokens are recognized. That's not how lexical scanners work, in general, but you can implement a limited form of contextual scanning by using lexical states.
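
As a rough sketch of what lexical states could look like for this grammar: the state name BODY is hypothetical, the LETTER and DIGIT definitions are assumed as before, and the idea is simply that the token manager switches state after the first word so that WORD is no longer a candidate for the second one.

```
// DEFAULT is JavaCC's initial lexical state; BODY is a made-up name.
<DEFAULT> TOKEN :
{
  < SPECIAL_WORD : "START" > : BODY    // tie with WORD resolved by order
| < WORD : (<LETTER>){2,5} > : BODY
}

<BODY> TOKEN :
{
  // WORD is not active in this state, so "AB" or "HELLO" can be ANOTHER_WORD.
  < ANOTHER_WORD : (<LETTER> | <DIGIT>){1,5} > : DEFAULT
}

<*> TOKEN :
{
  < #LETTER : ["a"-"z", "A"-"Z"] >   // assumed definition
| < #DIGIT : ["0"-"9"] >             // assumed definition
}
```

The parser production from the question can stay as it is. Because the state changes are attached to the token definitions, the scanner performs them itself as each token is matched, rather than relying on the parser's position in the grammar.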