Mismatched input when lexing and parsing with modes

Posted 2019-03-03 09:45

Question:

I'm having an ANTLR4 problem with mismatched input that I can't solve. I've found a lot of questions dealing with that, and they usually revolve around the lexer matching something else to the token, but I don't see that in my case.

I've got this lexer grammar:

FieldStart              :   '[' Definition ']'          ->  pushMode(INFIELD)   ;
Definition              :   'Element';
mode INFIELD;
FieldEnd                :   '[end]'                     ->  popMode             ;
ContentValue            :   ~[[]*                                               ;

Which then runs on the following parser:

field           :   FieldStart  ContentValue FieldEnd               #Field_Found;

I simplified it to zoom in to the problem, but here's the point where I can't get any further.

I'm running that on the following input:

[Element]Va-lu*e[end]

and I get this output:

Type : 001 | FieldStart | [Element]
Type : 004 | ContentValue | Va-lu*e
Type : 003 | FieldEnd | [end]
Type : -001 | EOF | <EOF>


([] [Element] Va-lu*e [end])

I generated the output with C#, doing the following (shortened):

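            // Dump every token as "Type : NNN | RuleName | Text", one per line.
            string tokens = "";
            foreach (IToken CurrToken in TokenStream.GetTokens())
            {
                if (CurrToken.Type == -1)
                {
                    // Type -1 is EOF, which has no entry in RuleNames.
                    tokens += "Type : " + CurrToken.Type.ToString("000") + " | " + "EOF" + " | " + CurrToken.Text + "\n";
                }
                else
                {
                    // Token types are 1-based. This lookup only lines up while the lexer has no
                    // fragment rules, since those appear in RuleNames but get no token type.
                    tokens += "Type : " + CurrToken.Type.ToString("000") + " | " + Lexer.RuleNames[CurrToken.Type - 1] + " | " + CurrToken.Text + "\n";
                }
            }
            tokens += "\n\n" + ParseTree.ToStringTree();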

Upon parsing this via

IParseTree ParseTree = Parser.field();

I get this error:

"mismatched input 'Va-lu*e' expecting ContentValue"

I just can't find the error. Can you help me here? I assume it has something to do with the lexer mode, but as far as I have read, the parser doesn't care (or know) about the modes.

Thanks!

Answer 1:

Modes are not available in a combined grammar. Split your grammar and it should work.

Also, always check the error messages:

error(120): ../Field.g4:14:5: lexical modes are only allowed in lexer grammars
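
A minimal sketch of such a split, assuming the hypothetical file names FieldLexer.g4 and FieldParser.g4 (the error message only shows Field.g4). The rules are copied from the question, except that ContentValue uses + instead of * so that it cannot match the empty string. The tokenVocab option tells the parser to reuse the lexer's token types.

```
// FieldLexer.g4 (hypothetical file name)
lexer grammar FieldLexer;

FieldStart   : '[' Definition ']' -> pushMode(INFIELD) ;
Definition   : 'Element' ;

mode INFIELD;
FieldEnd     : '[end]' -> popMode ;
// Changed from ~[[]* in the question: a token rule should consume at least one character.
ContentValue : ~[[]+ ;

// FieldParser.g4 (hypothetical file name)
parser grammar FieldParser;

// Import the token types generated for FieldLexer instead of defining new ones.
options { tokenVocab = FieldLexer; }

field : FieldStart ContentValue FieldEnd # Field_Found ;
```

The lexer grammar has to be processed first, so that its generated .tokens file exists when the parser grammar is compiled.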



Answer 2:

I think I have now figured out how to solve my problem. A split lexer/parser grammar that uses lexer modes seems to require one extra piece of configuration in Visual Studio (tested with 2012 and 2013) with the ANTLR4 NuGet release:

I had to include

options { tokenVocab = GRAMMAR_NAME_Lexer; }

at the beginning of my parser grammar.

Otherwise, the lexer did create the tokens and the modes as expected, but the parser did not recognize lexer tokens defined in any mode other than the default mode.
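
This is consistent with the token dump in the question. A rough illustration of the two numberings, assuming ANTLR hands out token types in order of first appearance. The lexer side is taken from the dump in the question. The parser side is an inference about a parser grammar without tokenVocab, not output from the original project:

```
Lexer grammar (types follow rule order):
    FieldStart   = 1
    Definition   = 2
    FieldEnd     = 3
    ContentValue = 4

Parser grammar without tokenVocab (tokens defined implicitly, in order of use):
    FieldStart   = 1
    ContentValue = 2
    FieldEnd     = 3
```

Under that assumption the parser waits for type 2 after FieldStart, receives type 4 from the lexer, and reports "mismatched input 'Va-lu*e' expecting ContentValue" even though both sides use the same name. FieldStart happens to be 1 on both sides, which would explain why only tokens from the INFIELD mode appeared to be affected.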

I have also found that the "popMode" lexer command sometimes causes my TokenStream to throw an invalid state exception. I could solve that by using "mode(DEFAULT_MODE)" instead of "popMode".
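
The cause of that exception is not stated here. One possibility, offered only as an assumption, is that popMode ran while the mode stack was empty, for example because a mode was entered with mode(...) rather than pushMode(...). The sketch below, with a hypothetical grammar name, contrasts the two commands:

```
lexer grammar ModeSketch;   // hypothetical name, for illustration only

// pushMode saves the current mode on the mode stack, then switches to INFIELD.
FieldStart   : '[' 'Element' ']' -> pushMode(INFIELD) ;

mode INFIELD;
// Variant A: popMode restores the saved mode. It fails if the mode stack is empty.
FieldEnd     : '[end]' -> popMode ;
// Variant B (the workaround above), used in place of Variant A:
// FieldEnd  : '[end]' -> mode(DEFAULT_MODE) ;
ContentValue : ~[[]+ ;
```

Variant B never touches the stack, so it cannot fail that way. If it is combined with pushMode, though, every FieldStart leaves one more entry on the stack, so pairing it with mode(INFIELD) instead of pushMode(INFIELD) keeps the two commands symmetric.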

I hope this helps somebody, but I'd still like it if someone who understands ANTLR could offer some additional clarification, since I only "solved" it by toying around until it worked.



Tags: c# antlr4