-->

JavaCC - parse a step of an XPath expression

Posted 2019-09-05 21:38

Question:

I'm trying to write a JavaCC grammar for a (simple) XPath parser, and I'm having problems with the part that parses individual steps.

My idea of the grammar is this:

Step ::= ( AxisName "::" )? NodeTest ( "[" Predicate "]" )*

I have transformed it into the following script snippet:

Step Step() :
{
    Token t;

    Step step;

    Axis axis;
    NodeTest nodeTest;
    Expression predicate;
}
{
    { axis = Axis.child; }

    (
        t = <IDENTIFIER>
        { axis = Axis.valueOf(t.image); }

        <COLON>
        <COLON>
    )?

    t = <IDENTIFIER>
    { nodeTest = new NodeNameTest(t.image); }

    { step = new Step(axis, nodeTest); }

    (
        <OPEN_PAR>

        predicate = Expression()

        { step.addPredicate(predicate); }

        <CLOSE_PAR>
    )*

    { return step; }
}

This, however, doesn't work. Given the following expression:

p

it throws the following error:

Exception in thread "main" java.lang.IllegalArgumentException: No enum constant cz.dusanrychnovsky.generator.expression.Axis.p
    at java.lang.Enum.valueOf(Unknown Source)
    at cz.dusanrychnovsky.generator.expression.Axis.valueOf(Axis.java:3)
    at cz.dusanrychnovsky.generator.parser.XPathParser.Step(XPathParser.java:123)
    at cz.dusanrychnovsky.generator.parser.XPathParser.RelativeLocationPath(XPathParser.java:83)
    at cz.dusanrychnovsky.generator.parser.XPathParser.AbsoluteLocationPath(XPathParser.java:66)
    at cz.dusanrychnovsky.generator.parser.XPathParser.Start(XPathParser.java:23)
    at cz.dusanrychnovsky.generator.parser.XPathParser.parse(XPathParser.java:16)
    at cz.dusanrychnovsky.generator.Main.main(Main.java:24)

I believe the parser sees an identifier on the input and immediately takes the axis branch, even though no colons follow. With only one token of lookahead, it cannot know that at the point where it has to decide.
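
The suspected behaviour can be reproduced outside JavaCC. The following is a minimal sketch, not the generated parser: the class name, the helper methods and the two Axis constants are hypothetical stand-ins, and tokens are modelled as plain strings. It only illustrates that a decision based on a single token enters the axis branch for any identifier, so Axis.valueOf is called on a node name.

```java
class OneTokenLookaheadSketch {

    // Hypothetical stand-in for the Axis enum; the real constants are not shown in the question.
    enum Axis { child, parent }

    // Assumption for this sketch: an identifier is a non-empty run of letters.
    static boolean isIdentifier(String token) {
        return !token.isEmpty() && token.chars().allMatch(Character::isLetter);
    }

    // Mimics a one-token decision: seeing an identifier is enough to enter
    // the optional axis branch, whether or not "::" follows.
    static Axis parseAxis(String[] tokens) {
        Axis axis = Axis.child;
        if (tokens.length > 0 && isIdentifier(tokens[0])) {
            // Throws IllegalArgumentException when the identifier is a node name such as "p".
            axis = Axis.valueOf(tokens[0]);
        }
        return axis;
    }

    public static void main(String[] args) {
        System.out.println(parseAxis(new String[] {"parent", ":", ":", "p"}));
        try {
            parseAxis(new String[] {"p"});
        } catch (IllegalArgumentException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

The input "parent::p" happens to work because its first identifier is a valid axis name; the input "p" fails in the same way as the stack trace above.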

What is the best way to fix this? Should I somehow increase the lookahead value for the Step rule, and if that's the case, then how exactly would I do that? Or do I need to rewrite the rule somehow?

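Answer 1:

Two choices. Either use a numeric lookahead, so that JavaCC examines the next three tokens before entering the optional axis branch: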

(   LOOKAHEAD(3)
    t = <IDENTIFIER>
    { axis = Axis.valueOf(t.image); }

    <COLON>
    <COLON>
)?

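or use a syntactic lookahead, which enters the branch only when the upcoming tokens match the given expansion: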

(   LOOKAHEAD( <IDENTIFIER> <COLON> <COLON> )
    t = <IDENTIFIER>
    { axis = Axis.valueOf(t.image); }

    <COLON>
    <COLON>
)?
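
To make the effect of the lookahead concrete, here is a rough hand-written Java sketch of the same decision. It is not what JavaCC generates: the class, the tokenizer, the peek helper and the three Axis constants are assumptions made for illustration, identifiers are restricted to letters, and predicates are left out. The point is only that the axis branch is entered when the next three tokens are IDENTIFIER, COLON, COLON, and skipped otherwise.

```java
import java.util.ArrayList;
import java.util.List;

class StepLookaheadSketch {

    // Hypothetical stand-in for the asker's Axis enum.
    enum Axis { child, parent, descendant }

    enum Kind { IDENTIFIER, COLON, EOF }

    static final class Tok {
        final Kind kind;
        final String image;
        Tok(Kind kind, String image) { this.kind = kind; this.image = image; }
    }

    // Tiny tokenizer: runs of letters become IDENTIFIER, each ':' becomes COLON.
    static List<Tok> tokenize(String input) {
        List<Tok> out = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == ':') {
                out.add(new Tok(Kind.COLON, ":"));
                i++;
            } else if (Character.isLetter(c)) {
                int start = i;
                while (i < input.length() && Character.isLetter(input.charAt(i))) {
                    i++;
                }
                out.add(new Tok(Kind.IDENTIFIER, input.substring(start, i)));
            } else {
                throw new IllegalArgumentException("Unexpected character: " + c);
            }
        }
        out.add(new Tok(Kind.EOF, ""));
        return out;
    }

    // Kind of the token at the given index, or EOF past the end of the input.
    static Kind peek(List<Tok> toks, int index) {
        return index < toks.size() ? toks.get(index).kind : Kind.EOF;
    }

    // Parses ( AxisName "::" )? NodeTest and returns it in the form "axis::name".
    static String parseStep(String input) {
        List<Tok> toks = tokenize(input);
        int pos = 0;
        Axis axis = Axis.child; // default axis, as in the question

        // Three-token check, playing the role of LOOKAHEAD( <IDENTIFIER> <COLON> <COLON> ).
        if (peek(toks, pos) == Kind.IDENTIFIER
                && peek(toks, pos + 1) == Kind.COLON
                && peek(toks, pos + 2) == Kind.COLON) {
            axis = Axis.valueOf(toks.get(pos).image);
            pos += 3;
        }

        if (peek(toks, pos) != Kind.IDENTIFIER) {
            throw new IllegalArgumentException("Expected a node name");
        }
        return axis + "::" + toks.get(pos).image;
    }

    public static void main(String[] args) {
        System.out.println(parseStep("p"));         // prints child::p
        System.out.println(parseStep("parent::p")); // prints parent::p
    }
}
```

Note that an unknown axis name such as "foo::p" would still make Axis.valueOf throw in this sketch; the lookahead only decides whether the branch is entered, it does not validate the name.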


Tags: xpath javacc