Rule variables in ANTLR4

Published 2019-05-02 00:40

Question:

I'm trying to convert my grammar from v3 to v4 and having some trouble.

In v3 I have rules like this:

dataspec[DataLayout layout] returns [DataExtractor extractor]
    @init {
        DataExtractorBuilder builder = new DataExtractorBuilder(layout);
    }
    @after {
        extractor = builder.create();
    }
    : first=expr { builder.addAll(first); } (COMMA next=expr { builder.addAll(next); })* 
    ;

expr returns [List<ValueExtractor> ext]
    ...

However, with rules in v4 returning these custom context objects instead of what I explicitly told them to return, things are all messed up. What's the v4 way to do this?

Answer 1:

There are multiple cases here:

  • accessing passed-in variables (layout)
  • accessing the current rule's return value (extractor)
  • accessing local variables (first, next)

Passed-in variables and current rule's return value

When accessing passed-in variables or the return value of the current rule you simply need to prefix the name given in the rule definition with $.

  • layout becomes $layout
  • extractor becomes $extractor

Local Variables

A label that captures the result of another rule now holds that rule's context object, so you have to reference the member of it that is named in the returns clause of the rule which produced the value.

For example, first is capturing the result from the expr rule, and expr names its return value ext, meaning that:

  • first becomes $first.ext
  • next becomes $next.ext
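
To picture why the extra .ext is needed, it can help to think of each rule invocation as producing an object whose fields are the rule's arguments, return values and labels. The sketch below is hand-written plain Java, not actual ANTLR output: the class names only mimic the shape of generated contexts, and DataLayout, DataExtractor and ValueExtractor are replaced by String and List<String> stand-ins as an assumption for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ContextSketch {
    // Stand-in for the context of: expr returns [List<ValueExtractor> ext]
    static class ExprContext {
        List<String> ext;
        ExprContext(String... values) { ext = Arrays.asList(values); }
    }

    // Stand-in for the context of: dataspec[DataLayout layout] returns [DataExtractor extractor]
    static class DataspecContext {
        String layout;            // rule argument, read as $layout
        List<String> extractor;   // return value, assigned as $extractor
        ExprContext first;        // label first=expr
        ExprContext next;         // label next=expr
    }

    // Mirrors what the dataspec actions do for an input of the form "first, next".
    static DataspecContext dataspec(String layout, ExprContext first, ExprContext next) {
        DataspecContext ctx = new DataspecContext();
        ctx.layout = layout;
        List<String> builder = new ArrayList<>();
        ctx.first = first;
        builder.addAll(ctx.first.ext);   // corresponds to $first.ext
        ctx.next = next;
        builder.addAll(ctx.next.ext);    // corresponds to $next.ext
        ctx.extractor = builder;         // corresponds to $extractor = ...
        return ctx;
    }

    public static void main(String[] args) {
        DataspecContext ctx =
            dataspec("csv", new ExprContext("a"), new ExprContext("b", "c"));
        System.out.println(ctx.extractor); // prints [a, b, c]
    }
}
```

The label itself is the child context, which is why the list lives one level down at first.ext rather than in first.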

When to Use the $ form

Unlike in v3, where you could reference certain variables as regular Java fields, the $ form is necessary in all cases: in actions, in the @init and @after blocks, and when passing variables to other rules.

Other traps

If you're capturing optional tokens in a local variable, you may run into null pointer exceptions now that you're referencing an attribute of that variable.

single_lname returns [String s]
    : p=LNAME_PREFIX? r=NAME { $s = $p.text + toNameCase($r.text); }
;

You'll need to check whether $p is null, but most of the time a bare reference to $p results in a "missing attribute access" error. ANTLR4 makes a special exception so that you can perform this check, and it only applies inside an if condition (refactoring this to use the ternary operator, for example, will still result in the error).

single_lname returns [String s]
    : p=LNAME_PREFIX? r=NAME { 
        if ($p == null) {
            $s = toNameCase($r.text);
        } else {
            $s = $p.text + toNameCase($r.text);
        }
    }
;

The updated rule

Putting it all together, the dataspec rule becomes:

dataspec[DataLayout layout] returns [DataExtractor extractor]
    @init {
        DataExtractorBuilder builder = new DataExtractorBuilder($layout);
    }
    @after {
        $extractor = builder.create();
    }
    : first=expr { builder.addAll($first.ext); }
        (COMMA next=expr { builder.addAll($next.ext); })* 
    ;
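
On the calling side, the same idea applies: the rule method hands back the context object, and the value declared in the returns clause is read from it as a field. The sketch below imitates that with hand-written stand-ins so that it runs without the ANTLR runtime; FakeParser and the trivial DataLayout and DataExtractor classes are assumptions for illustration, not generated code.

```java
class CallerSketch {
    // Hypothetical stand-ins for the types named in the grammar.
    static class DataLayout {}

    static class DataExtractor {
        final int size;
        DataExtractor(int size) { this.size = size; }
    }

    // Shape of the object a rule method returns: the rule's argument and its
    // declared return value are both fields on the context.
    static class DataspecContext {
        DataLayout layout;
        DataExtractor extractor;
    }

    // Stand-in for a generated parser; a real one would consume tokens here.
    static class FakeParser {
        DataspecContext dataspec(DataLayout layout) {
            DataspecContext ctx = new DataspecContext();
            ctx.layout = layout;
            ctx.extractor = new DataExtractor(2); // what the @after action would assign
            return ctx;
        }
    }

    public static void main(String[] args) {
        DataLayout layout = new DataLayout();
        // The call yields the context; the declared return value is a field on it.
        DataspecContext ctx = new FakeParser().dataspec(layout);
        DataExtractor extractor = ctx.extractor;
        System.out.println(extractor.size); // prints 2
    }
}
```

With a real generated parser the equivalent call would be along the lines of parser.dataspec(layout).extractor.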