
Two basic ANTLR questions

Published 2020-07-13 10:00

Question:

I'm trying to use ANTLR to take a simple grammar and produce assembly output. My language of choice in ANTLR is Python.

Many tutorials seem very complicated or elaborate on things that aren't relevant to me; I only really need some very simple functionality. So I have two questions:

'Returning' values from one rule to another.

So let's say I have a rule like:

assignment: name=IDENTIFIER ASSIGNMENT expression;

I can run Python code in {}s when this rule is recognised, and I can pass args to the Python code for expression by doing something like:

assignment: name=IDENTIFIER ASSIGNMENT expression[variablesList];

and then

expression[variablesList]: blah blah

But how do I 'return' a value to my original rule? E.g. how do I calculate the value of the expression and then send it back to my assignment rule to use in Python there?

How do I write out my target language code?

So I have some Python which runs when the rules are recognised, then I calculate the assembly I want that statement to produce. But how do I say "write out this string of assembly instructions to my target file"?

Any good tutorials that are relevant to this kind of stuff (attribute grammars, compiling to something other than an AST, etc.) would be helpful too. If my questions don't make too much sense, please ask me to clarify; I'm having a hard time wrapping my head around ANTLR.

Answer 1:



Returning values from one rule to another

Let's say you want to parse simple expressions and provide a map of variables at runtime that can be used in these expressions. A simple grammar that includes the custom Python code, a returns clause on each rule, and the parameter vars passed to the entry point of your grammar could look like this:

grammar T;

options {
  language=Python;
}

@members {
  variables = {}
}

parse_with [vars] returns [value]
@init{self.variables = vars}
  :  expression EOF                            {value = $expression.value}
  ;

expression returns [value]
  :  addition                                  {value = $addition.value}
  ;

addition returns [value]
  :  e1=multiplication                         {value = $e1.value}
                       ( '+' e2=multiplication {value = value + $e2.value}
                       | '-' e2=multiplication {value = value - $e2.value}
                       )*
  ;

multiplication returns [value]
  :  e1=unary                                  {value = $e1.value}
              ( '*' e2=unary                   {value = value * $e2.value}
              | '/' e2=unary                   {value = value / $e2.value}
              )*
  ;

unary returns [value]
  :  '-' atom                                  {value = -1 * $atom.value}
  |  atom                                      {value = $atom.value}
  ;

atom returns [value]
  :  Number                                    {value = float($Number.text)}
  |  ID                                        {value = self.variables[$ID.text]}
  |  '(' expression ')'                        {value = $expression.value}
  ;

Number : '0'..'9'+ ('.' '0'..'9'+)?;
ID     : ('a'..'z' | 'A'..'Z')+;
Space  : ' ' {$channel=HIDDEN};

If you now generate a parser using ANTLR v3.1.3 (no later version!):

java -cp antlr-3.1.3.jar org.antlr.Tool T.g

and run the script:

#!/usr/bin/env python
import antlr3
from antlr3 import *
from TLexer import *
from TParser import *

input = 'a + (1.0 + 2) * 3'
lexer = TLexer(antlr3.ANTLRStringStream(input))
parser = TParser(antlr3.CommonTokenStream(lexer))
print '{0} = {1}'.format(input, parser.parse_with({'a':42}))

you will see the following output being printed:

a + (1.0 + 2) * 3 = 51.0
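To see how the return values travel from rule to rule without generating anything, here is a minimal hand-written sketch in plain Python 3 that mirrors the grammar above. It is not the code ANTLR produces; the names tokenize, Parser and evaluate are made up for illustration. Each rule becomes a method, and the value a method returns plays the role of $rule.value in the calling rule.

```python
import re

# Token pattern loosely mirroring Number, ID and the literal operators of grammar T.
TOKEN = re.compile(r"\s*(?:(\d+(?:\.\d+)?)|([A-Za-z]+)|(.))")

def tokenize(text):
    """Split the input into (kind, text) pairs; whitespace is skipped."""
    tokens = []
    for number, ident, op in TOKEN.findall(text):
        if number:
            tokens.append(("Number", number))
        elif ident:
            tokens.append(("ID", ident))
        elif op.strip():
            tokens.append((op, op))  # operators use their own text as kind
    tokens.append(("EOF", ""))
    return tokens

class Parser:
    """Hypothetical recursive-descent counterpart of the rules in grammar T."""

    def __init__(self, tokens, variables):
        self.tokens = tokens
        self.pos = 0
        self.variables = variables  # same role as the vars parameter

    def peek(self):
        return self.tokens[self.pos][0]

    def match(self, kind):
        actual, text = self.tokens[self.pos]
        if actual != kind:
            raise SyntaxError("expected {0!r}, got {1!r}".format(kind, actual))
        self.pos += 1
        return text

    def parse_with(self):
        value = self.expression()      # like {value = $expression.value}
        self.match("EOF")
        return value

    def expression(self):
        return self.addition()

    def addition(self):
        value = self.multiplication()  # like {value = $e1.value}
        while self.peek() in ("+", "-"):
            if self.match(self.peek()) == "+":
                value = value + self.multiplication()
            else:
                value = value - self.multiplication()
        return value

    def multiplication(self):
        value = self.unary()
        while self.peek() in ("*", "/"):
            if self.match(self.peek()) == "*":
                value = value * self.unary()
            else:
                value = value / self.unary()
        return value

    def unary(self):
        if self.peek() == "-":
            self.match("-")
            return -1 * self.atom()
        return self.atom()

    def atom(self):
        if self.peek() == "Number":
            return float(self.match("Number"))
        if self.peek() == "ID":
            return self.variables[self.match("ID")]
        self.match("(")
        value = self.expression()
        self.match(")")
        return value

def evaluate(text, variables):
    return Parser(tokenize(text), variables).parse_with()

print(evaluate("a + (1.0 + 2) * 3", {"a": 42}))
```

The structure is the point here: a rule that declares returns [value] behaves like a method that hands its result back to whichever rule invoked it.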

Note that a rule can declare more than one return value:

parse
  :  foo              {print 'a={0} b={1} c={2}'.format($foo.a, $foo.b, $foo.c)}
  ;

foo returns [a, b, c]
  :  A B C            {a=$A.text; b=$B.text; c=$C.text}
  ;
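As a rough analogy in plain Python 3 (again, not what ANTLR generates), a rule with several return values can be thought of as a function that returns one object with named fields, which the calling rule then reads much like $foo.a, $foo.b and $foo.c. The names FooResult, foo and parse below are hypothetical.

```python
from collections import namedtuple

# Hypothetical stand-in for the return scope of rule foo.
FooResult = namedtuple("FooResult", ["a", "b", "c"])

def foo(tokens):
    """Consume three token texts (A B C) and return them as named fields."""
    a_text, b_text, c_text = tokens
    return FooResult(a=a_text, b=b_text, c=c_text)

def parse(tokens):
    """The calling rule reads each field by name."""
    result = foo(tokens)
    return "a={0} b={1} c={2}".format(result.a, result.b, result.c)

print(parse(["x", "y", "z"]))
```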



How to write out target language code

The easiest way to go about this is to simply put print statements inside the custom code blocks and pipe the output to a file:

parse_with [vars]
@init{self.variables = vars}
  :  expression EOF                            {print 'OUT:', $expression.value}
  ;

and then run the script like this:

./run.py > out.txt

which will create a file 'out.txt' containing: OUT: 51.0. If your grammar isn't that big, you might get away with this. With a larger grammar, however, print statements scattered across the rules can become messy, in which case you could set the output of your parser to template:

options {
  output=template;
  language=Python;
}

and emit custom code through your own defined templates.
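To illustrate the general idea of emitting code through templates and writing it to a file instead of redirecting stdout, here is a small Python 3 sketch. It uses string.Template from the standard library as a stand-in, not StringTemplate itself, and the Emitter class, the template names and the stack-machine mnemonics are all invented for this example. In a real grammar, the emit calls would sit inside the rule actions.

```python
import os
import tempfile
from string import Template

# Hypothetical templates for an imaginary stack machine; the mnemonics are
# placeholders, not a real instruction set.
TEMPLATES = {
    "load_var": Template("LOAD $name"),
    "push_const": Template("PUSH $value"),
    "add": Template("ADD"),
    "store_var": Template("STORE $name"),
}

class Emitter:
    """Collects emitted instructions in memory and writes them out once."""

    def __init__(self):
        self.lines = []

    def emit(self, template_name, **attributes):
        # Fill in the named template and remember the resulting line.
        self.lines.append(TEMPLATES[template_name].substitute(**attributes))

    def write(self, path):
        with open(path, "w") as handle:
            handle.write("\n".join(self.lines) + "\n")

# Instructions a parser might emit for the assignment: x = a + 1
emitter = Emitter()
emitter.emit("load_var", name="a")
emitter.emit("push_const", value=1)
emitter.emit("add")
emitter.emit("store_var", name="x")

# Write to a temporary location; any output path would do.
out_path = os.path.join(tempfile.gettempdir(), "out.asm")
emitter.write(out_path)

with open(out_path) as handle:
    print(handle.read())
```

Collecting the lines first and writing them at the end keeps the output separate from any debugging prints, and the templates keep the instruction text out of the rule actions.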

See:

  • StringTemplate: 5 minute Introduction
  • Where to get Python ANTLR package to use StringTemplate?