I've been searching A LOT about this and I couldn't find anything useful that REALLY helps me build an AST. I already know that ANTLR4 doesn't build AST like ANTLR3 used to do. Everyone say: "Hey, use visitors!", but I couldn't find any example or more detailed explanation on HOW can I do this...
I have a grammar must like C, but with every commands written in Portuguese (portuga programming language). I can easily generate the parse tree using ANTLR4. My question is: What I need to do now to create an AST?
BTW, I'm using Java and IntelliJ...
EDIT1: The closest I could get was using the answer of this topic: Is there a simple example of using antlr4 to create an AST from java source code and extract methods, variables and comments? But it only prints the name of the visited methods..
Since the first attempt didn't work for me as I expected, I tried to use this tutorial from ANTLR3, but I couldn't figure out how to use StringTamplate instead of ST...
Reading the book The Definitive ANTLR 4 Reference I also couldn't find anything related to ASTs.
EDIT2: Now I have one class to create the DOT file, I just need figure out on how to use visitors properly
I have created a small Java project that allows you to test your ANTLR grammar instantly by compiling the lexer and parser generated by ANTLR in-memory. You can just parse a string by passing it to the parser, and it will automatically generate an AST from it which can then be used in your application.
For the purpose of reducing the size of the AST, you could use a NodeFilter to which you could add the production-rule names of the non-terminals that you would like to be considered when constructing the AST.
The code and some code examples can be found at https://github.com/julianthome/inmemantlr
Hope the tool is useful ;-)
Ok, let's build a simple math example. Building an AST is totally overkill for such a task but it's a nice way to show the principle.
I'll do it in C# but the Java version would be very similar.
The grammar
First, let's write a very basic math grammar to work with:
Pretty basic stuff, we have a single
expr
rule that handles everything (precedence rules etc).The AST nodes
Then, let's define some AST nodes we'll use. These are totally custom and you can define them in the way you want to.
Here are the nodes we'll be using for this example:
Converting a CST to an AST
ANTLR generated the CST nodes for us (the
MathParser.*Context
classes). We now have to convert these to AST nodes.This is easily done with a visitor, and ANTLR provides us with a
MathBaseVisitor<T>
class, so let's work with that.As you can see, it's just a matter of creating an AST node out of a CST node by using a visitor. The code should be pretty self-explanatory (well, maybe except for the
VisitFuncExpr
stuff, but it's just a quick way to wire up a delegate to a suitable method of theSystem.Math
class).And here you have the AST building stuff. That's all that's needed. Just extract the relevant information from the CST and keep it in the AST.
The AST visitor
Now, let's play a bit with the AST. We'll have to build an AST visitor base class to traverse it. Let's just do something similar to the
AbstractParseTreeVisitor<T>
provided by ANTLR.Here, I took advantage of C#'s
dynamic
keyword to perform a double-dispatch in one line of code. In Java, you'll have to do the wiring yourself with a sequence ofif
statements like these:But I just went for the shortcut for this example.
Work with the AST
So, what can we do with a math expression tree? Evaluate it, of course! Let's implement an expression evaluator:
Pretty simple once we have an AST, isn't it?
Putting it all together
Last but not least, we have to actually write the main program:
And now we can finally play with it: