I'm trying to get started with ANTLR and C#, but I'm finding it extraordinarily difficult due to the lack of documentation and tutorials. I've found a couple of half-hearted tutorials for older versions, but it seems there have been some major changes to the API since.
Can anyone give me a simple example of how to create a grammar and use it in a short program?
I've finally managed to get my grammar file compiling into a lexer and parser, and I can get those compiled and running in Visual Studio (after having to recompile the ANTLR source because the C# binaries seem to be out of date too! -- not to mention the source doesn't compile without some fixes), but I still have no idea what to do with my parser/lexer classes. Supposedly it can produce an AST given some input...and then I should be able to do something fancy with that.
There is a great article on how to use ANTLR and C# together here:
http://www.codeproject.com/KB/recipes/sota_expression_evaluator.aspx
It's a "how it was done" article by the creator of NCalc, which is a mathematical expression evaluator for C#: http://ncalc.codeplex.com
You can also download the grammar for NCalc here: http://ncalc.codeplex.com/SourceControl/changeset/view/914d819f2865#Grammar%2fNCalc.g
Example of how NCalc works:
Hope it's helpful.
Let's say you want to parse simple expressions consisting of the following tokens:
'-' subtraction (also unary);
'+' addition;
'*' multiplication;
'/' division;
'(' ... ')' grouping (sub) expressions.
An ANTLR grammar could look like this:
Now, to create a proper AST, you add output=AST; to your options { ... } section, and you mix some "tree operators" into your grammar to define which tokens should become the root of a tree. There are two ways to do this:
1. add ^ and ! after your tokens. The ^ causes the token to become a root and the ! excludes the token from the AST;
2. use a rewrite rule: ... -> ^(Root Child Child ...).
Take a rule foo that matches the tokens TokenA, TokenB, TokenC and TokenD, and let's say you want TokenB to become the root, TokenA and TokenC to become its children, and TokenD to be excluded from the tree. Using option 1, you put ^ after TokenB and ! after TokenD. Using option 2, you append -> ^(TokenB TokenA TokenC) to the rule.
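Whichever option you pick, the tree for foo should end up with TokenB on top and TokenA and TokenC underneath. As a rough illustration of that shape only, here is a small sketch in plain C#. The Node class and FooTreeDemo are hypothetical helpers written for this example; they are not part of the ANTLR runtime, and the LISP-style printing merely imitates the ^(Root Child Child ...) notation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal tree node; NOT the ANTLR runtime's tree class.
class Node
{
    public readonly string Text;
    public readonly List<Node> Children = new List<Node>();

    public Node(string text, params Node[] children)
    {
        Text = text;
        Children.AddRange(children);
    }

    // A leaf prints as its text, an inner node as "(root child child ...)".
    public string ToStringTree()
    {
        if (Children.Count == 0) return Text;
        var parts = new List<string> { Text };
        foreach (Node child in Children) parts.Add(child.ToStringTree());
        return "(" + string.Join(" ", parts.ToArray()) + ")";
    }
}

static class FooTreeDemo
{
    // TokenB is the root, TokenA and TokenC are its children, TokenD is dropped.
    public static Node BuildFooTree()
    {
        return new Node("TokenB", new Node("TokenA"), new Node("TokenC"));
    }

    static void Main()
    {
        Console.WriteLine(BuildFooTree().ToStringTree()); // prints "(TokenB TokenA TokenC)"
    }
}
```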
So, here's the grammar with the tree operators in it:
I also added a Space rule to ignore any white space in the source file, and added some extra tokens and namespaces for the lexer and parser. Note that the order is important: options { ... } first, then tokens { ... }, and finally the @... {} namespace declarations.
That's it.
Now generate a lexer and parser from your grammar file:
and put the .cs files in your project together with the C# runtime DLLs.
You can test it using the following class:
which produces the following output:
which corresponds to the following AST:
(diagram created using graph.gafol.net)
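To give a feel for how such a tree can be printed, here is a sketch of a pre-order walk (node first, then its children from left to right) in plain C#. It runs without ANTLR: AstNode is a hypothetical stand-in for a parser-produced tree, the input "(12.5 + 56 / -7) * 0.5" is an example chosen here, and "NEGATE" is a made-up label for the unary minus node rather than a token name taken from the grammar.

```csharp
using System;
using System.Text;

// Hypothetical stand-in for an AST node; a real tree would come from the generated parser.
class AstNode
{
    public readonly string Text;
    public readonly AstNode[] Children;

    public AstNode(string text, params AstNode[] children)
    {
        Text = text;
        Children = children;
    }
}

static class PreorderDemo
{
    // Writes the node first, then each child left to right, indented two spaces per level.
    public static void Preorder(AstNode node, int depth, StringBuilder output)
    {
        output.Append(' ', depth * 2).Append(node.Text).Append('\n');
        foreach (AstNode child in node.Children) Preorder(child, depth + 1, output);
    }

    // Hand-built tree for the example input "(12.5 + 56 / -7) * 0.5".
    public static AstNode BuildExample()
    {
        return new AstNode("*",
            new AstNode("+",
                new AstNode("12.5"),
                new AstNode("/",
                    new AstNode("56"),
                    new AstNode("NEGATE", new AstNode("7")))),
            new AstNode("0.5"));
    }

    static void Main()
    {
        var output = new StringBuilder();
        Preorder(BuildExample(), 0, output);
        Console.Write(output.ToString());
    }
}
```

Note that the parentheses from the input do not appear as nodes: the grouping is expressed by the nesting of the tree itself.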
Note that ANTLR 3.3 has just been released and the CSharp target is "in beta". That's why I used ANTLR 3.2 in my example.
In case of rather simple languages (like my example above), you could also evaluate the result on the fly without creating an AST. You can do that by embedding plain C# code inside your grammar file, and letting your parser rules return a specific value.
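To make the idea of "rules that return a value" concrete, here is a hand-written sketch in plain C#. It is not ANTLR-generated code, and the rule names in the comments (additive, multiplicative, unary, atom) are assumptions made for this illustration. Each method plays the role of one parser rule and returns the value of what it matched, which is roughly what embedded actions in a grammar achieve.

```csharp
using System;
using System.Globalization;

// Hand-written recursive-descent evaluator; NOT generated by ANTLR.
class ExpressionEvaluator
{
    private readonly string _src;
    private int _pos;

    private ExpressionEvaluator(string src) { _src = src; }

    public static double Evaluate(string src)
    {
        var evaluator = new ExpressionEvaluator(src);
        double value = evaluator.Additive();
        evaluator.SkipSpaces();
        if (evaluator._pos != src.Length)
            throw new FormatException("Unexpected input at position " + evaluator._pos);
        return value;
    }

    private void SkipSpaces()
    {
        while (_pos < _src.Length && char.IsWhiteSpace(_src[_pos])) _pos++;
    }

    // Consumes c if it is the next non-space character and reports whether it did.
    private bool Accept(char c)
    {
        SkipSpaces();
        if (_pos < _src.Length && _src[_pos] == c) { _pos++; return true; }
        return false;
    }

    // additive : multiplicative (('+' | '-') multiplicative)*
    private double Additive()
    {
        double value = Multiplicative();
        while (true)
        {
            if (Accept('+')) value += Multiplicative();
            else if (Accept('-')) value -= Multiplicative();
            else return value;
        }
    }

    // multiplicative : unary (('*' | '/') unary)*
    private double Multiplicative()
    {
        double value = Unary();
        while (true)
        {
            if (Accept('*')) value *= Unary();
            else if (Accept('/')) value /= Unary();
            else return value;
        }
    }

    // unary : '-' atom | atom
    private double Unary()
    {
        return Accept('-') ? -Atom() : Atom();
    }

    // atom : Number | '(' additive ')'
    private double Atom()
    {
        if (Accept('('))
        {
            double value = Additive();
            if (!Accept(')')) throw new FormatException("Expected ')' at position " + _pos);
            return value;
        }
        int start = _pos; // Accept has already skipped leading spaces
        while (_pos < _src.Length && (char.IsDigit(_src[_pos]) || _src[_pos] == '.')) _pos++;
        if (start == _pos) throw new FormatException("Expected a number at position " + _pos);
        return double.Parse(_src.Substring(start, _pos - start), CultureInfo.InvariantCulture);
    }

    static void Main()
    {
        double result = Evaluate("(12.5 + 56 / -7) * 0.5");
        Console.WriteLine(result.ToString(CultureInfo.InvariantCulture)); // prints "2.25"
    }
}
```

The trade-off is the same as with embedded actions in a grammar: no tree is built, so this works for direct evaluation but leaves nothing to inspect or transform afterwards.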
Here's an example:
which can be tested with the class:
and produces the following output:
My personal experience is that before learning ANTLR on C#/.NET, you should set aside enough time to learn ANTLR on Java. That gives you knowledge of all the building blocks, which you can later apply on C#/.NET.
I wrote a few blog posts recently.
The assumption is that you are familiar with ANTLR on Java and are ready to migrate your grammar file to C#/.NET.
Have you looked at Irony.net? It's aimed at .NET and therefore works really well: it has proper tooling and proper examples, and it just works. The only problem is that it is still a bit 'alpha-ish', so documentation and versions seem to change a bit, but if you just stick with one version, you can do nifty things.
P.S. Sorry for the bad answer where you ask about X and someone suggests something different using Y ;^)