My day job includes working on a compiler for a Pascal-like language. So far I've been working on optimizations and code generation.
I would also like to start learning how to build a simple parser for the same language. However, I'm not really sure how to go about this. Flex and Bison seem to be the usual choice. But isn't it possible to write a parser using C++ or C#? I'm not very comfortable with C.
Yacc++ supports C#, but it's a commercially licensed product. I'm looking for all the help that I can find in this regard. Suggestions would be highly appreciated.
I've written an XSLT parser with Flex and Bison. More recently I've been doing a project using ANTLR, though:
is JFig language syntax efficient and clear (and better than Spring-Framework’s XML DSL)?
I've liked working in ANTLR much more than in Flex and Bison. ANTLR puts you at a higher level of abstraction in some respects. The lexical definitions and parser grammar can all go in one file. (ANTLR will generate the token file.)
One of the key features is the ability to define tree grammars. Basically, you parse the input language with a grammar whose actions rewrite it into a highly optimized AST (which remains in memory as a linked data structure). You can then pass this tree to another parser defined in a separate tree parser file. The tree parser is where you do the real work of the actions you want.
This is a nice approach, as you can keep the AST and reprocess it as often as needed, peeling off specific subtree nodes to process based on later actions, etc. Think of something like a language interpreter. Instead of going into a for loop and processing the source from the ground up over and over again, it can just walk the loop's AST representation.
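To make the "keep the AST and reprocess it" idea concrete, here is a minimal sketch in plain Python. It is not ANTLR output and does not use ANTLR's tree grammar syntax; the node classes (Num, Var, BinOp) and the evaluate function are hypothetical names invented for illustration. The tree for x * 2 + 1 is built once by hand, standing in for what a parser would produce, and is then walked once per loop iteration without touching any source text again.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class BinOp:
    op: str
    left: "Node"
    right: "Node"


Node = Union[Num, Var, BinOp]


def evaluate(node, env):
    """Walk the tree recursively; env maps variable names to values."""
    if isinstance(node, Num):
        return node.value
    if isinstance(node, Var):
        return env[node.name]
    if isinstance(node, BinOp):
        lhs = evaluate(node.left, env)
        rhs = evaluate(node.right, env)
        if node.op == "+":
            return lhs + rhs
        if node.op == "-":
            return lhs - rhs
        if node.op == "*":
            return lhs * rhs
        raise ValueError(f"unknown operator {node.op!r}")
    raise TypeError(f"unknown node type {type(node).__name__}")


# Hand-built AST for the loop body "x * 2 + 1" (an assumed example expression).
LOOP_BODY = BinOp("+", BinOp("*", Var("x"), Num(2)), Num(1))


def run_loop(n):
    """Evaluate the same stored AST once per iteration, with no re-parsing."""
    return [evaluate(LOOP_BODY, {"x": i}) for i in range(n)]
```

Calling run_loop(4) evaluates the stored tree four times with x bound to 0, 1, 2 and 3. A tree parser in ANTLR plays roughly the role that evaluate plays here, except that its structure is generated from a grammar rather than written by hand.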
In my case I've devised a bean factory for doing IoC dependency injection. My bean factory keeps the AST of a bean descriptor around at runtime. When it needs to make (or retrieve) a new bean instance, I just pass the bean descriptor AST subtree to my tree parser, and the outcome is the desired bean instance. (A lot of factors go into determining how to instantiate the requested bean, including making any other beans that are referenced and/or applying other special behaviors via meta attributes.)
Finally, my current bean factory is targeting Java, but I want to target ActionScript3 and C# .NET. ANTLR has support for doing that.
As mentioned, Terence Parr has written a book on how to use ANTLR. It is aimed at working programmers who need to do something practical with ANTLR (as opposed to an academic treatment of the subject).
In his classic programming text, Algorithms + Data Structures = Programs, Niklaus Wirth develops an entire recursive descent parser (in Pascal) for a simple Pascal-like language, PL/0.
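For a feel of what a hand-written recursive descent parser looks like, here is a small sketch in Python. It is not Wirth's code and the grammar is not his language; it assumes a tiny arithmetic grammar chosen for illustration, and the names tokenize and Parser are hypothetical. Each grammar rule becomes one method, and the same shape carries over directly to C++ or C#.

```python
import re

# Assumed toy grammar (for illustration only):
#   expr   := term   (("+" | "-") term)*
#   term   := factor (("*" | "/") factor)*
#   factor := NUMBER | "(" expr ")"


def tokenize(text):
    """Split text into (kind, value) pairs, ending with an EOF marker."""
    tokens = []
    for lexeme in re.findall(r"[0-9]+|\S", text):
        if lexeme.isdecimal():
            tokens.append(("NUM", int(lexeme)))
        elif lexeme in "+-*/()":
            tokens.append((lexeme, lexeme))
        else:
            raise SyntaxError(f"unexpected character {lexeme!r}")
    tokens.append(("EOF", None))
    return tokens


class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        """Return the kind of the current token without consuming it."""
        return self.tokens[self.pos][0]

    def advance(self):
        """Consume and return the current token."""
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def expect(self, kind):
        if self.peek() != kind:
            raise SyntaxError(f"expected {kind}, found {self.peek()}")
        return self.advance()

    def parse(self):
        tree = self.expr()
        self.expect("EOF")  # reject trailing input
        return tree

    def expr(self):
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()[0]
            node = (op, node, self.term())  # left-associative
        return node

    def term(self):
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()[0]
            node = (op, node, self.factor())
        return node

    def factor(self):
        if self.peek() == "NUM":
            return self.advance()[1]
        if self.peek() == "(":
            self.advance()
            node = self.expr()
            self.expect(")")
            return node
        raise SyntaxError(f"unexpected token {self.peek()}")
```

Parser("1 + 2 * 3").parse() builds a nested tuple in which the multiplication sits below the addition, because term is called from inside expr. Operator precedence falls out of which rule calls which, with no precedence table needed.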
If you want C#, as per this question, try Gardens Point GPPG and GPLEX.