I want to parse a boolean expression (in C++). Input form:
a and b xor (c and d or a and b);
I just want to parse this expression into a tree, knowing the precedence rule (not,and,xor,or). So the above expression should look something like:
(a and b) xor ((c and d) or (a and b));
to the parser.
And the tree would be of form:
a
and
b
or
c
and
d
xor
a
and
b
The input will be either through command line or in the form of a string. I just need the parser.
Are there any sources that can help me do this?
See my SO answer on how to code simple recursive descent parsers.
This approach is very convenient for simple languages such as Boolean expressions. And the concepts are pretty much independent of your programming language.
Here is an implementation based on Boost Spirit.
Because Boost Spirit generates recursive descent parsers based on expression templates, honouring the 'idiosyncratic' (sic) precedence rules (as mentioned by others) is quite tedious. Therefore the grammar lacks a certain elegance.
Abstract Data Type
I defined a tree data structure using Boost Variant's recursive variant support, note the definition of expr:
(full source below)
Grammar Rules
The following is the (slightly tedious) grammar definition, as mentioned.
Although I don't consider this grammar optimal, it is quite readable, and we have ourselves a statically compiled parser with strongly typed AST datatype in roughly 50 lines of code. Things could be considerably worse.
Operating on the syntax tree
Obviously, you'd want to evaluate the expressions. For now, I decided to stop at just printing, so I don't have to do the lookup table for named variables :)
Traversing a recursive variant may look cryptic at first, but the
boost::static_visitor<>
is surprisingly simple once you get the hang of it:Test output:
For the test cases in the code, the following is output, demonstrating correct handling of the precedence rules by adding (redundant) parentheses:
Full Code:
Bonus:
For bonus points, to get a tree exactly like shown in the OP:
result:
Either use a parser generator as Oli Charlesworth already mentioned (yacc, bison, antlr; the latter is in my experience better suited for C++ than the other two although it is a while that I looked at any of them) or create a simple recursive descent parser: for a language as simple as yours this may be the easier approach.
If, like me, you find the overhead and idiosyncrasies of the parsing libraries too much for such a little job, you can very easily write your own parser for a simple scenario like the one you present. See here for a parser I wrote in C# to parse simple C# expressions along similar lines to your requirements.
Have a look at the the Mini C example code https://github.com/boostorg/spirit/tree/master/example/qi/compiler_tutorial/mini_c.
Especially have a look at the expression.cpp, expression.hpp, expression_def.hpp, and ast.hpp. It gives a great example on how to parse expressions into an AST.