I need to parse a rather simple stack based language, e.g.
1 2 add
3 1 sub
and I'm facing two choices here:
- Write my own lexer for the tokens and then proceed parsing it
- Use boost spirit
I have never used boost spirit but from what I've read around (documentation and examples) I can't still make up my mind on whether it would be overkill to use boost spirit to lex and parse this simple language or if it would make sense to use it instead of rolling out my own lexer and parser (thing that I suppose shouldn't be too hard).
Would using boost spirit for a simple stack based language like the one above pay off (since I need to learn it first before I can use it)?
We've seen an pure standard library approach.
This
execute
d the instructions immediately.Let's create a parser that build an AST (Abstract Syntax Tree). In the case of our simple stack machine it's just a list of instructions. Let's call it a
Tape
.Using Boost Spirit
Here's what using a Spirit grammar should look like:
Now, the AST types are:
Simple, right.
The grammar
In X3, making the grammar is pretty lightweight. Top-down:
Now, all we have to do is teach it to recognize opcodes. A start would be:
The seasoned Spirit guru will spot a problem here:
opcode_
is not a lexeme, and neither does it guarantee "distinct identifier" parsing. E.g."a dd"
would matchAdd
. And"additional"
would also match.Luckily, X3 makes it really easy to compose directives on the fly:
So, now both holes are fixed.
Full Demo
Live On Coliru
Prints:
Summary
The main gain here is:
we get backtracking for free - so no need to separate tokens ahead of time
We don't have any messy
switch
anymore. In fact it's hidden inside the Variant Visitor (seeapply_visitor
).The instruction execution is virtually unmodified, but we renamed
execute
tooperator()
so it models the Visitor concept.In the category "exhaustive explorations", let me add some "on the fly interpreting" stack machines using Spirit Qi (v2.x) and X3
In Spirit Qi
Here the semantic actions have to be "composed" using Phoenix actors:
Live On Coliru
Printing
Notes:
sub
anddiv
is not easy. It's gonna require some advanced Phoenix-fu (http://www.boost.org/doc/libs/1_59_0/libs/phoenix/doc/html/phoenix/modules/scope/let.html)In Spirit X3
The idea is the same, but we can use proper functional composition using lambdas.
We even use a helper to dynamically generate the parser expression along with suitable
binop
:Live On Coliru
Of course it prints the same output.
First approach can be devillishly simple in plain c++:
This exploits the fact that reading white-space separated strings is good enough as a "lexer" (tokenizer).
Now, implement
process
in the simplest way possible:Adding some debug tracing, we get the following output for the program "1 2 add 3 1 sub mul":
Live On Coliru
Using Boost Spirit
I've added it as a separate answer