I was attempting to replicate this example in order to implement C++ like operator precedence rules (I started with a subset, but I eventually plan to add the others).
Try as I might, I could not get the grammar to parse a single binary operation. It would parse literals (44, 3.42, "stackoverflow") just fine, but would fail anything like 3 + 4
.
I did look at this question, and this one in an attempt to get my solution to work, but got the same result.
(In an attempt to keep things short, I'll post only the relevant bits here, the full code is here)
Relevant data structures for the AST:
enum class BinaryOperator
{
ADD, SUBTRACT, MULTIPLY, DIVIDE, MODULO, LEFT_SHIFT, RIGHT_SHIFT, EQUAL, NOT_EQUAL, LOWER, LOWER_EQUAL, GREATER, GREATER_EQUAL,
};
typedef boost::variant<double, int, std::string> Litteral;
struct Identifier { std::string name; };
typedef boost::variant<
Litteral,
Identifier,
boost::recursive_wrapper<UnaryOperation>,
boost::recursive_wrapper<BinaryOperation>,
boost::recursive_wrapper<FunctionCall>
> Expression;
struct BinaryOperation
{
Expression rhs, lhs;
BinaryOperator op;
BinaryOperation() {}
BinaryOperation(Expression rhs, BinaryOperator op, Expression lhs) : rhs(rhs), op(op), lhs(lhs) {}
};
The grammar:
template<typename Iterator, typename Skipper>
struct BoltGrammar : qi::grammar<Iterator, Skipper, Program()>
{
BoltGrammar() : BoltGrammar::base_type(start, "start")
{
equalOp.add("==", BinaryOperator::EQUAL)("!=", BinaryOperator::NOT_EQUAL);
equal %= (lowerGreater >> equalOp >> lowerGreater);
equal.name("equal");
lowerGreaterOp.add("<", BinaryOperator::LOWER)("<=", BinaryOperator::LOWER_EQUAL)(">", BinaryOperator::GREATER)(">=", BinaryOperator::GREATER_EQUAL);
lowerGreater %= (shift >> lowerGreaterOp >> shift);
lowerGreater.name("lower or greater");
shiftOp.add("<<", BinaryOperator::LEFT_SHIFT)(">>", BinaryOperator::RIGHT_SHIFT);
shift %= (addSub >> shiftOp >> addSub);
shift.name("shift");
addSubOp.add("+", BinaryOperator::ADD)("-", BinaryOperator::SUBTRACT);
addSub %= (multDivMod >> addSubOp >> multDivMod);
addSub.name("add or sub");
multDivModOp.add("*", BinaryOperator::MULTIPLY)("/", BinaryOperator::DIVIDE)("%", BinaryOperator::MODULO);
multDivMod %= (value >> multDivModOp >> value);
multDivMod.name("mult, div, or mod");
value %= identifier | litteral | ('(' > expression > ')');
value.name("value");
start %= qi::eps >> *(value >> qi::lit(';'));
start.name("start");
expression %= identifier | litteral | equal;
expression.name("expression");
identifier %= qi::lexeme[ascii::char_("a-zA-Z") >> *ascii::char_("0-9a-zA-Z")];
identifier.name("identifier");
litteral %= qi::double_ | qi::int_ | quotedString;
litteral.name("litteral");
quotedString %= qi::lexeme['"' >> +(ascii::char_ - '"') >> '"'];
quotedString.name("quoted string");
namespace phx = boost::phoenix;
using namespace qi::labels;
qi::on_error<qi::fail>(start, std::cout << phx::val("Error! Expecting: ") << _4 << phx::val(" here: \"") << phx::construct<std::string>(_3, _2) << phx::val("\"") << std::endl);
}
qi::symbols<char, BinaryOperator> equalOp, lowerGreaterOp, shiftOp, addSubOp, multDivModOp;
qi::rule<Iterator, Skipper, BinaryOperation()> equal, lowerGreater, shift, addSub, multDivMod;
qi::rule<Iterator, Skipper, Expression()> value;
qi::rule<Iterator, Skipper, Program()> start;
qi::rule<Iterator, Skipper, Expression()> expression;
qi::rule<Iterator, Skipper, Identifier()> identifier;
qi::rule<Iterator, Skipper, Litteral()> litteral;
qi::rule<Iterator, Skipper, std::string()> quotedString;
};
The main problem (indeed) appears to be addressed in that second answer you linked to.
Let me address some points:
the main problem was was compound:
your start rule is
this means it expects
value
s:however, since this parses only identifiers and literals or parenthesized subexpressions, the
3+4
binary operation will never be parsed.your expression rule, again allows
identifier
orliteral
first (redundant/confusing):I think you'd want something more like
your
BinaryOperation
productions allow only for the case where the operator is present; this breaks the way the rules are nested for operator precedence: amultDivOp
would never be accepted as match, unless it happens to be followed by anaddSubOp
:This can best be fixed as shown in the linked answer:
where you can use semantic actions to build the AST nodes "dynamically":
This beats the 'tedious" declarative approach hands-down (which leads to a lot of backtracking, see e.g. Boost spirit poor perfomance with Alternative parser)
the literal rule will parse an integer as a double, because it isn't strict:
you can fix this like:
FunctionCall
should adaptfunctionName
as anIdentifier
(notstd::string
)You
Expression
operator<<
could (should) be aboost::static_visitor
so that youUsing c++11, the code could still be inside the one function:
you can use the
BOOST_SPIRIT_DEBUG_NODES
macro to name your rulesyou should include from the
spirit/include/
directory, which then relays tospirit/home/
orphoenix/include/
instead of including them directly.Here is a fully working sample, that also improved the grammar rules for readability Live On Coliru:
Prints: