Parsing SQL Queries in C++ using Boost.Spirit

2020-05-27 18:02发布

问题:

I have created a database engine in which I can create and modify tables, and add them to a database. For parsing the SQL queries, I have implemented the Boost.Spirit library using EBNF form. I have the parser setup properly and it successfully parses every rule.

My problem is I now have no idea how to integrate the two. The Boost.Spirit parser only validates that input is correct, however I need it to actually do something. I looked up semantic actions but they don't seem to handle what I'm looking for.

For example, if I have a query such as:
new_table <- SELECT (id < 5) old_table;

I want it to validate the input using the rules, then call the function
Table Database::Select(Table t , Condition c){ ... }
and pass the tokens as arguments.

How do I go about integrating the parser?

回答1:

Note: I opted to invent a sample grammar here for demonstration purposes, since your question doesn't show yours. Using the approach recommended here, it should not be hard to code a function to execute your queries after parsing.

I would really suggest building a parse tree.

I would recommend attribute propagation in preference to semantic actions. See e.g.

  • Boost Spirit: "Semantic actions are evil"?

Attribute propagation rules are very flexible in Spirit. The default exposed attributes types are well documented right with each Parser's documentation

  • http://www.boost.org/doc/libs/1_48_0/libs/spirit/doc/html/spirit/qi/reference.html

E.g. -qi::char_ would result in boost::optional<char> and qi::double_ | qi::int_ would result in boost::variant<double, int>.

You will probably want to accumulate the parsed elements in a AST datatype of your own invention, e.g.:

struct SelectStatement
{
    std::vector<std::string> columns, fromtables; 
    std::string whereclause; // TODO model as a vector<WhereCondition> :) 

    friend std::ostream& operator<<(std::ostream& os, SelectStatement const& ss)
    {
        return os << "SELECT [" << ss.columns.size() << " columns] from [" << ss.fromtables.size() << " tables]\nWHERE " + ss.whereclause;
    }
};

You could adapt this to Spirits attribute propagation machinery by adapting the struct as a Fusion sequence:

BOOST_FUSION_ADAPT_STRUCT(SelectStatement, 
        (std::vector<std::string>, columns)
        (std::vector<std::string>, fromtables)
        (std::string, whereclause)
       )

Now you could parse the following rule into that type:

sqlident = lexeme [ alpha >> *alnum ]; // table or column name

columns  = no_case [ "select" ] >> (sqlident % ',');
tables   = no_case [ "from" ]   >> (sqlident % ',');

start    = columns >> tables 
    >> no_case [ "where" ]
    >> lexeme [ +(char_ - ';') ]
    >> ';';

You can see this code running live here: http://liveworkspace.org/code/0b525234dbce22cbd8becd69f84065c1

Full demo code:

// #define BOOST_SPIRIT_DEBUG
#include <boost/fusion/adapted.hpp>
#include <boost/spirit/include/qi.hpp>

namespace qi    = boost::spirit::qi;

struct SelectStatement
{
    std::vector<std::string> columns, fromtables; 
    std::string whereclause; // TODO model as a vector<WhereCondition> :) 

    friend std::ostream& operator<<(std::ostream& os, SelectStatement const& ss)
    {
        return os << "SELECT [" << ss.columns.size() << " columns] from [" << ss.fromtables.size() << " tables]\nWHERE " + ss.whereclause;
    }
};

BOOST_FUSION_ADAPT_STRUCT(SelectStatement, 
        (std::vector<std::string>, columns)
        (std::vector<std::string>, fromtables)
        (std::string, whereclause)
       )

template <typename It, typename Skipper = qi::space_type>
    struct parser : qi::grammar<It, SelectStatement(), Skipper>
{
    parser() : parser::base_type(start)
    {
        using namespace qi;

        sqlident = lexeme [ alpha >> *alnum ]; // table or column name

        columns  = no_case [ "select" ] >> (sqlident % ',');
        tables   = no_case [ "from" ]   >> (sqlident % ',');

        start    = columns >> tables 
            >> no_case [ "where" ]
            >> lexeme [ +(char_ - ';') ]
            >> ';';

        BOOST_SPIRIT_DEBUG_NODE(start);
        BOOST_SPIRIT_DEBUG_NODE(sqlident);
        BOOST_SPIRIT_DEBUG_NODE(columns);
        BOOST_SPIRIT_DEBUG_NODE(tables);
    }

  private:
    qi::rule<It, std::string()             , Skipper> sqlident;
    qi::rule<It, std::vector<std::string>(), Skipper> columns  , tables;
    qi::rule<It, SelectStatement()         , Skipper> start;
};

template <typename C, typename Skipper>
    bool doParse(const C& input, const Skipper& skipper)
{
    auto f(std::begin(input)), l(std::end(input));

    parser<decltype(f), Skipper> p;
    SelectStatement query;

    try
    {
        bool ok = qi::phrase_parse(f,l,p,skipper,query);
        if (ok)   
        {
            std::cout << "parse success\n";
            std::cout << "query: " << query << "\n";
        }
        else      std::cerr << "parse failed: '" << std::string(f,l) << "'\n";

        if (f!=l) std::cerr << "trailing unparsed: '" << std::string(f,l) << "'\n";
        return ok;
    } catch(const qi::expectation_failure<decltype(f)>& e)
    {
        std::string frag(e.first, e.last);
        std::cerr << e.what() << "'" << frag << "'\n";
    }

    return false;
}

int main()
{
    const std::string input = "select id, name, price from books, authors where books.author_id = authors.id;";
    bool ok = doParse(input, qi::space);

    return ok? 0 : 255;
}

Will print output:

parse success
query: SELECT [3 columns] from [2 tables]
WHERE books.author_id = authors.id