可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

I am parsing key value pairs (similar to HTTP headers) using boost::spirit::x3. When comparing the performance to my handwritten parser, boost::spirit::x3 is around 10% slower than that.

I am using boost 1.61 and GCC 6.1:

$ g++ -std=c++14 -O3 -I/tmp/boost_1_61_0/boost/ main.cpp  && ./a.out

phrase_parse 1.97432 microseconds
parseHeader 1.75742 microseconds

How can I improve the performance of the boost::spirit::x3 based parser?

#include <iostream>
#include <string>
#include <map>
#include <chrono>

#include <boost/spirit/home/x3.hpp>
#include <boost/fusion/adapted/std_pair.hpp>

using header_map = std::map<std::string, std::string>; 

namespace parser
{
    namespace x3 = boost::spirit::x3;
    using x3::char_;
    using x3::lexeme;

    x3::rule<class map, header_map> const map = "msg";

    const auto key     = +char_("0-9a-zA-Z-");
    const auto value   = +~char_("\r\n");

    const auto header =(key >> ':' >> value >> lexeme["\r\n"]);
    const auto map_def = *header >> lexeme["\r\n"];

    BOOST_SPIRIT_DEFINE(map);
}


template <typename It>
void parseHeader(It& iter, It end, header_map& map)
{
    std::string key;
    std::string value;

    It last = iter;
    bool inKey = true;
    while(iter+1 != end)
    {
        if(inKey && *(iter+1)==':')
        {
            key.assign(last, iter+1);
            iter+=3;
            last = iter;
            inKey = false;
        }
        else if (!inKey && *(iter+1)=='\r' && *(iter+2)=='\n')
        {
            value.assign(last, iter+1);
            map.insert({std::move(key), std::move(value)});
            iter+=3;
            last = iter;
            inKey = true;
        }
        else if (inKey && *(iter)=='\r' && *(iter+1)=='\n') 
        {
            iter+=2;
            break;
        }
        else
        {
            ++iter;
        }
    }
}

template<typename F, typename ...Args>
double benchmark(F func, Args&&... args)
{
    auto start = std::chrono::system_clock::now();

    constexpr auto num = 10 * 1000 * 1000;
    for (std::size_t i = 0; i < num; ++i)
    {
        func(std::forward<Args>(args)...);
    }

    auto end = std::chrono::system_clock::now();
    auto duration = std::chrono::duration_cast<std::chrono::microseconds>(end - start);

    return duration.count() / (double)num;
}

int main()
{
    const std::size_t headerCount = 20;

    std::string str;
    for(std::size_t i = 0; i < headerCount; ++i)
    {
        std::string num = std::to_string(i);
        str.append("key" + num + ": " + "value" + num + "\r\n");
    }
    str.append("\r\n");

    double t1 = benchmark([&str]() {
        auto iter = str.cbegin();
        auto end = str.cend();

        header_map header;
        phrase_parse(iter, end, parser::map, boost::spirit::x3::ascii::blank, header);
        return header;
    });
    std::cout << "phrase_parse " << t1 << " microseconds"<< std::endl;

    double t2 = benchmark([&str]() {
        auto iter = str.cbegin();
        auto end = str.cend();

        header_map header;
        parseHeader(iter, end, header);
        return header;
    });
    std::cout << "parseHeader " << t2 << " microseconds"<< std::endl;
    return 0;
}

live example

回答1:

Here's a fixed x3 grammar that comes a lot closer to your hand rolled "parser":

const auto key     = +~char_(':');
const auto value   = *(char_ - "\r\n");

const auto header = key >> ':' >> value >> "\r\n";
const auto map    = *header >> "\r\n";

Of course, it's still more strict and more robust. Also, don't call it with a space skipper, since your hand-rolled parser doesn't do that either.

Here's the performance measurements on my box:

Statistics that's 2.5µs vs. 3.5µs on average.

Full Code

Using http://nonius.io for robust benchmarking:

#include <iostream>
#include <string>
#include <map>
#include <nonius/benchmark.h++>

#include <boost/spirit/home/x3.hpp>
#include <boost/fusion/adapted/std_pair.hpp>

using header_map = std::map<std::string, std::string>; 

namespace parser
{
    namespace x3 = boost::spirit::x3;
    using x3::char_;

    const auto key     = +~char_(':');
    const auto value   = *(char_ - "\r\n");

    const auto header = key >> ':' >> value >> "\r\n";
    const auto map    = *header >> "\r\n";
}


template <typename It>
void parseHeader(It& iter, It end, header_map& map)
{
    std::string key;
    std::string value;

    It last = iter;
    bool inKey = true;
    while(iter+1 != end)
    {
        if(inKey && *(iter+1)==':')
        {
            key.assign(last, iter+1);
            iter+=3;
            last = iter;
            inKey = false;
        }
        else if (!inKey && *(iter+1)=='\r' && *(iter+2)=='\n')
        {
            value.assign(last, iter+1);
            map.insert({std::move(key), std::move(value)});
            iter+=3;
            last = iter;
            inKey = true;
        }
        else if (inKey && *(iter)=='\r' && *(iter+1)=='\n') 
        {
            iter+=2;
            break;
        }
        else
        {
            ++iter;
        }
    }
}

static auto const str = [] {
    std::string tmp;
    const std::size_t headerCount = 20;
    for(std::size_t i = 0; i < headerCount; ++i)
    {
        std::string num = std::to_string(i);
        tmp.append("key" + num + ": " + "value" + num + "\r\n");
    }
    tmp.append("\r\n");
    return tmp;
}();

NONIUS_BENCHMARK("manual", [](nonius::chronometer cm) {

    cm.measure([]() {
        auto iter = str.cbegin();
        auto end = str.cend();

        header_map header;
        parseHeader(iter, end, header);
        assert(header.size() == 20);
        return header.size();
    });
})

NONIUS_BENCHMARK("x3", [](nonius::chronometer cm) {

    cm.measure([] {
        auto iter = str.cbegin();
        auto end = str.cend();

        header_map header;
        parse(iter, end, parser::map, header);
        assert(header.size() == 20);
        return header.size();
    });
})

#include <nonius/main.h++>

I'm using gcc 5.4 and Boost 1.61

回答2:

After having a first look over the custom parser it occurs to me that it is not as robust as the spirit parser.

If you change line 91 to remove the \r from the trailing "\r\n" you'll see what I mean.

Bad data can cause the hand-rolled one to segfault.

e.g.:

str.append("key" + num + ": " + "value" + num + "\n");

results in segfault on line 46.

So step one, I think, is to modify the hand-rolled parser so that it's checking boundary conditions in the same way that the spirit one is.

Although I don't expect the timings to converge completely, they will be closer.

how to improve performance of boost::spirit::x3 ke

问题:

回答1:

Full Code

回答2:

收藏的人(0)

how to improve performance of boost::spirit::x3 ke

问题:

回答1:

Full Code

回答2:

收藏的人(0)

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮