parsing into several vector members

2019-01-11 21:22发布

问题:

I want to recursively parse a string and store the results in one struct. I've written a parser that can handle one iteration. The input is formatted as follows:

v  1.5 2.0 2.5
v  3.0 3.5 4.0
f 1 2 3
f 4 5 6 
v  4.5 5.0 5.5
v  6.0 6.5 7.0
f 7 8 9
f 10 11 12

The problem is that it only parses the first 4 lines, it stops at the third encountered 'v'. The complete code is given below. How do I modify this code so it also parses the rest of the input into the same struct? I've tried modifying the start rule from start = vertex >> elements to start = *(vertex >> elements), but that only gives a huge compilation error. Same goes for start = +(vertex >> elements). Any ideas how I should modify the rules?

#include <iostream>
#include <sstream>
#include <fstream>

#include "boost/spirit/include/qi.hpp"
#include "boost/spirit/include/support_iso8859_1.hpp"
#include "boost/fusion/include/adapt_struct.hpp"


struct ElemParseData
{
    std::vector<float> verts;
    std::vector<unsigned int> idx;
};

BOOST_FUSION_ADAPT_STRUCT(
    ElemParseData,
    (std::vector<float>, verts)
    (std::vector<unsigned int>, idx)
)


bool doParse( ElemParseData &parseData, const std::string &data )
{
    namespace qi      = boost::spirit::qi;
    namespace iso8859 = boost::spirit::iso8859_1;

    struct objGram : qi::grammar<std::string::const_iterator, ElemParseData(), iso8859::space_type>
    {
        objGram() : objGram::base_type(start)
        {
            vertex   = *('v' >> qi::double_ >> qi::double_ >> qi::double_);
            elements = *('f' >> qi::int_ >> qi::int_ >> qi::int_);

            start = vertex >> elements;
        }

        qi::rule<std::string::const_iterator, ElemParseData(), iso8859::space_type> start;
        qi::rule<std::string::const_iterator, std::vector<float>(), iso8859::space_type> vertex;
        qi::rule<std::string::const_iterator, std::vector<unsigned int>(), iso8859::space_type> elements;

    } objGrammar;

    std::string::const_iterator f = data.cbegin();
    bool res = qi::phrase_parse( f, data.cend(), objGrammar, iso8859::space, parseData );


    // print everything that hasn't been processed by the parser
    std::cout << "#### Trail ####" << std::endl;
    std::cout << std::string(f, data.cend()) << std::endl;

    return res;
}


int main( int argc, char* argv[] )
{
    std::stringstream ss;
    std::filebuf fb;
    if ( fb.open("parsetest.txt", std::ios::in) )
    {
        std::istream is(&fb);
        while (is)
            ss << char(is.get());
        fb.close();
    }


    ElemParseData parseData;
    bool res = doParse( parseData, ss.str() );


    // print results
    std::cout << std::endl << "Parsing result: " << res << std::endl;
    std::cout << "---######### ResultData #########---" << std::endl;
    std::cout << "---- Begin vertex data ----" << std::endl;
    std::vector<float>::iterator it;
    for ( it = parseData.verts.begin(); it != parseData.verts.end(); ++it )
        std::cout << *it << std::endl;
    std::cout << "---- End vertex data ----" << std::endl;

    std::cout << std::endl;

    std::cout << "---- Begin index data ----" << std::endl;
    std::vector<unsigned int>::iterator idxIt;
    for ( idxIt = parseData.idx.begin(); idxIt != parseData.idx.end(); ++idxIt )
            std::cout << *idxIt << std::endl;
    std::cout << "---- End index data ----" << std::endl;

    std::cout << "Press enter to exit" << std::endl;
    std::cin.get();
}

P.S.: If needed, the compilation error can be found here.

EDIT: I'm trying to write an Wavefront .OBJ parser. The input given here is just a simplification of my problem.

回答1:

There are several ways :)

  1. Custom attribute traits
  2. The same using semantic actions
  3. Everything in semantic actions, at detail level

1. Custom attribute traits

The cleanest, IMO would to replace the Fusion Sequence Adaptation (BOOST_FUSION_ADAPT_STRUCT) by custom container attribute traits for Spirit:

namespace boost { namespace spirit { namespace traits {

    template<> 
        struct is_container<ElemParseData, void> : mpl::true_ { };
    template<> 
        struct container_value<ElemParseData, void> { 
             typedef boost::variant<float, unsigned int> type;
        };
    template <>
        struct push_back_container<ElemParseData, std::vector<float>, void> {
            static bool call(ElemParseData& c, std::vector<float> const& val) {
                c.verts.insert(c.verts.end(), val.begin(), val.end());
                return true;
            }
        };
    template <>
        struct push_back_container<ElemParseData, std::vector<unsigned int>, void> {
            static bool call(ElemParseData& c, std::vector<unsigned int> const& val) {
                c.idx.insert(c.idx.end(), val.begin(), val.end());
                return true;
            }
        };
}}}

Without changes to the grammar, this will simply result in the same effect. However, now you can modify the parser to expect the desired grammar:

    vertex   = 'v' >> qi::double_ >> qi::double_ >> qi::double_;
    elements = 'f' >> qi::int_ >> qi::int_ >> qi::int_;

    start = *(vertex | elements);

And because of the traits, Spirit will "just know" how to insert into ElemParseData. See it live on Coliru

2. The same using semantic actions

You can wire it up in semantic actions:

    start = *(  
               vertex   [phx::bind(insert, _val, _1)] 
             | elements [phx::bind(insert, _val, _1)]
             );

With insert a member of type inserter:

struct inserter {
    template <typename,typename> struct result { typedef void type; };

    template <typename Attr, typename Vec>
        void operator()(Attr& attr, Vec const& v) const { dispatch(attr, v); }
    private:
    static void dispatch(ElemParseData& data, std::vector<float> vertices) {
        data.verts.insert(data.verts.end(), vertices.begin(), vertices.end());
    }
    static void dispatch(ElemParseData& data, std::vector<unsigned int> indices) {
        data.idx.insert(data.idx.end(), indices.begin(), indices.end());
    }
};

This looks largely the same, and it does the same: live on Coliru

3. Everything in semantic actions, at detail level

This is the only solution that doesn't require any kind of plumbing, except perhaps inclusion of boost/spirit/include/phoenix.hpp:

struct objGram : qi::grammar<std::string::const_iterator, ElemParseData(), iso8859::space_type>
{
    objGram() : objGram::base_type(start)
    {
        using namespace qi;

        auto add_vertex = phx::push_back(phx::bind(&ElemParseData::verts, _r1), _1);
        auto add_index  = phx::push_back(phx::bind(&ElemParseData::idx,   _r1), _1);
        vertex   = 'v' >> double_ [add_vertex] >> double_ [add_vertex] >> double_ [add_vertex];
        elements = 'f' >> int_    [add_index]  >> int_    [add_index]  >> int_    [add_index] ;

        start = *(vertex(_val) | elements(_val));
    }

    qi::rule<std::string::const_iterator, ElemParseData(), iso8859::space_type> start;
    qi::rule<std::string::const_iterator, void(ElemParseData&), iso8859::space_type> vertex, elements;
} objGrammar;

Note:

  • One slight advantage here would be that there is less copying of values
  • A disadvantage is that you lose 'atomicity' (if a line fails to parse after, say, the second value, the first two values will have been pushed into the ElemParseData members irrevocably).

Side note

There is a bug in the read loop, prefer the simpler options:

std::filebuf fb;
if (fb.open("parsetest.txt", std::ios::in))
{
    ss << &fb;
    fb.close();
}

Or consider boost::spirit::istream_iterator