I have defined a rule for an identifier: start with an alpha character, followed by any number of alpha-numeric characters. I get different results when I parse directly into a std::string versus an adapted struct containing a single std::string.
If the attribute for my grammar is std::string, Qi will correctly adapt the sequence of characters into it. But with the struct, only the first character is stored. I'm not quite sure why this is. (Note that it makes no difference if the struct is "truly" adapted, or if it was defined by Fusion inline.)
Here's a SSCCE, configurable to debug:
// Options:
//#define DEFINE_STRUCT_INLINE
//#define DEBUG_RULE

#define BOOST_SPIRIT_USE_PHOENIX_V3
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/fusion/adapted/struct/define_struct_inline.hpp>
#include <boost/fusion/include/define_struct_inline.hpp>
#include <boost/fusion/adapted/struct/adapt_struct.hpp>
#include <boost/fusion/include/adapt_struct.hpp>

#include <iostream>
#include <string>

namespace qi = boost::spirit::qi;

#ifdef DEFINE_STRUCT_INLINE
    namespace example
    {
        BOOST_FUSION_DEFINE_STRUCT_INLINE(
            identifier_result,
            (std::string, name)
        )
    }
#else
    namespace example
    {
        struct identifier_result
        {
            std::string name;
        };
    }

    BOOST_FUSION_ADAPT_STRUCT(
        example::identifier_result,
        (std::string, name)
    )
#endif

namespace example
{
    typedef std::string identifier_result_str;

    template <typename Iterator, typename Result>
    struct identifier_parser : qi::grammar<Iterator, Result()>
    {
        identifier_parser() :
            identifier_parser::base_type(identifier, "identifier_parser")
        {
            identifier %=
                    qi::alpha >>
                    *qi::alnum
                    ;

            identifier.name("identifier");

            #ifdef DEBUG_RULE
                debug(identifier);
            #endif
        }

        qi::rule<Iterator, Result()> identifier;
    };
}

std::string strip(example::identifier_result identifier)
{
    return identifier.name;
}

std::string strip(std::string str)
{
    return str;
}

template <typename Result>
void test_parse(const std::string& input)
{
    using namespace example;

    auto&& first = input.cbegin();
    auto&& last = input.cend();

    auto&& parser = identifier_parser<std::string::const_iterator, Result>();
    auto&& skipper = qi::space;

    Result result;
    qi::phrase_parse(first, last, parser, skipper, result);

    std::cout << "Result of the parse is: \'"
              << strip(result) << "\'" << std::endl;
}

int main()
{
    using namespace example;

    test_parse<identifier_result>(" validId1 ");
    test_parse<identifier_result>(" %error1% ");

    test_parse<identifier_result_str>(" validId2 ");
    test_parse<identifier_result_str>(" %error2% ");
}
The output is:
Result of the parse is: 'v'
Result of the parse is: ''
Result of the parse is: 'validId2'
Result of the parse is: ''
As expected, both error cases don't match. But in the first case, my struct only captures the first character. I'd like to keep the struct for organization purposes.
If I debug the node, I get this output:
<identifier>
<try>validId1 </try>
<success> </success>
<attributes>[[[v]]]</attributes>
</identifier>
[ ... ]
<identifier>
<try>validId2 </try>
<success> </success>
<attributes>[[v, a, l, i, d, I, d, 2]]</attributes>
</identifier>
So I can see the rule is consuming the entire identifier; it just isn't storing it correctly. The only "hint" I have at the difference is that the v in the first case is nested within [[[.]]], while the correct case is only [[.]]. But I don't know what to do with it. :)
Why does this behavior occur?
Just to get you going, you have to wrap your string in an extra rule.
I don't know the exact explanation, but what you want to do is parse a string with a sequence of char parsers. With string as the attribute type, Qi is able to use the attribute as a container to store several chars; with a struct it just doesn't know how to do this. Maybe it would help to give the struct container properties, but I've no experience here. And for just parsing a string that might be overkill.
Just altering your parser helps here: