Constraining the existing Boost.Spirit real_parser

Posted 2019-01-12 08:20

Question:

I want to parse a float but reject NaN values, so I define a policy that inherits from the default policy and create a real_parser with it:

// using boost::spirit::qi::{real_parser,real_policies,
//                           phrase_parse,double_,char_};

template <typename T>
struct no_nan_policy : real_policies<T>
{
    template <typename I, typename A>
    static bool
    parse_nan(I&, I const&, A&) {
        return false;
    }
};

real_parser<double, no_nan_policy<double> > no_nan;

// then I can use no_nan to parse, as in the following grammar
bool ok = phrase_parse(first, last, 
   no_nan[ref(valA) = _1] >> char_('@') >> double_[ref(b) = _1],
space);

But now I also want to ensure that the overall length of the string parsed with no_nan does not exceed 4 characters: "1.23", ".123", "2.e6" and even "inf" are OK, but "3.2323" is not, and neither is "nan". I cannot do that in the parse_n/parse_frac_n hooks of the policy: they look at the digits to the left and right of the dot separately and cannot communicate (at least not cleanly), which they would have to, since it is the overall length that matters.

The idea then was to extend real_parser (in boost/spirit/home/qi/numeric/real.hpp) and wrap the parse method -- but this class has no methods. Next to real_parser is the any_real_parser struct which does have parse, but these two structs do not seem to interact in any obvious way.

Is there a way to easily inject my own parse(), do some pre-checks, and then call the real parse (return boost::spirit::qi::any_real_parser<T, RealPolicy>::parse(...)) which then adheres to the given policies? Writing a new parser would be a last-resort method, but I hope there is a better way.

(Using Boost 1.55, i.e. Spirit 2.5.2, with C++11)

Answer 1:

"It seems I am so close, i.e. just a few changes to the double_ parser and I'd be done. This would probably be a lot more maintainable than adding a new grammar, since all the other parsing is done that way." – toting (comment)

Even more maintainable would be to not write another parser at all.

You basically want to parse a floating-point number (Spirit has got you covered) but apply some validations afterward. I'd do the validations in a semantic action:

raw [ double_ [_val = _1] ] [ _pass = !isnan_(_val) && px::size(_1)<=4 ]

That's it.

Explanations

Anatomy:

  • double_ [_val = _1] parses a double and assigns it to the exposed attribute as usual¹
  • raw [ parser ] matches the enclosed parser but exposes the raw source iterator range as an attribute
  • [ _pass = !isnan_(_val) && px::size(_1)<=4 ] - the business part!

    This semantic action attaches to the raw[] parser. Hence

    • _1 now refers to the raw iterator range that double_ has just matched
    • _val already contains the "cooked" value of a successful match of double_
    • _pass is a Spirit context flag that we can set to false to make parsing fail.
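As a rough analogy outside Spirit, the same three roles can be sketched in plain C++. This is only an illustration under stated assumptions: parse_short_real is a hypothetical helper (not part of Spirit), std::strtod stands in for double_, the consumed character count stands in for the raw[] range, and the returned bool stands in for _pass. Note that strtod is more permissive than double_ in places (it skips leading whitespace and accepts hex floats), so this is not a drop-in replacement.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper: parse a real number from the start of `s`,
// then validate the match the same way the semantic action does.
bool parse_short_real(const std::string& s, double& value, std::size_t max_len = 4)
{
    assert(max_len > 0);  // a zero-length limit could never accept anything

    const char* begin = s.c_str();
    char* end = nullptr;

    // Role of double_: produce the "cooked" value.
    const double cooked = std::strtod(begin, &end);

    // Role of raw[]: know how much input the match covered.
    const std::size_t consumed = static_cast<std::size_t>(end - begin);

    // Role of _pass: veto a syntactically valid match after the fact.
    const bool pass = consumed > 0 && !std::isnan(cooked) && consumed <= max_len;

    if (pass)
        value = cooked;  // only expose the value when validation succeeded
    return pass;
}
```

The point of the sketch is the ordering: the number is parsed in full first, and the length and NaN checks only look at the finished match, so no parser internals need to be modified.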

Now the only thing left is to tie it all together. Here px is shorthand for namespace boost::phoenix. Let's make a deferred version of ::isnan:

boost::phoenix::function<decltype(&::isnan)> isnan_(&::isnan);

We're good to go.

Test Program

#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <cmath>
#include <iostream>
#include <string>

int main ()
{
    using It = std::string::const_iterator;

    auto my_fpnumber = [] { // TODO encapsulate in a grammar struct
        using namespace boost::spirit::qi;
        using boost::phoenix::size;

        static boost::phoenix::function<decltype(&::isnan)> isnan_(&::isnan);

        return rule<It, double()> (
                raw [ double_ [_val = _1] ] [ _pass = !isnan_(_val) && size(_1)<=4 ]
            );
    }();

    for (std::string const s: { "1.23", ".123", "2.e6", "inf", "3.2323", "nan" })
    {
        It f = s.begin(), l = s.end();

        double result;
        if (parse(f, l, my_fpnumber, result))
            std::cout << "Parse success:  '" << s << "' -> " << result << "\n";
        else
            std::cout << "Parse rejected: '" << s << "' at '" << std::string(f,l) << "'\n";
    }
}

Prints

Parse success:  '1.23' -> 1.23
Parse success:  '.123' -> 0.123
Parse success:  '2.e6' -> 2e+06
Parse success:  'inf' -> inf
Parse rejected: '3.2323' at '3.2323'
Parse rejected: 'nan' at 'nan'

¹ The assignment has to be done explicitly here because we use semantic actions, and they normally suppress automatic attribute propagation.