Fast multi-replacement into string

2019-06-27 11:14发布

I have a string like the following:

{A}jahshs{b}jwuw{c}wuqjwhaha{d}{e}{f}jsj{g}

And I need to replace every {x} with a different string. The problem comes because this process will be repeated around 1000 times/second, so I need a optimized/fast way to do it.

Any idea? Boost replace? Boost format? Etc..

1条回答
Lonely孤独者°
2楼-- · 2019-06-27 12:11
  1. preallocate all buffers

    ....

  2. profit

Oh, and don't spam. Sample code in 5 10 minutes.

Okay here goes: also Live On Coliru

#include <string>
#include <sstream>
#include <boost/utility/string_ref.hpp>

template <typename Range>
int expand(Range const& /*key*/)
{
    return rand()%42; // todo lookup value with key (be sure to stay lean here)
}

#include <iostream>
int main()
{
    static const std::string msg_template = "{A}jahshs{b}jwuw{c}wuqjwhaha{d}{e}{f}jsj{g}\n";

    std::ostringstream builder;
    builder.str().reserve(1024); // reserve ample room, not crucial since we reuse it anyways

    for (size_t iterations = 1ul << 14; iterations; --iterations)
    {
        builder.str("");
        std::ostreambuf_iterator<char> out(builder);

        for(auto f(msg_template.begin()), l(msg_template.end()); f != l;)
        {
            switch(*f)
            {
                case '{' : 
                    {
                        auto s = ++f;
                        size_t n = 0;

                        while (f!=l && *f != '}')
                            ++f, ++n;

                        // key is [s,f] now
                        builder << expand(boost::string_ref(&*s, n));

                        if (f!=l)
                            ++f; // skip '}'
                    }
                    break;
                default:
                    *out++ = *f++;
            }
        }
        // to make it slow, uncomment:
        // std::cout << builder.str();
    }
}

This runs in ~0.239s on my system. Thats about 68k expansions/second. Oops. In release build it does 4 million expansions/second. On Coliru it reaches almost 1 million expansions/second.

Room for improvement:

  • you can prevalidate the input
  • if you know the parameter keys are always 1 letter, then you can simply by replacing the string_ref with the char, and by not looping for the '}'.
  • you can precalculate the indicecs of arguments and jump ahead. The benefit here is not so certain (sequential memory access is very good on some processors and the naive approach may be faster)
查看更多
登录 后发表回答