I wish to split a string on a single character or a string. I would like to use boost::split
since boost string is our standard for basic string handling (I don't wish to mix several techniques).
In the single character case I could do split(vec,str,is_any_of(":"))
but I'd like to know if there is a way to specify just a single character. It may improve performance, but more importantly I think the code would be clearer with just a single character, since is_any_of conveys a different meaning than what I want.
For matching against a string I don't know what syntax to use. I don't wish to construct a regex; some simple syntax like split(vec,str,match_str("::"))
would be good.
In the following code, I assume using namespace boost for brevity.
As for splitting on a character, if only algorithm/string is allowed, is_from_range might serve the purpose:
split(vec,str, is_from_range(':',':'));
Alternatively, if Boost.Lambda is allowed:
split(vec,str, lambda::_1 == ':');
or if preparing a dedicated predicate is allowed:
struct match_char {
char c;
match_char(char c) : c(c) {}
bool operator()(char x) const { return x == c; }
};
split(vec,str, match_char(':'));
As for matching against a string, as David Rodríguez mentioned, there seems to be no way to do it with split.
If iter_split is allowed, the following code will probably serve the purpose:
iter_split(vec,str, first_finder("::"));
I was looking for the same answer but I couldn't find one. Finally I managed to produce one on my own.
You can use std::equal_to to form the predicate you need. Here's an example:
boost::split(container, str, std::bind1st(std::equal_to<char>(), ','));
This is exactly how I do it when I need to split a string using a single character.
On the simple token, I would just leave is_any_of as it is, since it is quite easy to understand what is_any_of( single_option ) means. If you really feel like changing it, the third argument is a functor, so you could pass an equals functor to the split function.
That approach will not really work with multi-character tokens, as the iteration is performed character by character. I don't know the library well enough to offer prebuilt alternatives, but you can implement the functionality on top of split_iterators.