My code looks roughly like this:
static int myfunc(const string& stringInput)
{
string word;
stringstream ss;
ss << stringInput;
while(ss >> word)
{
++counters[word];
}
...
}
The purpose is to read each whitespace-separated word of the input string into the string variable word, but this seems to involve a lot of overhead: the input string is copied into a string stream, and each word is then read from the stream into the target string.
Is there a more elegant way to accomplish the same purpose?
You are asking how to split a string. Boost has a helpful utility for this, boost::split():
http://www.boost.org/doc/libs/1_48_0/doc/html/string_algo/usage.html#id3115768
Here's an example that puts the resulting words into a vector:
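#include <boost/algorithm/string.hpp>
#include <string>
#include <vector>
std::vector<std::string> strs;
// Adjacent delimiters yield empty strings unless boost::token_compress_on is passed as a fourth argument.
boost::split(strs, "string to split", boost::is_any_of("\t "));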
In standard C++, using a stringstream:
#include <sstream>
#include <string>
#include <vector>
using namespace std;
string diskNames="vbbc anmnsa mansdmns";
string temp;
vector <string> cds;
stringstream s (diskNames);
while(s>> temp)
cds.push_back(temp);
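The fragment above can be wrapped in a function so it compiles on its own. This is only a minimal sketch: the name split_words is hypothetical, and it uses istringstream rather than stringstream because the stream is only read from.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): splits `input`
// on runs of whitespace and returns the words in order.
std::vector<std::string> split_words(const std::string& input)
{
    std::istringstream stream(input);  // the stream holds its own copy of input
    std::vector<std::string> words;
    std::string word;
    while (stream >> word)             // operator>> skips leading whitespace
        words.push_back(word);
    return words;
}
```

Calling split_words("vbbc anmnsa mansdmns") should give a vector of three strings. Repeated or trailing spaces do not produce empty entries, because operator>> skips whitespace before each word.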
Use stream iterators and a standard function:
static int myfunc(std::string const& stringInput)
{
std::stringstream ss(stringInput);
std::for_each(std::istream_iterator<std::string>(ss),
std::istream_iterator<std::string>(),
[&counters](std::string const& word) { ++counters[word];}
);
...
}
If your compiler does not support lambdas, use a function object instead:
struct Helper
{
void operator()(std::string const& word) const {++counters[word];}
Helper(CounterType& c) : counters(c) {}
CounterType& counters;
};
static int myfunc(std::string const& stringInput)
{
std::stringstream ss(stringInput);
std::for_each(std::istream_iterator<std::string>(ss),
std::istream_iterator<std::string>(),
Helper(counters)
);
...
}
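For reference, here is a self-contained sketch of the stream-iterator approach. The question does not show how counters is declared, so this version assumes CounterType is a std::map<std::string, int> and passes it in as a parameter. The function name count_words is hypothetical as well.

```cpp
#include <algorithm>
#include <iterator>
#include <map>
#include <sstream>
#include <string>

// Assumption: the original CounterType is not shown; a map from word to count is used here.
typedef std::map<std::string, int> CounterType;

// Hypothetical helper: increments counters[word] for every
// whitespace-separated word in `input`.
void count_words(const std::string& input, CounterType& counters)
{
    std::istringstream ss(input);
    std::for_each(std::istream_iterator<std::string>(ss),
                  std::istream_iterator<std::string>(),  // default-constructed = end of stream
                  [&counters](const std::string& word) { ++counters[word]; });
}
```

Passing the map as a parameter also sidesteps a subtlety: a lambda can only name local variables in its capture list, so [&counters] would not be valid if counters were a global or static variable.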
Maybe use istringstream, which can be constructed directly from the string:
istringstream ss(stringInput); // initialize with the string
In Visual C++ 11 you can use regex_token_iterator from TR1.
sregex_token_iterator::regex_type white_space_separators("[[:space:]]+",regex_constants::optimize);
for(sregex_token_iterator i(s.begin(),s.end(),white_space_separators,-1),end; i!=end; ++i)
{
cout << *i << endl;
// or use i->first and i->second (iterators into s) to avoid copying the token
}
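A complete version of the regex approach might look like the sketch below. It assumes a C++11 standard library, where the regex classes live in namespace std, and the function name split_regex is hypothetical. Note that with submatch index -1, leading whitespace in the input produces an empty first token, so the sketch filters empty tokens out.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: splits `s` on runs of whitespace using a regex.
std::vector<std::string> split_regex(const std::string& s)
{
    static const std::regex separators(
        "[[:space:]]+",
        std::regex_constants::ECMAScript | std::regex_constants::optimize);

    std::vector<std::string> tokens;
    // Submatch index -1 selects the text between separator matches.
    for (std::sregex_token_iterator it(s.begin(), s.end(), separators, -1), end;
         it != end; ++it)
    {
        if (it->length() > 0)            // skip the empty token before leading whitespace
            tokens.push_back(it->str());
    }
    return tokens;
}
```

This is usually slower than the stream-based versions, so it mostly pays off when the separator is more complex than plain whitespace.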
If you are concerned about performance (and overheads such as string copying), you can write your own routine:
#include <ctype.h>
#include <string>
#include <iostream>
using namespace std;
int main()
{
string s = "Text for tokenization ";
const char *start = s.c_str();
const char *end = start + s.size();
const char *token = start;
while (start!=end)
{
if(isspace(static_cast<unsigned char>(*start))) // cast avoids undefined behavior for negative char values
{
if (token < start)
{
// Instead of constructing a string, you can
// just use the [token,start) part of the input buffer
cout << string(token,start) << ' ';
}
start++;
token = start;
}
else
{
start++;
}
}
if (token < start)
{
cout << string(token,start) << ' ';
}
}
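The same idea can be packaged so that the caller decides what to do with each token and no string is constructed unless it is needed. This is a sketch only: for_each_token and its callback signature are hypothetical names, not part of any library.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: calls on_token(first, last) for every
// whitespace-delimited token, where [first, last) points into s.
// The pointers are only valid while s is alive and unmodified.
template <typename Callback>
void for_each_token(const std::string& s, Callback on_token)
{
    const char* cur = s.data();
    const char* const end = cur + s.size();
    while (cur != end)
    {
        // Skip any whitespace before the next token.
        while (cur != end && std::isspace(static_cast<unsigned char>(*cur)))
            ++cur;
        const char* token = cur;
        // Advance to the end of the token.
        while (cur != end && !std::isspace(static_cast<unsigned char>(*cur)))
            ++cur;
        if (token != cur)
            on_token(token, cur);
    }
}
```

A caller that only needs to count words or compare tokens can work directly on the pointer pair. One that needs to keep the words can build a std::string(first, last) inside the callback, which brings back the copy but only where it is wanted.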