C++ Extract number from the middle of a string

2019-01-18 09:12发布

I have a vector containing strings that follow the format of text_number-number

Eg: Example_45-3

I only want the first number (45 in the example) and nothing else which I am able to do with my current code:

std::vector<std::string> imgNumStrVec;
for(size_t i = 0; i < StrVec.size(); i++){
    std::vector<std::string> seglist;
    std::stringstream ss(StrVec[i]);
    std::string seg, seg2;
    while(std::getline(ss, seg, '_')) seglist.push_back(seg);
    std::stringstream ss2(seglist[1]);
    std::getline(ss2, seg2, '-');
    imgNumStrVec.push_back(seg2); 
}

Are there more streamlined and simpler ways of doing this? and if so what are they?

I ask purely out of desire to learn how to code better as at the end of the day, the code above does successfully extract just the first number, but it seems long winded and round-about.

8条回答
地球回转人心会变
2楼-- · 2019-01-18 09:42

Using @Pixelchemist's answer and e.g. std::stoul:

bool getFirstNumber(std::string const & a_str, unsigned long & a_outVal)
{
    auto pos = a_str.find_first_of("0123456789");

    try
    {
        if (std::string::npos != pos)
        {
            a_outVal = std::stoul(a_str.substr(pos));

            return true;
        }
    }
    catch (...)
    {
        // handle conversion failure
        // ...
    }

    return false;
}
查看更多
The star\"
3楼-- · 2019-01-18 09:50

This should be more efficient than Ashot Khachatryan's solution. Note the use of '_' and '-' instead of "_" and "-". And also, the starting position of the search for '-'.

inline std::string mid_num_str(const std::string& s) {
    std::string::size_type p  = s.find('_');
    std::string::size_type pp = s.find('-', p + 2); 
    return s.substr(p + 1, pp - p - 1);
}

If you need a number instead of a string, like what Alexandr Lapenkov's solution has done, you may also want to try the following:

inline long mid_num(const std::string& s) {
    return std::strtol(&s[s.find('_') + 1], nullptr, 10);
}
查看更多
Ridiculous、
4楼-- · 2019-01-18 09:59

I can think of two ways of doing it:

  • Use regular expressions
  • Use an iterator to step through the string, and copy each consecutive digit to a temporary buffer. Break when it reaches an unreasonable length or on the first non-digit after a string of consecutive digits. Then you have a string of digits that you can easily convert.
查看更多
Summer. ? 凉城
5楼-- · 2019-01-18 09:59

The 'best' way to do this in C++11 and later is probably using regular expressions, which combine high expressiveness and high performance when the test is repeated often enough.

The following code demonstrates the basics. You should #include <regex> for it to work.

// The example inputs
std::vector<std::string> inputs {
    "Example_0-0", "Example_0-1", "Example_0-2", "Example_0-3", "Example_0-4",
    "Example_1-0", "Example_1-1", "Example_1-2", "Example_1-3", "Example_1-4"
};

// The regular expression. A lot of the cost is incurred when building the
// std::regex object, but when it's reused a lot that cost is amortised.
std::regex imgNumRegex { "^[^_]+_([[:digit:]]+)-([[:digit:]]+)$" };

for (const auto &input: inputs){
    // This wil contain the match results. Parts of the regular expression
    // enclosed in parentheses will be stored here, so in this case: both numbers
    std::smatch matchResults;

    if (!std::regex_match(input, matchResults, imgNumRegex)) {
        // Handle failure to match
        abort();
    }

    // Note that the first match is in str(1). str(0) contains the whole string
    std::string theFirstNumber = matchResults.str(1);
    std::string theSecondNumber = matchResults.str(2);

    std::cout << "The input had numbers " << theFirstNumber;
    std::cout << " and " << theSecondNumber << std::endl;
}
查看更多
神经病院院长
6楼-- · 2019-01-18 10:01

updated for C++11 2018-12-04

I tried to update this answer to use C++11 on my machine but failed because my g++ compiler does not have full <regex> support... so I kept getting uncaught std::regex_error code=4 (i.e. "missing bracket") exceptions for any regex with bracket character classes std::regex("[0-9]").

Apparently full support for C++11 <regex> was implemented and released for g++ version 4.9.x and on Jun 26, 2015. Hat tip to SO questions #1 and #2 for figuring out the compiler version needing to be 4.9.x.

Here is the C++11 code that should work but I wasn't able to test it:

#include <iostream>
#include <string>
#include <regex>

using std::cout;
using std::endl;

int main() {
    std::string input = "Example_45-3";
    std::string output = std::regex_replace(
        input,
        std::regex("[^0-9]*([0-9]+).*"),
        std::string("\\1")
        );
    cout << input << endl;
    cout << output << endl;
}

boost solution that only requires C++98

Minimal implementation example that works on many strings (not just strings of the form "text_45-text":

#include <iostream>
#include <string>
using namespace std;
#include <boost/regex.hpp>

int main() {
    string input = "Example_45-3";
    string output = boost::regex_replace(
        input,
        boost::regex("[^0-9]*([0-9]+).*"),
        string("\\1")
        );
    cout << input << endl;
    cout << output << endl;
}

console output:

Example_45-3
45

Other example strings that this would work on:

  • "asdfasdf 45 sdfsdf"
  • "X = 45, sdfsdf"

For this example I used g++ on Linux with #include <boost/regex.hpp> and -lboost_regex. You could also use C++11x regex.

Feel free to edit my solution if you have a better regex.


Commentary:

If there aren't performance constraints, using Regex is ideal for this sort of thing because you aren't reinventing the wheel (by writing a bunch of string parsing code which takes time to write/test-fully).

Additionally if/when your strings become more complex or have more varied patterns regex easily accommodates the complexity. (The question's example pattern is easy enough. But often times a more complex pattern would take 10-100+ lines of code when a one line regex would do the same.)

查看更多
手持菜刀,她持情操
7楼-- · 2019-01-18 10:01

You can also use the built in find_first_of and find_first_not_of to find the first "numberstring" in any string.

std::string first_numberstring(std::string const & str)
{
  std::size_t const n = str.find_first_of("0123456789");
  if (n != std::string::npos)
  {
    std::size_t const m = str.find_first_not_of("0123456789", n);
    return str.substr(n, m != std::string::npos ? m-n : m);
  }
  return std::string();
}
查看更多
登录 后发表回答