Parsing comma-delimited numbers in C++

Posted 2020-04-21 05:18

Question:

I have a quick question for everyone. I'm trying to write some simple code to extract numbers from user input and save them to an int array, but I'm having a hard time wrapping my mind around how to make it work. The code shown below works well for single-digit numbers, but not for numbers with more than one digit.

For instance, if the user enters 1,2,3,4,50,60, here is what I get:

Enter numbers (must be comma delimited): 1,2,3,4,50,60
My numbers are: 12345060
My parsed numbers are: 1
My parsed numbers are: 2
My parsed numbers are: 3
My parsed numbers are: 4
My parsed numbers are: 5
My parsed numbers are: 0

Question: how can I modify this simple piece of code to accurately capture numbers with more than 1 digit? Thanks in advance!!

#include <algorithm>  // std::remove
#include <iostream>
#include <iomanip>
#include <string>
#include <sstream>
using namespace std;


// set up some variables
int numbers[100];


int main() {

// Enter numbers (comma delimited). Ex: 1,2,3,4,50,60<return>
cout << endl << endl << "Enter numbers (must be comma delimited): ";

string nums_in;
getline(cin, nums_in);
nums_in.erase(remove(nums_in.begin(), nums_in.end(), ','), nums_in.end());  // remove the unwanted commas

cout << "My numbers are: " << nums_in << endl;


// convert each char into int
for (int o = 0; o < 6; o++) {
    istringstream buf(nums_in.substr(o,1));
    buf >> numbers[o];
    cout << "My parsed numbers are: " << numbers[o] << endl;
}
cout << endl << endl;

cout << "Done." << endl;
return 0;

}

Answer 1:

In your program, you first remove the "unwanted" commas in the input string and then run into the problem that you cannot distinguish the numbers in the input line any more. So it seems as if these commas are not unwanted after all. The solution is to parse the string step by step without removing the commas first, as you need them to split up the input string. Here is an example.

#include <iostream>
#include <iomanip>
#include <string>
#include <sstream>
#include <vector>

int main() {

    // Enter numbers (comma delimited). Ex: 1,2,3,4,50,60<return>
    std::cout << std::endl << std::endl << "Enter numbers (must be comma delimited): ";
    std::string nums_in;
    std::getline(std::cin, nums_in);

    // Now parse
    std::vector<int> data;
    std::istringstream buf(nums_in);
    while (!buf.eof()) {
        int this_number;
        buf >> this_number;
        // fail() is set when no number could be parsed; bad() alone only reports I/O errors
        if (buf.fail()) {
            std::cerr << "Number formatting error!\n";
            return 1;
        }
        data.push_back(this_number);
        char comma = 0;
        buf >> comma;
        if (!buf.eof()) {
            if (buf.fail()) {
                std::cerr << "Could not read comma.\n";
                return 1;
            }
            if (comma!=',') {
                std::cerr << "Found no comma but '" << comma << "' instead !\n";
                return 1;
            }
        }
    }

    std::cout << "My numbers are:";
    for (auto a : data) {
        std::cout << " " << a;
    }
    std::cout << std::endl;

    std::cout << "Done." << std::endl;
    return 0;
}

Note that I did not use "using namespace std;" as it is considered bad style. Also, I used a C++11 range-based for loop to print the values and a vector to store the numbers: the fixed array of 100 ints in your code cannot safely hold a line with 200 numbers, so such input would lead to a crash (or other bad behavior). Finally, the parsing error handling is not yet complete. Making it complete and correct is left as an exercise. An alternative to the istringstream-based approach would be to first split the line at the commas and then read each number separately using its own istringstream.
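
The split-first alternative could look roughly like the sketch below. It uses std::getline with ',' as the delimiter to cut the line into fields, then parses each field with its own std::istringstream. The function name parse_comma_separated and the choice to throw std::invalid_argument on a bad field are assumptions made for illustration, not part of the answer itself.

```cpp
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: splits `line` at commas and parses each field as an int.
// Throws std::invalid_argument if a field is empty, non-numeric, or out of range.
// Note: a single trailing comma ("1,2,") is accepted, because std::getline
// produces no field after the last delimiter.
std::vector<int> parse_comma_separated(const std::string& line) {
    std::vector<int> result;
    std::istringstream fields(line);
    std::string field;
    while (std::getline(fields, field, ',')) {
        std::istringstream one(field);
        int value;
        // Read the number, then skip trailing whitespace;
        // anything left in the field after that is an error.
        if (!(one >> value) || !(one >> std::ws).eof()) {
            throw std::invalid_argument("bad field: '" + field + "'");
        }
        result.push_back(value);
    }
    return result;
}
```

Called with the string "1,2,3,4,50,60", this returns a vector holding the six numbers; an empty line gives an empty vector.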

By the way, your question is so practical that it would have been better suited for the standard stackexchange site - the connection to computer science is quite weak.



Answer 2:

To solve this kind of problem, you have to write a scanner. The scanner breaks the input into tokens. Once you can break the input into tokens, you can check the order of the tokens (see parsers).

In your case you have three tokens: number, comma and end. An example of valid input: number comma number end. Another example: end (empty input). An example of invalid input: number number end (there is no comma between numbers).

Below is a possible solution to your problem. get_token reads a token from the input and stores it in the token and number globals. get_numbers reads tokens, checks the syntax and stores the numbers in numbers; the count of numbers is stored in count (also global variables).

#include <iostream>
#include <cctype>

enum { max_count = 100 };
int numbers[max_count];
int count;

enum token_type
{
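  token_unknown,
  token_end,
  token_comma,
  token_number
};

token_type token;  // most recently scanned token
int number;        // value of the most recent token_number

token_type get_token()
{
  char c;

  // get a character, but skip whitespace except newline
  // (the cast avoids undefined behavior in <cctype> functions for negative char values)
  while ( std::cin.get( c ) && c != '\n' && std::isspace( static_cast<unsigned char>( c ) ) )
    ;
  if ( ! std::cin || c == '\n' )
    return token = token_end;

  if ( c == ',' )
    return token = token_comma;

  if ( std::isdigit( static_cast<unsigned char>( c ) ) )
  {
    // put the digit back and let operator>> read the whole number
    std::cin.unget();
    std::cin >> number;
    return token = token_number;
  }

  // record the unknown token too, so the global never holds a stale value
  return token = token_unknown;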
}

enum error_type
{
  error_ok,
  error_number_expected,
  error_too_many_numbers,
  error_comma_expected
};

int get_numbers()
{
  // read the first token
  if ( get_token() == token_end )
    return error_ok; // empty input

  while ( true )
  {
    // number expected
    if ( token != token_number )
      return error_number_expected;

    // store the number
    if ( count >= max_count )
      return error_too_many_numbers;
    numbers[count++] = number;

    // this might be the last number
    if ( get_token() == token_end )
      return error_ok;

    // not the last number, comma expected
    if ( token != token_comma )
      return error_comma_expected;

    // prepare next token
    get_token();
  }

}

int main()
{
  //...
  switch ( get_numbers() )
  {
  case error_ok: break;
  case error_comma_expected: std::cout << "comma expected"; return -1;
  case error_number_expected: std::cout << "number expected"; return -2;
  case error_too_many_numbers: std::cout << "too many numbers"; return -3;
  }

  //
  std::cout << count << " number(s): ";
  for ( int i = 0; i < count; ++i )
    std::cout << numbers[i] << ' ';
  //...
  return 0;
}


Answer 3:

This task can easily be done by using std::getline to read the entire line into a string and then parsing that string with a std::istringstream to extract the individual numbers and skip the commas.

#include <iostream>
#include <sstream>
#include <string>
#include <vector>

using std::cout;

int main() {   
    // Enter numbers (comma delimited). Ex: 1,2,3,4,50,60<return>
    cout << "\nEnter numbers (must be comma delimited): ";

    int x;
    std::vector<int> v;
    std::string str_in;

    // read the whole line then use a stringstream buffer to extract the numbers
    std::getline(std::cin, str_in);
    std::istringstream str_buf{str_in};

    while ( str_buf >> x ) {
        v.push_back(x);
        // If the next char in input is a comma, extract it. std::ws discards whitespace
        if ( ( str_buf >> std::ws).peek() == ',' ) 
            str_buf.ignore();
    }

    cout << "\nMy parsed numbers are:\n";
    for ( int i : v ) {
        cout << i << '\n';
    }

    cout << "\nDone.\n";

    return 0;

}


Answer 4:

Hm... How about parsing the string without removing the commas? Read the string character by character, placing each character in a temp buffer until you hit a comma or the end of the string, then convert the temp buffer to an int and store it in the vector. Empty the temp buffer and repeat.

#include <iostream>
#include <iomanip>
#include <string>
#include <vector>
#include <sstream>
using namespace std;


// set up some variables
vector<int> numbers(0);


int main() {

// Enter numbers (comma delimited). Ex: 1,2,3,4,50,60<return>
cout << endl << endl << "Enter numbers (must be comma delimited): ";

string nums_in;
getline(cin, nums_in);

cout << "My numbers are: " << nums_in << endl;

string s_tmp;

// A comma or the end of the input terminates the number collected so far.
for (string::size_type i = 0, len = nums_in.size(); i <= len; i++) {
    if (i == len || nums_in[i] == ',') {
        if (!s_tmp.empty()) {
            // std::stoi throws std::invalid_argument if s_tmp is not numeric
            numbers.push_back(std::stoi(s_tmp));
            s_tmp.clear();
        }
    }
    else {
        s_tmp += nums_in[i];
    }
}

for (int n : numbers) {
    cout << "My parsed numbers are: " << n << endl;
}

cout << endl << endl;

cout << "Done." << endl;
return 0;

}
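
One caveat with the conversion step above: std::stoi throws std::invalid_argument when the text holds no number and std::out_of_range when the value does not fit in an int, so input such as 1,abc,3 would terminate the program with an uncaught exception. A minimal sketch of a non-throwing wrapper follows; the name to_int_strict and its bool-plus-out-parameter interface are assumptions for illustration, not part of the answer above.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: converts `text` to an int and reports failure through
// the return value instead of an exception. `out` is only written on success.
bool to_int_strict(const std::string& text, int& out) {
    try {
        std::size_t used = 0;  // number of characters std::stoi consumed
        int value = std::stoi(text, &used);
        if (used != text.size()) {
            return false;  // trailing characters, e.g. "12abc" or "5 "
        }
        out = value;
        return true;
    } catch (const std::invalid_argument&) {
        return false;  // no digits at all, e.g. "" or "abc"
    } catch (const std::out_of_range&) {
        return false;  // value does not fit in an int
    }
}
```

In the loop, the direct std::stoi call could then be replaced by a call to this helper followed by a check of its return value, so that a bad field produces an error message rather than a crash.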


Tags: c++ parsing