How do I parse out the fields in a comma separated

Posted 2019-02-01 01:58

I have a comma separated string which might contain empty fields. For example:

1,2,,4

Using a basic

sscanf(string,"%[^,],%[^,],%[^,],%[^,],%[^,]", &val1, &val2, &val3, &val4, &val5);

I get all the values prior to the empty field, and unexpected results from the empty field onwards.

When I remove the expression for the empty field from the sscanf(),

sscanf(string,"%[^,],%[^,],,%[^,],%[^,]", &val1, &val2, &val3, &val4);

everything works out fine.

Since I don't know when I'm going to get an empty field, is there a way to rewrite the expression to handle empty fields elegantly?

10 answers
ゆ 、 Hurt°
#2 · 2019-02-01 02:43

I modified the code for tab-delimited (TSV) input; hopefully it helps:

// rm token_tab; gcc -Wall -O3 -o token_tab token_tab.c; ./token_tab
#include <stdio.h>
#include <string.h>   // strsep (BSD/glibc extension, not ISO C)

int main(void)
{
//  char str[] = " 1     2 x         text   4 ";
    char str[] = " 1\t 2 x\t\t text\t4 ";
    char *s1;
    char *s2 = str;   // cursor into str; strsep sets it to NULL after the last field

    do {
        while (*s2 == ' ') s2++;      // skip leading spaces (tab is the delimiter)
        s1 = strsep(&s2, "\t");       // cut out the next field, keeping empty ones
        if (!*s1) {
            printf("val: (empty)\n");
        }
        else {
            int val;
            char ch;
            // %c is only assigned if something non-blank follows the integer
            int ret = sscanf(s1, " %i %c", &val, &ch);
            if (ret != 1) {
                printf("val: (syntax error or string)=%s\n", s1);
            }
            else {
                printf("val: %i\n", val);
            }
        }
    } while (s2 != 0);
    return 0;
}

And the output:

val: 1
val: (syntax error or string)=2 x
val: (empty)
val: (syntax error or string)=text
val: 4
够拽才男人
#3 · 2019-02-01 02:48

Here is my version to scan comma-separated int values. The code detects empty and non-integer fields.

#include <stdio.h> 
#include <string.h> 

int main(){
  char str[] = " 1 , 2 x, , 4 ";
  printf("str: '%s'\n", str );

  for( char *s2 = str; s2; ){
    while( *s2 == ' ' || *s2 == '\t' ) s2++;
    char *s1 = strsep( &s2, "," );
    if( !*s1 ){
      printf("val: (empty)\n" );
    }
    else{
      int val;
      char ch;
      int ret = sscanf( s1, " %i %c", &val, &ch );
      if( ret != 1 ){
        printf("val: (syntax error)\n" );
      }
      else{
        printf("val: %i\n", val );
      }
    }
  }

  return 0;
}

Result:

str: ' 1 , 2 x, , 4 '
val: 1
val: (syntax error)
val: (empty)
val: 4
三岁会撩人
#4 · 2019-02-01 02:48

Put a '*' after the '%' to read a field without storing it (assignment suppression). You can also give a conversion a maximum field width; for example, '%3s' reads at most 3 characters.
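
As a rough illustration of both features, here is a minimal sketch. The helper names first_and_third and first_three_chars are made up for this example; only sscanf itself is standard.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads the first and third comma-separated integers
   from line and skips the second one with %*d.
   Returns the number of values stored (2 on success). */
int first_and_third(const char *line, int *first, int *third)
{
    assert(line != NULL && first != NULL && third != NULL);
    /* %*d parses an integer but neither stores nor counts it. */
    return sscanf(line, "%d,%*d,%d", first, third);
}

/* Hypothetical helper: copies at most 3 non-whitespace characters of text
   into out, which must hold at least 4 bytes (3 characters plus '\0'). */
int first_three_chars(const char *text, char out[4])
{
    assert(text != NULL && out != NULL);
    return sscanf(text, "%3s", out);
}
```

Note that suppression only skips a field that is actually present. With "1,,3" the %*d still has nothing to match, so first_and_third stores just the first value and returns 1.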

倾城 Initia
#5 · 2019-02-01 02:50

man sscanf:

[ Matches a nonempty sequence of characters from the specified set of accepted characters;

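(emphasis added on "nonempty": %[^,] cannot match an empty field, so the conversion fails at that point and sscanf stops.)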
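
The effect is easy to see from sscanf's return value. This is a small sketch; scan_four_fields is a hypothetical wrapper, and the 16-byte buffers with a width of 15 are arbitrary choices for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: tries to read four comma-separated fields with %[^,]
   and returns sscanf's result, i.e. how many fields were stored.
   Each buffer must hold at least 16 bytes; the width 15 prevents overflow. */
int scan_four_fields(const char *line,
                     char f1[16], char f2[16], char f3[16], char f4[16])
{
    assert(line != NULL);
    /* Clear the buffers so untouched fields read back as empty strings. */
    f1[0] = f2[0] = f3[0] = f4[0] = '\0';
    return sscanf(line, "%15[^,],%15[^,],%15[^,],%15[^,]", f1, f2, f3, f4);
}
```

For "1,2,3,4" the call returns 4. For "1,2,,4" it returns 2: the third %[^,] finds a comma straight away, fails, and the fourth field is never reached. Checking the return value is therefore the only reliable way to notice the problem.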

Evening l夕情丶
#6 · 2019-02-01 02:52

This looks like you are currently dealing with CSV values. If you need to extend it to handle quoted strings (so that fields can contain commas, for example), you will find that the scanf-family can't handle all the complexities of the format. Thus, you will need to use code specifically designed to handle (your variant of) CSV-format.

You will find a discussion of a set of CSV library implementations, in C and C++, in 'The Practice of Programming'. No doubt there are many others available.
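
To give an idea of what such code has to do, here is a minimal sketch of a field reader that copes with double-quoted fields and doubled quotes (""). It is not taken from the book; next_csv_field is a hypothetical name, and it assumes a common CSV variant with no embedded newlines.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copies the next CSV field from *cursor into out
   (capacity cap, at least 1). Handles "quoted, fields" and "" escapes.
   Advances *cursor past the field and its trailing comma, or sets it to
   NULL after the last field. Overlong fields are silently truncated. */
void next_csv_field(const char **cursor, char *out, size_t cap)
{
    assert(cursor != NULL && *cursor != NULL && out != NULL && cap > 0);
    const char *p = *cursor;
    size_t n = 0;
    int quoted = 0;

    if (*p == '"') {            /* field starts with an opening quote */
        quoted = 1;
        p++;
    }
    while (*p != '\0') {
        if (quoted && *p == '"') {
            if (p[1] == '"') {
                p++;            /* "" inside quotes is one literal quote */
            } else {
                quoted = 0;     /* closing quote: do not copy it */
                p++;
                continue;
            }
        } else if (!quoted && *p == ',') {
            break;              /* unquoted comma ends the field */
        }
        if (n + 1 < cap)
            out[n++] = *p;
        p++;
    }
    out[n] = '\0';
    *cursor = (*p == ',') ? p + 1 : NULL;
}
```

Calling it repeatedly on a,"b,c",,"say ""hi""" until the cursor becomes NULL yields four fields: a, then b,c, then an empty field, then say "hi". A real CSV library also has to deal with line endings, malformed quotes and allocation, which this sketch ignores.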

Rolldiameter
#7 · 2019-02-01 02:53

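If you use strtok with the comma as your separator character you'll get a list of strings, but note that strtok treats a run of adjacent separators as a single separator, so empty fields are skipped rather than returned as zero-length strings. To keep them, split with strsep or scan for the commas yourself (strchr or strcspn).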

Have a look at my answer here for more information.
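
For comparison, here is a small sketch of how strtok treats the empty field in 1,2,,4, next to a splitter that keeps it. Both function names are hypothetical; strtok and strcspn are standard C. Note that strtok merges adjacent commas, so it never reports the empty field.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: counts the tokens strtok finds in buf.
   buf is modified in place (strtok writes '\0' over the separators). */
int count_strtok_tokens(char *buf)
{
    assert(buf != NULL);
    int count = 0;
    for (char *tok = strtok(buf, ","); tok != NULL; tok = strtok(NULL, ","))
        count++;
    return count;
}

/* Hypothetical helper: splits buf in place at every comma and stores up to
   max field pointers in fields. Empty fields are kept as "" strings.
   Returns the number of fields stored. */
int split_keep_empty(char *buf, char *fields[], int max)
{
    assert(buf != NULL && fields != NULL);
    int count = 0;
    char *p = buf;
    while (count < max) {
        fields[count++] = p;
        p += strcspn(p, ",");   /* advance to the next comma or to the end */
        if (*p == '\0')
            break;
        *p++ = '\0';            /* terminate this field, step past the comma */
    }
    return count;
}
```

On a writable copy of "1,2,,4", count_strtok_tokens returns 3 (the empty field vanishes), while split_keep_empty returns 4 with fields[2] pointing at an empty string.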
