Split string with delimiters in C

Posted 2018-12-31 02:17

How do I write a function to split and return an array for a string with delimiters in the C programming language?

char* str = "JAN,FEB,MAR,APR,MAY,JUN,JUL,AUG,SEP,OCT,NOV,DEC";
str_split(str,',');

Tags: c string split
25 answers
美炸的是我
#2 · 2018-12-31 02:42

My version:

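#include <stdlib.h>
#include <string.h>

/* Splits str on delimiter into freshly allocated strings stored in *args.
   Leading, trailing, and repeated delimiters are skipped.
   Returns the token count (0 if there are no tokens), or -1 if allocation fails. */
int split(const char* str, const char delimiter, char*** args) {
    int cnt = 0;
    const char* t = str;

    *args = NULL;
    while (*t == delimiter) t++;        /* skip leading delimiters */

    /* a token starts at every non-delimiter that opens the string or follows a delimiter */
    for (const char* t2 = t; *t2 != 0; t2++)
        if (*t2 != delimiter && (t2 == t || *(t2 - 1) == delimiter)) cnt++;
    if (cnt == 0) return 0;

    (*args) = malloc(sizeof(char*) * cnt);
    if (*args == NULL) return -1;

    for (int i = 0; i < cnt; i++) {
        const char* ts = t;
        while (*t != delimiter && *t != 0) t++;

        size_t len = (size_t)(t - ts) + 1;      /* token length plus terminator */
        (*args)[i] = malloc(len);
        if ((*args)[i] == NULL) {               /* undo the partial result */
            while (i > 0) free((*args)[--i]);
            free(*args);
            *args = NULL;
            return -1;
        }
        memcpy((*args)[i], ts, len - 1);
        (*args)[i][len - 1] = 0;

        while (*t == delimiter) t++;    /* skip delimiters before the next token */
    }

    return cnt;
}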
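
Since split() above allocates every token plus the pointer array, the caller has to release them all. A minimal sketch of a matching cleanup helper; free_tokens is a hypothetical name that is not part of the answer, and it assumes the array and count came from one successful split() call:

```c
#include <stdlib.h>

/* Hypothetical companion to split(): frees each token, then the array.
   Accepts NULL so it is safe to call after a failed or empty split.
   Returns NULL so the caller can reset its pointer in one statement. */
char **free_tokens(char **args, int cnt) {
    if (args != NULL) {
        for (int i = 0; i < cnt; i++)
            free(args[i]);
        free(args);
    }
    return NULL;
}
```

A call would look like `args = free_tokens(args, cnt);` once the tokens are no longer needed.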
余生无你
#3 · 2018-12-31 02:42

This is a different approach that works for large files too: https://onlinegdb.com/BJlWVdzGf

无与为乐者.
#4 · 2018-12-31 02:45

My code (tested):

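#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Splits str in place with strtok; the pointers in *array point into str.
   Returns 1 on success, 0 if memory runs out. The caller frees *array. */
int dtmsplit(char *str, const char *delim, char ***array, int *length) {
    int i = 0;
    char *token;
    char **res = NULL;

    *array = NULL;
    *length = 0;

    /* get the first token */
    token = strtok(str, delim);
    while (token != NULL) {
        /* grow the array by one slot; realloc leaves the old block alone on failure */
        char **tmp = realloc(res, (i + 1) * sizeof(char *));
        if (tmp == NULL) {
            free(res);
            return 0;
        }
        res = tmp;
        res[i] = token;
        i++;
        token = strtok(NULL, delim);
    }
    *array = res;
    *length = i;
    return 1;
}

int main(void)
{
    int i;
    char **arr = NULL;
    int count = 0;

    /* strtok writes into the buffer, so use an array, not a string literal */
    char str[80] = "JAN,FEB,MAR,APR,MAY,JUN,JUL,AUG,SEP,OCT,NOV,DEC";
    if (!dtmsplit(str, ",", &arr, &count))
        return 1;
    printf("Found %d tokens.\n", count);

    for (i = 0; i < count; i++)
        printf("string #%d: %s\n", i, arr[i]);

    free(arr);
    return 0;
}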

Result:

Found 12 tokens.
string #0: JAN
string #1: FEB
string #2: MAR
string #3: APR
string #4: MAY
string #5: JUN
string #6: JUL
string #7: AUG
string #8: SEP
string #9: OCT
string #10: NOV
string #11: DEC
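
One caveat with this approach: strtok writes NUL bytes into its input, and the returned pointers refer to that same buffer, so it can't be used on a string literal like the one in the question. A rough sketch of one way around that, working on a private copy; split_copy is a hypothetical name and the single-allocation layout is just one possible design:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits a read-only string on any character in delim.
   The pointer array and a private copy of the text share one allocation,
   so a single free() on the result releases everything.
   The array is NULL-terminated and *count receives the number of tokens.
   Returns NULL if the allocation fails. */
char **split_copy(const char *input, const char *delim, size_t *count) {
    size_t len = strlen(input);
    /* strtok tokens are non-empty and separated by at least one delimiter,
       so len characters can hold at most (len + 1) / 2 of them */
    size_t max_tokens = (len + 1) / 2;
    size_t header = (max_tokens + 1) * sizeof(char *);

    char **tokens = malloc(header + len + 1);
    if (tokens == NULL) return NULL;

    char *copy = (char *)tokens + header;   /* text lives after the pointers */
    memcpy(copy, input, len + 1);

    size_t n = 0;
    for (char *tok = strtok(copy, delim); tok != NULL; tok = strtok(NULL, delim))
        tokens[n++] = tok;
    tokens[n] = NULL;

    *count = n;
    return tokens;
}
```

Called as `char **months = split_copy(str, ",", &n);`, it leaves str untouched, and `free(months)` releases both the pointers and the text.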
泛滥B
#5 · 2018-12-31 02:45

This may serve your purpose:

#include <ctype.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    int i = 0, j = 0, k = 0;
    char name[] = "jrSmith-Rock";
    int length = strlen(name);
    char store[100][100];   /* room for 100 words of up to 99 letters each */

    while (i < length) {
        if (isalpha((unsigned char)name[i])) {
            store[j][k] = name[i];
            k++;
            i++;
        }
        else {
            /* skip the run of separators, stopping at the end of the string */
            while (i < length && !isalpha((unsigned char)name[i])) {
                i++;
            }
            store[j][k] = '\0';     /* terminate the word just finished */
            j++;
            k = 0;
        }
    }
    store[j][k] = '\0';             /* terminate the last word */

    for (i = 0; i <= j; i++) {
        printf("%s\n", store[i]);
    }
    return 0;
}

Output:

jrSmith
Rock
孤独寂梦人
#6 · 2018-12-31 02:46

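Not tested, but should give you a good head start on how it should work:

#include <stdlib.h>
#include <string.h>

/* Returns a NULL-terminated array of newly allocated tokens, or NULL if
   allocation fails. Adjacent delimiters produce empty tokens. */
char **str_split(const char *str, char delim) {

    size_t begin = 0;   /* index where the current token starts */
    size_t end = 0;     /* index of the character being examined */
    size_t j = 0;       /* number of tokens stored so far */
    size_t len = strlen(str);
    size_t num = 1;     /* one more token than there are delimiters */

    for (size_t i = 0; i < len; i++)
        if (str[i] == delim) num++;

    char **buf = calloc(num + 1, sizeof(char *));
    if (buf == NULL) return NULL;

    for (end = 0; end <= len; end++) {

        /* a token ends at each delimiter and at the end of the string */
        if (end == len || str[end] == delim) {

            buf[j] = malloc(end - begin + 1);
            if (buf[j] == NULL) {
                while (j > 0) free(buf[--j]);
                free(buf);
                return NULL;
            }
            memcpy(buf[j], str + begin, end - begin);
            buf[j][end - begin] = '\0';
            begin = end + 1;
            j++;

        }

    }

    return buf;     /* buf[num] is already NULL thanks to calloc */

}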
余欢
#7 · 2018-12-31 02:48

I think strsep is still the best tool for this:

while ((token = strsep(&str, ","))) my_fn(token);

That is literally one line that splits a string.

The extra parentheses are a stylistic element to indicate that we're intentionally testing the result of an assignment, not an equality operator ==.

For that pattern to work, token and str both have type char *. If you started with a string literal, then you'd want to make a copy of it first:

// More general pattern:
const char *my_str_literal = "JAN,FEB,MAR";
char *token, *str, *tofree;

tofree = str = strdup(my_str_literal);  // We own str's memory now.
while ((token = strsep(&str, ","))) my_fn(token);
free(tofree);

If two delimiters appear together in str, you'll get a token value that's the empty string. The value of str is modified in that each delimiter encountered is overwritten with a zero byte - another good reason to copy the string being parsed first.

In a comment, someone suggested that strtok is better than strsep because strtok is more portable. Ubuntu and Mac OS X have strsep; it's safe to guess that other unixy systems do as well. Windows lacks strsep, but it has strpbrk, which enables this short and sweet strsep replacement:

char *strsep(char **stringp, const char *delim) {
  if (*stringp == NULL) { return NULL; }
  char *token_start = *stringp;
  *stringp = strpbrk(token_start, delim);
  if (*stringp) {
    **stringp = '\0';
    (*stringp)++;
  }
  return token_start;
}
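
To see the empty-token behaviour mentioned earlier, here is a small sketch that counts fields and empty fields. It uses a copy of the replacement above under the hypothetical name sep_next so it won't clash with a system strsep; count_fields is likewise an illustrative name, not something from a library:

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in with strsep's semantics, built on standard strpbrk.
   sep_next is a hypothetical name chosen to avoid clashing with strsep. */
static char *sep_next(char **stringp, const char *delim) {
    char *start = *stringp;
    if (start == NULL) return NULL;
    *stringp = strpbrk(start, delim);
    if (*stringp != NULL) {
        **stringp = '\0';   /* overwrite the delimiter */
        (*stringp)++;       /* resume right after it */
    }
    return start;
}

/* Counts every field in input and, through *empty, how many are empty.
   Works on a copy because each call overwrites a delimiter in the buffer.
   Returns -1 if the copy cannot be allocated. */
int count_fields(const char *input, const char *delim, int *empty) {
    size_t len = strlen(input);
    char *copy = malloc(len + 1);
    if (copy == NULL) return -1;
    memcpy(copy, input, len + 1);

    int fields = 0;
    *empty = 0;
    char *cursor = copy;
    char *token;
    while ((token = sep_next(&cursor, delim)) != NULL) {
        fields++;
        if (*token == '\0') (*empty)++;
    }
    free(copy);
    return fields;
}
```

For "JAN,,MAR" with "," as the delimiter this reports three fields, one of them empty, which is handy when parsing CSV-like data where blank columns matter.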

Here is a good explanation of strsep vs strtok. The pros and cons may be judged subjectively; however, I think it's a telling sign that strsep was designed as a replacement for strtok.
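
One practical difference is easy to check: strtok treats a run of delimiters as a single separator and never returns an empty token, while strsep returns one for each adjacent pair. A minimal sketch that counts what strtok finds; count_strtok_tokens is a hypothetical helper name, not a library function:

```c
#include <stdlib.h>
#include <string.h>

/* Counts the tokens strtok finds in a private copy of input.
   Returns -1 if the copy cannot be allocated. */
int count_strtok_tokens(const char *input, const char *delim) {
    size_t len = strlen(input);
    char *copy = malloc(len + 1);
    if (copy == NULL) return -1;
    memcpy(copy, input, len + 1);

    int tokens = 0;
    for (char *tok = strtok(copy, delim); tok != NULL; tok = strtok(NULL, delim))
        tokens++;

    free(copy);
    return tokens;
}
```

For "JAN,,MAR" this returns 2, whereas the strsep loop yields three tokens with an empty one in the middle. Which behaviour is right depends on whether empty fields carry meaning in your input.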
