How does strtok() split the string into tokens in

Posted 2018-12-31 21:29

Please explain how the strtok() function works. The manual says it breaks the string into tokens, but I cannot work out from the manual what it actually does.

I added watches on str and *pch to see how it works. When the first iteration of the while loop ran, str contained only "this". How was the output shown below printed on the screen?

/* strtok example */
#include <stdio.h>
#include <string.h>

int main ()
{
  char str[] ="- This, a sample string.";
  char * pch;
  printf ("Splitting string \"%s\" into tokens:\n",str);
  pch = strtok (str," ,.-");
  while (pch != NULL)
  {
    printf ("%s\n",pch);
    pch = strtok (NULL, " ,.-");
  }
  return 0;
}

Output:

Splitting string "- This, a sample string." into tokens:
This
a
sample
string

13 answers
看淡一切
#2 · 2018-12-31 22:01

strtok() divides the string into tokens. A token is a run of characters that contains none of the delimiters. Leading delimiters are skipped, so in your case the "-" and the space at the start are ignored and the first token is "This", which ends at the ",". The next call skips the "," and the space and returns "a". The rest of the string is split the same way, and the last token, "string", ends at the ".".

十年一品温如言
#3 · 2018-12-31 22:02

strtok will tokenize a string, i.e. convert it into a series of substrings.

It does that by searching for the delimiters that separate these tokens (or substrings), and you specify the delimiters. In your case, you want ' ', ',', '.' or '-' to be a delimiter.

The programming model for extracting these tokens is that you hand strtok your main string and the set of delimiters, then call it repeatedly. Each call returns the next token it finds, until it reaches the end of the main string, when it returns NULL. Another rule is that you pass the string only the first time and NULL on subsequent calls. This tells strtok whether you are starting a new tokenizing session with a new string or retrieving tokens from the previous session.

Note that strtok remembers its state between calls. For this reason it is neither reentrant nor thread safe (use strtok_r instead where it is available; it is a POSIX function). Another thing to know is that strtok modifies the original string: it overwrites the delimiter that ends each token with '\0'.

One way to invoke strtok succinctly is as follows:

char str[] = "this, is the string - I want to parse";
char delim[] = " ,-";
char* token;

for (token = strtok(str, delim); token; token = strtok(NULL, delim))
{
    printf("token=%s\n", token);
}

Result:

token=this
token=is
token=the
token=string
token=I
token=want
token=to
token=parse
初与友歌
#4 · 2018-12-31 22:04

This is how I implemented strtok. It is not great, but after working on it for two hours I finally got it working. It supports multiple delimiters.

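#include <iostream>
using namespace std;

// Returns true if c is one of the characters in filter.
static bool isDelimiter(char c, const char filter[])
{
    for (int i = 0; filter[i] != '\0'; i++) {
        if (c == filter[i]) {
            return true;
        }
    }
    return false;
}

// Simplified strtok: pass the string on the first call and NULL afterwards.
char* mystrtok(char str[], const char filter[])
{
    static char *ptr = NULL;      // where the next call resumes scanning
    if (str != NULL) {
        ptr = str;                // a non-NULL string starts a new session
    }
    if (ptr == NULL || filter == NULL) {
        return NULL;
    }
    // Skip leading delimiters so that no empty token is returned.
    while (*ptr != '\0' && isDelimiter(*ptr, filter)) {
        ptr++;
    }
    if (*ptr == '\0') {
        ptr = NULL;               // nothing left to tokenize
        return NULL;
    }
    char* ptrReturn = ptr;        // start of the token
    while (*ptr != '\0' && !isDelimiter(*ptr, filter)) {
        ptr++;
    }
    if (*ptr == '\0') {
        ptr = NULL;               // last token: the next call returns NULL
    } else {
        *ptr = '\0';              // terminate the token in place
        ptr++;                    // resume after the delimiter
    }
    return ptrReturn;
}

int main()
{
    char str[200] = "This,is my,string.test";
    char *ppt = mystrtok(str, ", .");
    while (ppt != NULL) {
        cout << ppt << endl;
        ppt = mystrtok(NULL, ", .");
    }
    return 0;
}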
与风俱净
#5 · 2018-12-31 22:10

strtok() stores a pointer to where it left off in a static variable, so on the second call, when we pass NULL, strtok() resumes from that saved pointer.

If you pass a string again instead of NULL, it starts over from the beginning of that string.

Moreover, strtok() is destructive, i.e. it changes the original string, so make sure you keep a copy of the original if you still need it.

One more problem with strtok() is that, because it stores the position in a static variable, calls from different threads (or two interleaved tokenizing sessions in one thread) overwrite each other's saved position. Use strtok_r() in that case.

泛滥B
#6 · 2018-12-31 22:12

The first time you call strtok, you pass it the string to tokenize. To get the following tokens, you pass NULL instead, and you keep calling it for as long as it returns a non-NULL pointer.

The strtok function remembers the string you provided on the first call (which is really dangerous in multithreaded applications).

与风俱净
#7 · 2018-12-31 22:18

strtok overwrites the delimiter that ends each token with a null character ('\0'), and a null character also marks the end of a string. That is why each returned token reads as a separate string.

http://www.cplusplus.com/reference/clibrary/cstring/strtok/
