Question:
Please explain how the strtok() function works. The manual says it breaks the string into tokens, but I am unable to understand from the manual what it actually does.
I added watches on str and *pch to check how it works. When the first iteration of the while loop ran, the contents of str were only "this". How was the output shown below printed on the screen?
/* strtok example */
#include <stdio.h>
#include <string.h>

int main ()
{
  char str[] = "- This, a sample string.";
  char * pch;
  printf ("Splitting string \"%s\" into tokens:\n", str);
  pch = strtok (str, " ,.-");
  while (pch != NULL)
  {
    printf ("%s\n", pch);
    pch = strtok (NULL, " ,.-");
  }
  return 0;
}
Output:
Splitting string "- This, a sample string." into tokens:
This
a
sample
string
Answer 1:
strtok() divides the string into tokens, i.e. each token is a run of characters that lies between delimiters. Leading delimiters are skipped, so in your case the "-" and the space at the start are passed over, the first token starts at "T" and ends at the next delimiter, the ",". That is how you get "This" as output. The rest of the string is split the same way, from one delimiter to the next, with the last token ending at the ".".
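A small sketch of that behaviour: the helper below (count_tokens is a hypothetical name, not part of the C library) counts what strtok() returns, which makes it easy to see that leading and repeated delimiters never produce empty tokens.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: print and count the tokens strtok() finds in buf.
 * buf must be a writable array, because strtok() modifies it in place. */
int count_tokens(char *buf, const char *delims)
{
    int n = 0;

    for (char *tok = strtok(buf, delims); tok != NULL; tok = strtok(NULL, delims)) {
        printf("token %d: \"%s\"\n", n + 1, tok);
        n++;
    }
    return n;
}
```

Called on a writable copy of "- This, a sample string." with the delimiters " ,.-", it reports four tokens; called on "A,B,,,C" with ",", it reports three, because the run of commas counts as a single separator.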
Answer 2:
The strtok runtime function works like this.
The first time you call strtok, you provide the string that you want to tokenize:
char s[] = "this is a string";
In the string above, a space seems to be a good delimiter between words, so let's use that:
char* p = strtok(s, " ");
What happens now is that 's' is searched until a space character is found; that space is overwritten with '\0', and p points to the first token ("this").
To get the next token and continue with the same string, pass NULL as the first argument, since strtok maintains a static pointer to the string you passed previously:
p = strtok(NULL, " ");
p now points to "is",
and so on until no more spaces can be found; the remainder of the string is then returned as the last token, "string", and the call after that returns NULL.
More conveniently, you could write it like this to print out all tokens:
for (char *p = strtok(s, " "); p != NULL; p = strtok(NULL, " "))
{
  puts(p);
}
EDIT:
If you want to keep the values returned by strtok after the original buffer is reused or goes out of scope, you need to copy each token to another buffer, e.g. with strdup(p), since the returned pointers point into the original string (the one tracked by the static pointer inside strtok), which is modified in place in order to return the tokens.
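As a rough sketch of that copying step, the helper below stores an independent heap copy of every token. The name copy_tokens and its parameters are made up for illustration, and it uses malloc plus memcpy rather than strdup only so that it does not depend on POSIX.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: split buf on delims and store a heap-allocated copy
 * of each token in out[], up to max entries. Returns the number of copies
 * stored; the caller must free() each one. buf is modified by strtok(). */
size_t copy_tokens(char *buf, const char *delims, char **out, size_t max)
{
    size_t n = 0;

    for (char *tok = strtok(buf, delims); tok != NULL && n < max;
         tok = strtok(NULL, delims)) {
        size_t len = strlen(tok) + 1;   /* include the terminating '\0' */
        char *copy = malloc(len);
        if (copy == NULL) {
            break;                      /* out of memory: keep what we have */
        }
        memcpy(copy, tok, len);
        out[n++] = copy;
    }
    return n;
}
```

Once the copies exist, the original buffer can be overwritten or freed without affecting them.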
Answer 3:
strtok maintains a static, internal reference pointing to the next available token in the string; if you pass it a NULL pointer, it will work from that internal reference.
This is the reason strtok isn't re-entrant; as soon as you pass it a new pointer, that old internal reference gets clobbered.
Answer 4:
strtok doesn't change the parameter itself (str). It stores that pointer (in a local static variable). It can then change what that parameter points to in subsequent calls without having the parameter passed back. (And it can advance the pointer it has kept however it needs to in order to perform its operations.)
From the POSIX strtok page:
This function uses static storage to keep track of the current string position between calls.
There is a thread-safe variant (strtok_r) that doesn't do this type of magic.
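A minimal sketch of strtok_r, assuming a POSIX system: the caller supplies the save pointer, so two tokenizing sessions can be interleaved. The function name count_words is hypothetical; the outer loop splits on newlines and the inner loop splits each line on spaces, which plain strtok could not do without losing its place.

```c
#define _POSIX_C_SOURCE 200809L  /* expose strtok_r on strict compilers */
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: print every word of a multi-line text and return
 * the word count. Each loop keeps its position in its own save pointer,
 * so the inner session does not disturb the outer one. text is modified. */
int count_words(char *text)
{
    int words = 0;
    char *line_state;   /* position of the outer (line) session */
    char *word_state;   /* position of the inner (word) session */

    for (char *line = strtok_r(text, "\n", &line_state); line != NULL;
         line = strtok_r(NULL, "\n", &line_state)) {
        for (char *word = strtok_r(line, " ", &word_state); word != NULL;
             word = strtok_r(NULL, " ", &word_state)) {
            printf("%s\n", word);
            words++;
        }
    }
    return words;
}
```

The same pattern is what makes strtok_r usable from several threads at once, since no state is shared between callers.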
Answer 5:
The first time you call it, you provide the string to tokenize to strtok. Then, to get the following tokens, you just pass NULL to that function, for as long as it returns a non-NULL pointer.
The strtok function records the string you provided in the first call (which is really dangerous for multi-threaded applications).
Answer 6:
strtok will tokenize a string, i.e. convert it into a series of substrings.
It does that by searching for the delimiters that separate these tokens (or substrings), and you specify the delimiters. In your case, you want ' ' or ',' or '.' or '-' to be the delimiters.
The programming model for extracting these tokens is that you hand strtok your main string and the set of delimiters. Then you call it repeatedly, and each time strtok returns the next token it finds, until it reaches the end of the main string, at which point it returns NULL. Another rule is that you pass the string in only the first time, and NULL on subsequent calls. This is how you tell strtok whether you are starting a new tokenizing session with a new string or retrieving tokens from the previous session.
Note that strtok remembers its state for the tokenizing session. For this reason it is not reentrant or thread safe (you should be using strtok_r instead). Another thing to know is that it actually modifies the original string: it writes '\0' over the delimiters that it finds.
One way to invoke strtok, succinctly, is as follows:
char str[] = "this, is the string - I want to parse";
char delim[] = " ,-";
char* token;

for (token = strtok(str, delim); token; token = strtok(NULL, delim))
{
    printf("token=%s\n", token);
}
Result:
token=this
token=is
token=the
token=string
token=I
token=want
token=to
token=parse
Answer 7:
strtok modifies its input string. It places null characters ('\0') in it so that it can return pieces of the original string as tokens. In fact, strtok does not allocate any memory. You may understand it better if you draw the string as a sequence of boxes.
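The "sequence of boxes" picture can be approximated in code. The sketch below (tokenize_and_dump is a hypothetical helper) runs strtok() to completion and then renders every byte of the buffer, writing each null byte as the two visible characters \0.

```c
#include <string.h>

/* Hypothetical helper: tokenize buf completely, then write a printable
 * picture of all len bytes of buf into out, showing each '\0' as the two
 * characters \0. out must have room for at least 2 * len + 1 bytes. */
void tokenize_and_dump(char *buf, size_t len, const char *delims, char *out)
{
    size_t k = 0;

    for (char *tok = strtok(buf, delims); tok != NULL; tok = strtok(NULL, delims)) {
        /* the tokens themselves are ignored; only the effect on buf matters */
    }
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\0') {
            out[k++] = '\\';
            out[k++] = '0';
        } else {
            out[k++] = buf[i];
        }
    }
    out[k] = '\0';
}
```

For a buffer declared as char s[] = "a,b c" with len set to sizeof s and the delimiters ", ", the picture is a\0b\0c\0: each delimiter that ended a token has been replaced by a null byte, and no memory was allocated. Only the first delimiter after a token is overwritten, so "a,,b" becomes a\0,b\0.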
Answer 8:
To understand how strtok() works, one first needs to know what a static variable is. This link explains it quite well.
The key to the operation of strtok() is preserving the location of the last separator between successive calls (that's why strtok() continues to parse the original string that was passed to it when it is invoked with a null pointer in successive calls).
Have a look at my own strtok() implementation, called zStrtok(), which has slightly different functionality from the one provided by strtok():
char *zStrtok(char *str, const char *delim) {
    static char *static_str = 0;      /* var to store last address */
    int index = 0, strlength = 0;     /* integers for indexes */
    int found = 0;                    /* check if delim is found */

    /* delimiter cannot be NULL
     * if no more chars are left, return NULL as well
     */
    if (delim == 0 || (str == 0 && static_str == 0))
        return 0;

    if (str == 0)
        str = static_str;

    /* get length of string */
    while (str[strlength])
        strlength++;

    /* find the first occurrence of delim
     * (only the first character of delim is used)
     */
    for (index = 0; index < strlength; index++)
        if (str[index] == delim[0]) {
            found = 1;
            break;
        }

    /* if delim is not contained in str, return str */
    if (!found) {
        static_str = 0;
        return str;
    }

    /* check for consecutive delimiters
     * if first char is delim, return delim
     */
    if (str[0] == delim[0]) {
        static_str = (str + 1);
        return (char *)delim;
    }

    /* terminate the string
     * this assignment requires char[], so str has to
     * be char[] rather than char*
     */
    str[index] = '\0';

    /* save the rest of the string */
    if ((str + index + 1) != 0)
        static_str = (str + index + 1);
    else
        static_str = 0;

    return str;
}
Example Usage
char str[] = "A,B,,,C";
printf("1 %s\n", zStrtok(str, ","));
printf("2 %s\n", zStrtok(NULL, ","));
printf("3 %s\n", zStrtok(NULL, ","));
printf("4 %s\n", zStrtok(NULL, ","));
printf("5 %s\n", zStrtok(NULL, ","));
printf("6 %s\n", zStrtok(NULL, ","));
Example Output
1 A
2 B
3 ,
4 ,
5 C
6 (null)
The code is from a string processing library I maintain on Github, called zString. Have a look at the code, or even contribute :)
https://github.com/fnoyanisi/zString
Answer 9:
strtok replaces each delimiter that ends a token (any of the characters in the second argument) with a null character ('\0'), and a null character also marks the end of a string.
http://www.cplusplus.com/reference/clibrary/cstring/strtok/
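Answer 10:
Here is my implementation, which uses a lookup table for the delimiters, so testing a character takes constant time and the scan is O(n) in the length of the string instead of O(n*m) for m delimiters (here is a link to the code):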
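#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define DICT_LEN 256

/* Build a lookup table: d[c] is 1 if byte c is a delimiter. */
int *create_delim_dict(const char *delim)
{
    int *d = calloc(DICT_LEN, sizeof(int));
    if (!d) {
        return NULL;
    }
    for (size_t i = 0; delim[i] != '\0'; i++) {
        d[(unsigned char)delim[i]] = 1;   /* cast avoids negative indexes */
    }
    return d;
}

/* Tokens point into a private heap copy of str. That copy is freed when
 * the tokens run out, so use each token before asking for the next one. */
char *my_strtok(const char *str, const char *delim)
{
    static char *last, *to_free;
    char *token;
    int *deli_dict = create_delim_dict(delim);

    if (!deli_dict) {
        /* allocation failed: release the copy left over from earlier calls */
        free(to_free);
        last = to_free = NULL;
        return NULL;
    }
    if (str) {
        /* new session: drop any old copy and duplicate the new string */
        free(to_free);
        last = malloc(strlen(str) + 1);
        if (!last) {
            to_free = NULL;
            free(deli_dict);
            return NULL;
        }
        to_free = last;
        strcpy(last, str);
    }
    if (!last) {
        /* nothing left to tokenize */
        free(deli_dict);
        return NULL;
    }
    /* skip leading delimiters */
    while (*last != '\0' && deli_dict[(unsigned char)*last]) {
        last++;
    }
    token = last;
    if (*last == '\0') {
        free(deli_dict);
        free(to_free);
        last = to_free = NULL;
        return NULL;
    }
    /* advance to the end of the token */
    while (*last != '\0' && !deli_dict[(unsigned char)*last]) {
        last++;
    }
    if (*last != '\0') {
        /* terminate the token; never step past the end of the copy */
        *last = '\0';
        last++;
    }
    free(deli_dict);
    return token;
}

int main(void)
{
    const char *str = "- This, a sample string.";
    const char *del = " ,.-";
    char *s = my_strtok(str, del);
    while (s) {
        printf("%s\n", s);
        s = my_strtok(NULL, del);
    }
    return 0;
}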
Answer 11:
strtok() stores a pointer to where it last left off in a static variable, so on the second call, when we pass NULL, strtok() gets the pointer from that static variable.
If you pass the same string again, it starts over from the beginning.
Moreover, strtok() is destructive, i.e. it makes changes to the original string, so make sure you always keep a copy of the original.
One more problem with strtok() is that, because it stores the address in a static variable, calling it from more than one thread in a multithreaded program will cause errors. For that, use strtok_r().
Answer 12:
This is how I implemented strtok. It is not that great, but after working on it for 2 hours I finally got it working. It does support multiple delimiters.
#include <iostream>
using namespace std;

// Unlike the library strtok, this version returns an empty token
// between two adjacent delimiters.
char* mystrtok(char str[], const char filter[])
{
    static char *ptr = NULL;   // where the next token starts
    static int flag = 0;       // set once the end of the string is reached

    if (filter == NULL) {
        return str;
    }
    if (str != NULL) {         // a non-null str starts a new session
        ptr = str;
        flag = 0;
    }
    if (ptr == NULL || flag == 1) {
        return NULL;
    }
    char* ptrReturn = ptr;
    for (int j = 0; ; j++) {
        if (ptr[j] == '\0') {  // last token: nothing left after it
            flag = 1;
            return ptrReturn;
        }
        for (int i = 0; filter[i] != '\0'; i++) {
            if (ptr[j] == filter[i]) {
                ptr[j] = '\0'; // terminate the token at the delimiter
                ptr += j + 1;  // continue after it on the next call
                return ptrReturn;
            }
        }
    }
}

int main()
{
    char str[200] = "This,is my,string.test";
    char *ppt = mystrtok(str, ", .");
    while (ppt != NULL) {
        cout << ppt << endl;
        ppt = mystrtok(NULL, ", .");
    }
    return 0;
}
Answer 13:
For those who are still having a hard time understanding the strtok() function, take a look at this pythontutor example; it is a great tool for visualizing your C (or C++, Python ...) code.
In case the link is broken, paste in:
#include <stdio.h>
#include <string.h>

int main()
{
  char s[] = "Hello, my name is? Matthew! Hey.";
  for (char *p = strtok(s, " ,?!."); p != NULL; p = strtok(NULL, " ,?!.")) {
    puts(p);
  }
  return 0;
}
Credits go to Anders K.