Why is strtok() Considered Unsafe?

Posted 2019-01-03 18:05

What feature(s) of strtok are unsafe (in terms of buffer overflow) that I need to watch out for?

What's a little weird to me is that strtok_s (which is "safe") in Visual C++ has an extra "context" parameter, but it looks like it's the same in other ways... is it the same, or is it actually different?

4 Answers
We Are One
#2 · 2019-01-03 18:11

If the string is not properly null-terminated, strtok will read past the end of the buffer, and you end up with a buffer overflow. Also note (this is something I learned the hard way) that strtok does NOT care about quoted substrings. For example, with / as the delimiter, "hello"/"world" is split into "hello" and "world" as expected, whereas "hello/world" is split into "hello and world". Notice that it splits on the / and ignores the fact that it sits inside quotation marks.
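
A minimal sketch of the quoting behavior described above, assuming / is the only delimiter. The helper name split_slash and its parameters are hypothetical and only serve the illustration; the buffer passed in must be writable and null-terminated.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: splits buf in place on '/' and stores up to max
   token pointers in out. Returns the number of tokens stored.
   buf must be a writable, null-terminated array. */
size_t split_slash(char *buf, char **out, size_t max)
{
    assert(buf != NULL && out != NULL);

    size_t n = 0;
    for (char *tok = strtok(buf, "/"); tok != NULL && n < max;
         tok = strtok(NULL, "/")) {
        out[n++] = tok;  /* each token points into buf itself */
    }
    return n;
}
```

Called on a writable array that holds "hello/world" with the quote characters included, this stores two tokens, "hello and world". The quote characters are treated as ordinary text, so quoted fields need a hand-written parser rather than strtok.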

干净又极端
#3 · 2019-01-03 18:13

strtok is thread-safe in Visual C++, because it uses thread-local storage to save its state between calls. That is not guaranteed elsewhere: many other implementations keep the strtok() state in a global (static) variable.

However, even in VC++, where strtok is thread-safe, it is still a bit weird: you cannot run strtok() on different strings in the same thread at the same time. For example, this would not work well:

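     /* Broken on purpose: both loops share strtok's single hidden state. */
     token = strtok( string, seps );
     while(token)
     {
        printf("token=%s\n", token);
        /* This call overwrites the state saved for the outer string. */
        token2 = strtok(string2, seps);
        while(token2)
        {
            printf("token2=%s", token2);
            token2 = strtok( NULL, seps );
        }
        /* Continues from string2's exhausted state, not from string. */
        token = strtok( NULL, seps );
     }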

The reason it does not work: each thread can save only a single state in thread-local storage, and this code needs two states, one for the first string and one for the second. The inner strtok(string2, seps) call overwrites the state of the outer loop, so the outer loop typically stops after its first token. So while strtok is thread-safe with VC++, it is not reentrant.

That is what strtok_s (or strtok_r everywhere else) provides: an explicit state argument, and with that the tokenizer becomes reentrant.
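
As a rough sketch of the nested case with explicit state, assuming a POSIX system where strtok_r is available (Visual C++'s strtok_s takes the same three arguments). The helper name count_nested_tokens and its parameters are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes strtok_r on POSIX systems */
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: splits text on outer_seps, then splits each outer
   token on inner_seps, printing and counting the inner tokens.
   text is modified in place and must be writable. */
size_t count_nested_tokens(char *text, const char *outer_seps,
                           const char *inner_seps)
{
    assert(text != NULL && outer_seps != NULL && inner_seps != NULL);

    size_t count = 0;
    char *outer_state = NULL;  /* state for the outer loop only */
    char *inner_state = NULL;  /* separate state for the inner loop */

    for (char *outer = strtok_r(text, outer_seps, &outer_state);
         outer != NULL;
         outer = strtok_r(NULL, outer_seps, &outer_state)) {
        for (char *inner = strtok_r(outer, inner_seps, &inner_state);
             inner != NULL;
             inner = strtok_r(NULL, inner_seps, &inner_state)) {
            printf("token=%s\n", inner);
            count++;
        }
    }
    return count;
}
```

With a writable buffer holding a,b;c,d, outer separators ";" and inner separators ",", both loops run to completion because each one keeps its own state pointer.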

爱情/是我丢掉的垃圾
#4 · 2019-01-03 18:21

There is nothing unsafe about it. You just need to understand how it works and how to use it. After you write your code and unit tests, it only takes a couple of extra minutes to re-run the unit tests under valgrind to make sure you are operating within memory bounds. The man page says it all:

BUGS

Be cautious when using these functions. If you do use them, note that:

  • These functions modify their first argument.
  • These functions cannot be used on constant strings.
  • The identity of the delimiting character is lost.
  • The strtok() function uses a static buffer while parsing, so it's not thread safe. Use strtok_r() if this matters to you.
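
One common way to live with the first two points is to tokenize a private copy. The following is only a sketch under that assumption; the helper name count_tokens_const is hypothetical, and it still uses plain strtok, so it is not thread-safe.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: counts the tokens in a read-only string by
   tokenizing a heap copy, so the caller's string is never modified.
   Returns (size_t)-1 if the copy cannot be allocated. */
size_t count_tokens_const(const char *input, const char *seps)
{
    assert(input != NULL && seps != NULL);

    size_t len = strlen(input);
    char *copy = malloc(len + 1);
    if (copy == NULL) {
        return (size_t)-1;
    }
    memcpy(copy, input, len + 1);  /* includes the terminating null */

    size_t count = 0;
    for (char *tok = strtok(copy, seps); tok != NULL;
         tok = strtok(NULL, seps)) {
        count++;
    }

    free(copy);
    return count;
}
```

Passing a string literal such as "a b  c" with " " as the separator is fine here, because strtok only ever writes into the copy.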
三岁会撩人
#5 · 2019-01-03 18:28

According to the strtok_s section of this document:

6.7.3.1 The strtok_s function

The strtok_s function fixes two problems in the strtok function:

  1. A new parameter, s1max, prevents strtok_s from storing outside of the string being tokenized. (The string being divided into tokens is both an input and output of the function since strtok_s stores null characters into the string.)
  2. A new parameter, ptr, eliminates the static internal state that prevents strtok from being re-entrant (Subclause 1.1.12). (The ISO/IEC 9899 function wcstok and the ISO/IEC 9945 (POSIX) function strtok_r fix this problem identically.)