Are strtol, strtod unsafe?

Posted 2019-01-27 13:37

It seems that strtol() and strtod() effectively allow (and force) you to cast away constness in a string:

#include <stdlib.h>
#include <stdio.h>

int main() {
  const char *foo = "Hello, world!";
  char *bar;
  strtol(foo, &bar, 10); // or strtod(foo, &bar);
  printf("%d\n", foo == bar); // prints "1"! they're equal
  *bar = 'X'; // undefined behavior: writes to a string literal (typically a segmentation fault)
  return 0;
}

Above, I did not perform any casts myself. However, strtol() effectively cast my const char * into a char * for me, without any warning. (In fact, it won't let you declare bar as a const char *, so it forces the unsafe change in type.) Isn't that really dangerous?

4 answers
我命由我不由天
#2 · 2019-01-27 13:59

I would guess it is because the alternative was worse. Suppose the prototype were changed to add const:

long int strtol(const char *nptr, const char **endptr, int base);

Now, suppose we want to parse a non-constant string:

char str[] = "12345xyz";  // non-const
char *endptr;
long result = strtol(str, &endptr, 10);
*endptr = '_';
printf("%s\n", str);  // expected output: 12345_yz

But what happens when we try to compile this code? A compiler error (or, from a C compiler, at least a diagnostic about incompatible pointer types)! It's rather non-intuitive, but you can't implicitly convert a char ** to a const char **. See the C++ FAQ Lite for a detailed explanation of why. It technically discusses C++, but the arguments are equally valid for C. In C and C++, you're only allowed to implicitly convert from "pointer to type" to "pointer to const type" at the top level of indirection: the conversion you can perform is from char ** to char * const *, or equivalently from "pointer to (pointer to char)" to "pointer to (const pointer to char)".
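To make that rule concrete, here is a minimal sketch in C. The names count_entries and demo are made up for illustration only. It shows the conversion that is accepted, with the rejected one left in a comment:

```c
#include <stddef.h>

/* Hypothetical helper: counts the entries of a NULL-terminated list.
   The parameter type promises not to reseat the pointers in the list. */
static size_t count_entries(char *const *list) {
    size_t n = 0;
    while (list[n] != NULL) {
        n++;
    }
    return n;
}

/* Hypothetical driver showing which implicit conversion is accepted. */
static size_t demo(void) {
    char a[] = "12345";
    char b[] = "xyz";
    char *items[] = { a, b, NULL };
    char **pp = items;

    /* const char **bad = pp;
       Rejected: through 'bad' one could store a const char * into a
       char * slot and later write through it without any cast. */

    return count_entries(pp); /* accepted: char ** -> char *const * */
}
```

Calling demo() should return 2, the number of non-NULL entries in items.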

Since I would guess that parsing a non-constant string is far more likely than parsing a constant string, I would go on to postulate that const-incorrectness for the unlikely case is preferable to making the common case a compiler error.

We Are One
#3 · 2019-01-27 14:07

The 'const char *' for the first argument means that strtol() won't modify the string.

What you do with the returned pointer is your business.
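If you would rather keep the end pointer const in your own code, one option is a thin wrapper. This is only a sketch: parse_long_const is a hypothetical name, not a standard function, and it leaves error handling (errno, "no digits" detection) to the caller, just as strtol() does.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical const-correct wrapper around strtol().
   The end pointer handed back to the caller is const char *, so the
   caller cannot write through it without an explicit cast. */
static long parse_long_const(const char *s, const char **end, int base) {
    char *tmp = NULL;
    long value = strtol(s, &tmp, base);
    if (end != NULL) {
        *end = tmp; /* char * -> const char * is a safe implicit conversion */
    }
    return value;
}
```

With this wrapper, something like `const char *end; long v = parse_long_const("12345xyz", &end, 10);` leaves end pointing at the 'x', and a later `*end = 'X';` is rejected by the compiler instead of crashing at run time.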

Yes, it could be regarded as a type safety violation; C++ would probably do things differently (though, as far as I can tell, ISO/IEC 14882:1998 defines <cstdlib> with the same signature as in C).

我只想做你的唯一
#4 · 2019-01-27 14:07

I have a compiler that provides, when compiling in C++ mode:

extern "C" {
long int strtol(const char *nptr, const char **endptr, int base);
long int strtol(char *nptr, char **endptr, int base);
}

Obviously these both resolve to the same link-time symbol.

EDIT: according to the C++ standard, this header should not compile. I'm guessing the compiler simply didn't check for this. The declarations did in fact appear like this in the system header files.

闹够了就滚
#5 · 2019-01-27 14:21

Yes, and other functions have the same "const-laundering" issue (for instance strchr, strstr, all that lot).

For precisely this reason C++ adds overloads (21.4:4): the function signature strchr(const char*, int) is replaced by the two declarations:

const char* strchr(const char* s, int c);
      char* strchr(      char* s, int c);

But of course in C you can't have both const-correct versions with the same name, so you get the const-incorrect compromise.
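For what it's worth, C11 later added _Generic, which can approximate such overloads behind a macro. A rough sketch, assuming a C11 compiler: find_char, find_char_const and find_char_mut are made-up names, not standard functions.

```c
#include <string.h>

/* Hypothetical const-preserving variants of strchr(). */
static const char *find_char_const(const char *s, int c) {
    return strchr(s, c);
}

static char *find_char_mut(char *s, int c) {
    return strchr(s, c);
}

/* Select a variant from the constness of the argument.
   &(s)[0] turns arrays into pointers before the type is inspected. */
#define find_char(s, c)                 \
    _Generic(&(s)[0],                   \
        const char *: find_char_const,  \
        char *: find_char_mut)((s), (c))
```

With this, find_char() on a const char * yields a const char *, while the same call on a char array yields a writable char *. Note that a string literal has type char[] in C, so it selects the writable variant.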

C++ doesn't mention similar overloads for strtol and strtod, and indeed my compiler (GCC) doesn't have them. I don't know why not: the fact that you can't implicitly convert char** to const char** (together with the absence of overloading) explains it for C, but I don't quite see what would be wrong with a C++ overload:

long strtol(const char*, const char**, int);