Why is the endptr parameter to strtof and strtod a

Posted 2019-02-11 12:44

Question:

The standard C library functions strtof and strtod have the following signatures:

float strtof(const char *str, char **endptr);
double strtod(const char *str, char **endptr); 

They each decompose the input string, str, into three parts:

  1. An initial, possibly-empty, sequence of whitespace
  2. A "subject sequence" of characters that represent a floating-point value
  3. A "trailing sequence" of characters that are unrecognized (and which do not affect the conversion).

If endptr is not NULL, then *endptr is set to a pointer to the character immediately following the last character that was part of the conversion (in other words, the start of the trailing sequence).
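To make the decomposition concrete, here is a minimal sketch of a caller that holds a const input string. The helper name trailing_after_double is hypothetical (it is not part of the standard library); it only illustrates that the end pointer has to be declared as a plain char * even though it points into const data.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper, for illustration only: parses a leading double from
   `str`, stores it in `*value`, and returns a pointer to the start of the
   trailing sequence (the first character strtod did not consume). */
static const char *trailing_after_double(const char *str, double *value)
{
    char *end;  /* strtod wants a char **, so this cannot be const char * */

    assert(str != NULL && value != NULL);
    *value = strtod(str, &end);

    /* char * converts implicitly to const char * on return. */
    return end;
}
```

If no conversion can be performed, strtod returns zero and stores the original str in *endptr, so the helper hands back its input pointer unchanged.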

I am wondering: why is endptr, then, a pointer to a non-const char pointer? Isn't *endptr a pointer into a const char string (the input string str)?

Answer 1:

The reason is simply usability. char * can automatically convert to const char *, but char ** cannot automatically convert to const char **, and the actual type of the pointer (whose address gets passed) used by the calling function is much more likely to be char * than const char *. The reason this automatic conversion is not possible is that there is a non-obvious way it can be used to remove the const qualification through several steps, where each step looks perfectly valid and correct in and of itself. Steve Jessop has provided an example in the comments:

if you could automatically convert char** to const char**, then you could do

char *p;
char **pp = &p;
const char** cp = pp;
*cp = (const char*) "hello";
*p = 'j';

For const-safety, one of those lines must be illegal, and since the others are all perfectly normal operations, it has to be cp = pp;

An earlier revision of this answer suggested that a much better approach would have been to define these functions to take void * in place of char **, since both char ** and const char ** can automatically convert to void *. That was actually a very bad idea; not only does it prevent any type checking, but C actually forbids objects of type char * and const char * to alias. A workable alternative: these functions could have taken a ptrdiff_t * or size_t * argument in which to store the offset of the end, rather than a pointer to it. This is often more useful anyway.

If you like the latter approach, feel free to write such a wrapper around the standard library functions and call your wrapper, so as to keep the rest of your code const-clean and cast-free.
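One possible shape for such a wrapper is sketched below. The name strtod_offset and its exact behavior are assumptions made for illustration, not an existing library function; the point is that the char * needed by strtod stays confined to the wrapper, so callers only ever see a size_t offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical wrapper (not a standard function): behaves like strtod, but
   reports the number of characters consumed instead of an end pointer.
   `consumed` may be NULL if the caller does not need the offset. */
static double strtod_offset(const char *str, size_t *consumed)
{
    char *end;     /* the only non-const pointer, local to this wrapper */
    double value;

    assert(str != NULL);
    value = strtod(str, &end);

    if (consumed != NULL)
        *consumed = (size_t)(end - str);  /* end >= str, so this is non-negative */

    return value;
}
```

The offset counts from the start of str, so it includes any leading whitespace that was skipped. An offset of 0 means no conversion was performed.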



Answer 2:

Usability. The str argument is marked as const because the input argument will not be modified. If endptr were declared as const char **, that would tell the caller not to change the data referenced through *endptr on output, but often the caller wants to do just that. For example, I may want to null-terminate a string after getting the float out of it:

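#include <stdlib.h>

/* Parses a float from the start of Text, then truncates the buffer
   right after the parsed number. Text must point to writable storage. */
float StrToFAndTerminate(char *Text) {
    float Num;

    Num = strtof(Text, &Text);  /* Text now points just past the number */
    *Text = '\0';
    return Num;
}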

Perfectly reasonable thing to want to do, in some circumstances. Doesn't work if endptr is of type const char **.

Ideally, endptr should be of const-ness matching the actual input const-ness of str, but C provides no way of indicating this through its syntax. (Anders Hejlsberg talks about this when describing why const was left out of C#.)