Does Standard define null pointer constant to have all bits set to zero?

Posted 2019-01-20 17:49

Question:

( I'm quoting ISO/IEC 9899:201x )

Here we see that an integer constant expression has an integer type:

6.6 Constant expressions

6. An integer constant expression shall have integer type and shall only have operands that are integer constants, enumeration constants, character constants, sizeof expressions whose results are integer constants, _Alignof expressions, and floating constants that are the immediate operands of casts. Cast operators in an integer constant expression shall only convert arithmetic types to integer types, except as part of an operand to the sizeof or _Alignof operator.

Then this holds true for any integer type:

6.2.6.2 Integer types

5. The values of any padding bits are unspecified. A valid (non-trap) object representation of a signed integer type where the sign bit is zero is a valid object representation of the corresponding unsigned type, and shall represent the same value. For any integer type, the object representation where all the bits are zero shall be a representation of the value zero in that type.

Then we see that a null pointer constant is defined using an integer constant expression with the value 0.

6.3.2.3 Pointers

3. An integer constant expression with the value 0, or such an expression cast to type void*, is called a null pointer constant. If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function.

Therefore the null pointer constant must have all its bits set to zero.

But there are many answers online and on StackOverflow that say that that isn't true.

I have a hard time believing them given the quoted parts.

( Please answer using references to the latest Standard )

Answer 1:

No, NULL doesn't have to be all bits zero.

N1570 6.3.2.3 Pointers paragraph 3:

An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant. 66) If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function.

Note that the quoted text speaks of conversion: the integer 0 is converted if necessary, and the result doesn't have to have the same bit representation.

Note 66 on bottom of the page says:

66) The macro NULL is defined in <stddef.h> (and other headers) as a null pointer constant; see 7.19.

Which leads us to a paragraph of that chapter:

The macros are

NULL

which expands to an implementation-defined null pointer constant

What is more, Annex J.3.12 (Portability issues, Implementation-defined behaviour, Library functions) lists:

— The null pointer constant to which the macro NULL expands (7.19).
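
To make the conversion point concrete, here is a minimal sketch (the function name null_constants_agree is made up for illustration and does not come from the Standard). It initializes pointers from three different null pointer constants and checks only what 6.3.2.3 promises: the resulting null pointers compare equal to each other and unequal to a pointer to an object. Nothing in it looks at bits.

```c
#include <stddef.h>   /* NULL */

/* Hypothetical helper: returns 1 if null pointers produced from different
   null pointer constants behave the way 6.3.2.3 describes. */
int null_constants_agree(void)
{
    int object = 42;
    int *from_zero = 0;           /* integer constant expression with value 0 */
    int *from_cast = (void *)0;   /* such an expression cast to void * */
    int *from_macro = NULL;       /* implementation-defined null pointer constant */

    return from_zero == from_cast
        && from_cast == from_macro
        && from_macro != &object  /* unequal to a pointer to any object */
        && !from_zero;            /* a null pointer compares equal to 0 */
}
```

A conforming implementation should return 1 here whatever bit pattern it uses to store a null pointer, because every comparison is made on pointer values after conversion.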



Answer 2:

Does Standard define null pointer constant to have all bits set to zero?

No, it doesn't. No paragraph of the C Standard imposes such a requirement.

void *p = 0;

Here p, for example, is a null pointer, but the Standard does not require that the object p have all bits zero.

For information the c-faq website mentions some systems with non-zero null pointer representations here: http://c-faq.com/null/machexamp.html



Answer 3:

Asking about the representation of a null pointer constant is quite pointless.

A null pointer constant either has an integer type or the type void*. Whatever it is, it is a value. It is not an object. Values don't have a representation; only objects do. We can only talk about representations by taking the address of an object, casting it to char* or unsigned char*, and looking at the bytes. We can't do that with a null pointer constant. As soon as it is assigned to an object, it's not a null pointer constant anymore.
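
The "look at the bytes" step could be sketched as follows. The function name null_representation is hypothetical; it stores a null pointer in an object of type void * and copies that object's bytes out through an unsigned char pointer. What the bytes contain is up to the implementation, so the sketch deliberately makes no claim about their values.

```c
#include <stddef.h>   /* size_t */

/* Hypothetical helper: copies the object representation of a null pointer
   of type void * into out. Returns the number of bytes copied, or 0 if
   the caller's buffer is too small. */
size_t null_representation(unsigned char *out, size_t outlen)
{
    void *p = 0;   /* p is an object, so it has a representation */
    const unsigned char *bytes = (const unsigned char *)&p;
    size_t i;

    if (outlen < sizeof p)
        return 0;
    for (i = 0; i < sizeof p; i++)
        out[i] = bytes[i];   /* reading any object as unsigned char is allowed */
    return sizeof p;
}
```

A caller could print each out[i] to see the pattern on a given platform. Note that the thing being inspected is the object p, not the constant 0 that initialized it.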



Answer 4:

A major limitation of the C standard is that because the authors want to avoid prohibiting compilers from behaving in any ways that any production code anywhere might be relying upon, it fails to specify many things which programmers need to know. As a consequence, it is often necessary to make assumptions about things which are not specified by the standard, but match the behaviors of common compilers. The fact that all of the bytes comprising a null pointer are zero is one such assumption.

Nothing in the C standard specifies anything about the bit-level representation of any pointer beyond the fact that every possible value of each and every data type--including pointers--will be representable as a sequence of char values(*). Nonetheless, on nearly all common platforms zeroing out all the bytes associated with a structure is equivalent to setting all the members to the static default values for their types (the default value for a pointer being null). Further, code which uses calloc to receive a zeroed-out block of RAM for a collection of structures will often be much faster than code which uses malloc and then has to manually clear every member of every structure, or which uses calloc but still manually clears every non-integer member of every structure.

I would suggest therefore that in many cases it is perfectly reasonable to write code targeted for those dialects of C where null pointers are stored as all-bytes-zero, and have as a documented requirement that it will not work on dialects where that is not the case. Perhaps someday the ISO will provide a standard means by which such requirements could be documented in machine-readable form (such that every compiler would be required to either abide by a program's stated requirements or refuse compilation), but so far as I know none yet exists.
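
Until such a mechanism exists, one way to document the requirement is to check it at run time. The sketch below is only an illustration: struct node, null_is_all_bytes_zero and alloc_nodes are hypothetical names, and the check is conservative, since it compares stored bytes rather than pointer values.

```c
#include <assert.h>
#include <stdlib.h>   /* calloc */
#include <string.h>   /* memset, memcmp */

/* Hypothetical record type used only for this illustration. */
struct node {
    struct node *next;
    int value;
};

/* Returns 1 if a null pointer of type struct node * is stored as
   all-bytes-zero on this implementation, 0 otherwise. */
int null_is_all_bytes_zero(void)
{
    struct node *null_ptr = NULL;
    struct node *zeroed;

    memset(&zeroed, 0, sizeof zeroed);   /* zeroed is only read as bytes */
    return memcmp(&null_ptr, &zeroed, sizeof zeroed) == 0;
}

/* Allocates n zero-filled nodes. Relies on the checked assumption so that
   every next member can be treated as null without a clearing loop. */
struct node *alloc_nodes(size_t n)
{
    assert(null_is_all_bytes_zero());   /* the documented requirement */
    return calloc(n, sizeof(struct node));
}
```

On a platform where the assertion fails, the caller would have to fall back to assigning NULL to each next member explicitly, which is the slower path described above.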

(*) From what I understand, there's some question as to whether compilers are required to honor that assumption anymore. Consider, for example:

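#include <stdlib.h>   /* malloc, free */
#include <string.h>   /* memcmp */

int funcomp(int **pp, int **qq)
{
    int *p, *q;
    p = (int*)malloc(1234);
    *pp = p;
    free(p);                        /* the value of p is now indeterminate */
    q = (int*)malloc(1234);         /* may reuse the block just freed */
    *qq = q;
    *q = 1234;
    if (!memcmp(pp, qq, sizeof p))  /* compare the bytes of the stored pointers */
        return *p;                  /* same bits as q, yet p was freed */
    return 0;
}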

Following free(p), any attempt to access *p will be Undefined Behavior. Although there's a significant likelihood that q will receive the exact same bit pattern as p, nothing in the standard would require that p must be considered a valid alias for q even in that scenario. On the other hand, it also seems strange to say that two variables of the same type can hold the exact same bits without their contents being equivalent. Thus, while it's clearly natural that the function would be allowed to either return 0 along with values of *pp and *qq that don't compare bit-wise equal, or 1234 along with values of *pp and *qq that do compare bit-wise equal, the Standard would seem to allow the function to behave arbitrarily if both malloc calls happen to yield bitwise-equivalent values.