Read before write is undefined with malloced memory

Posted 2020-04-02 09:24

Question:

According to this reddit comment thread, reading memory before it has been written to is undefined. I'm referring to ordinary heap memory that has been successfully malloced.

... note that this is not strictly valid C: the compiler/runtime system is allowed to initialize uninitialized memory with so-called trap representations, which cause undefined behavior on access.

I find this hard to believe. Is there a Standard quote?

Of course, I understand that there is no guarantee that the memory has been zeroed out. The values in this uninitialized memory are essentially pseudo-random or arbitrary. But I can't really believe that the Standard would refer to this as undefined behaviour (in the sense that it might segfault, or delete all your files, or whatever). The rest of the reddit thread there didn't cast any more light on this issue.

Answer 1:

If the memory is read through a char*, the behavior is defined. Read through any other type, it is undefined behavior.

(C99, 7.20.3.3) "The malloc function allocates space for an object whose size is specified by size and whose value is indeterminate."

On indeterminate values:

(C99, 3.17.2p1) "indeterminate value: either an unspecified value or a trap representation"

On why reading a trap representation through a non-character type is undefined behavior:

(C99, 6.2.6.1p5) "Certain object representations need not represent a value of the object type. If the stored value of an object has such a representation and is read by an lvalue expression that does not have character type, the behavior is undefined. [...] Such a representation is called a trap representation."
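
As a rough illustration of the distinction in these quotes, here is a minimal sketch. The helper name sum_after_init is hypothetical (not from the thread), and unsigned char is used rather than plain char because it is the character type that is certain to have no trap representations. The function inspects freshly malloced memory byte by byte, then stores a value in every int before reading any of them as int.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper, for illustration only.
   Precondition: n > 0. Returns -1 if the allocation fails. */
int sum_after_init(size_t n)
{
    assert(n > 0);

    int *p = malloc(n * sizeof *p);
    if (p == NULL)
        return -1;

    /* Reading through a character type: defined, but the byte value
       is unspecified, so nothing should depend on it. */
    const unsigned char *bytes = (const unsigned char *)p;
    unsigned char first = bytes[0];
    (void)first;

    /* Reading p[0] as an int at this point would be a read of an
       indeterminate value through a non-character lvalue, which is
       the case the quotes above are about. */

    /* Write first ... */
    for (size_t i = 0; i < n; i++)
        p[i] = (int)i;

    /* ... then read: every element now holds a proper int value. */
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += p[i];

    free(p);
    return sum;
}
```

For example, sum_after_init(4) stores 0, 1, 2, 3 and returns their sum, 6.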



Answer 2:

It rationally has to be undefined. Otherwise, the necessary behavior of a C program running under something like Valgrind, which diagnoses reads of uninitialized memory and throws appropriate errors when they occur, would be illegal under the standard.

Reading the standard, the key question is whether the values of malloc'ed memory are "unspecified values" (which must be some readable value) or "indeterminate values" (which may be trap representations; cf. definition 3.17.2).

As per 7.20.3.3, quoted in the other answers, malloc returns a block of memory which contains indeterminate values, and therefore may contain trap representations. The relevant discussion of trap representations is 6.2.6.1, part 5:

Certain object representations need not represent a value of the object type. If the stored value of an object has such a representation and is read by an lvalue expression that does not have character type, the behavior is undefined. ... Such a representation is called a trap representation.

So, there you go. Basically, the C implementation is permitted to detect (i.e., "trap") references to indeterminate values, and deal with that error how it chooses, including in undefined ways.
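
To make the "implementation may do what it likes with uninitialized memory" point concrete, here is a sketch of a debugging-style allocator wrapper that fills every new block with a fixed byte pattern. The names poison_malloc, all_bytes_poisoned and POISON_BYTE, and the value 0xCD, are assumptions made for this example and are not taken from the standard or from the thread. A program that relied on fresh heap memory holding any particular contents would break under such a wrapper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define POISON_BYTE 0xCD /* arbitrary pattern chosen for this sketch */

/* Hypothetical wrapper: allocate, then overwrite the whole block
   with POISON_BYTE so stale contents never leak through. */
void *poison_malloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
        memset(p, POISON_BYTE, size);
    return p;
}

/* Checks, byte by byte through unsigned char, that a fresh block
   from poison_malloc holds only the pattern.
   Precondition: size > 0.
   Returns 1 if all bytes match, 0 if not, -1 if allocation fails. */
int all_bytes_poisoned(size_t size)
{
    assert(size > 0);

    unsigned char *p = poison_malloc(size);
    if (p == NULL)
        return -1;

    int ok = 1;
    for (size_t i = 0; i < size; i++) {
        if (p[i] != POISON_BYTE)
            ok = 0;
    }

    free(p);
    return ok;
}
```

With this wrapper, an int read from a fresh block would see whatever value the repeated pattern happens to encode on the platform, never zero. That is one way an implementation or tool can make read-before-write bugs visible.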



Answer 3:

ISO/IEC 9899:1999, 7.20.3.3 The malloc function:

The malloc function allocates space for an object whose size is specified by size and whose value is indeterminate.

6.2.6.1 Representation of types, §5:

Certain object representations need not represent a value of the object type. If the stored value of an object has such a representation and is read by an lvalue expression that does not have character type, the behavior is undefined.

And footnote 41 makes it even more explicit (at least for automatic variables):

Thus, an automatic variable can be initialized to a trap representation without causing undefined behavior, but the value of the variable cannot be used until a proper value is stored in it.
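
A small sketch of what the footnote describes for automatic variables. The function name pick and its values are made up for illustration. The variable is declared without an initializer, which is fine in itself, and is read only after every path has stored a proper value in it.

```c
#include <assert.h>

/* Hypothetical example. Precondition: flag is 0 or 1. */
int pick(int flag)
{
    assert(flag == 0 || flag == 1);

    int x; /* indeterminate here; declaring it is not undefined behavior */

    /* Using x before this point would be a read of an indeterminate value. */
    if (flag)
        x = 10;
    else
        x = 20;

    return x; /* fine: a proper value was stored on every path */
}
```

Here pick(1) returns 10 and pick(0) returns 20.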