ISO/IEC 9899:TC2 (i.e. the C99 standard), §7.20.3 states:
If the size of the space requested is zero, the behavior is implementation-defined:
either a null pointer is returned, or the behavior is as if the size were some
nonzero value, except that the returned pointer shall not be used to access an object.
In other words, malloc(0) may either return NULL or a valid pointer which I may not dereference.
What is the rationale behind this behavior?
And wouldn't it be easier to just define that malloc(0) leads to UB?
The C99 Rationale (PDF link) discusses the memory management functions (from C99 7.20.3) and explains:
The treatment of null pointers and zero-length allocation requests in the definition of these functions was in part guided by a desire to support this paradigm:
OBJ * p; // pointer to a variable list of OBJs
/* initial allocation */
p = (OBJ *) calloc(0, sizeof(OBJ));
/* ... */
/* reallocations until size settles */
while(1) {
    p = (OBJ *) realloc((void *)p, c * sizeof(OBJ));
    /* change value of c or break out of loop */
}
This coding style, not necessarily endorsed by the Committee, is reported to be in widespread use.
Some implementations have returned non-null values for allocation requests of zero bytes.
Although this strategy has the theoretical advantage of distinguishing between "nothing" and "zero" (an unallocated pointer vs. a pointer to zero-length space), it has the more compelling theoretical disadvantage of requiring the concept of a zero-length object.
Since such objects cannot be declared, the only way they could come into existence would be through such allocation requests.
The C89 Committee decided not to accept the idea of zero-length objects. The allocation
functions may therefore return a null pointer for an allocation request of zero bytes. Note that this treatment does not preclude the paradigm outlined above.
QUIET CHANGE IN C89: A program which relies on size-zero allocation requests returning a non-null pointer will behave differently.
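To make the quoted paradigm concrete, here is a minimal runnable sketch of it. The OBJ type, the build_list function and its growth policy (one element per iteration) are assumptions made up for illustration; they are not from the Rationale. It relies only on two guarantees of the standard: realloc with a null pointer behaves like malloc, and free accepts a null pointer.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int id; } OBJ;   /* hypothetical element type */

/* Grows a list from a zero-length allocation up to final_count elements.
   Returns the number of elements stored, or (size_t)-1 if realloc fails. */
size_t build_list(size_t final_count)
{
    /* initial allocation: may yield NULL or a non-null pointer */
    OBJ *p = calloc(0, sizeof(OBJ));
    size_t c = 0;

    /* reallocations until size settles */
    while (c < final_count) {
        /* valid for both outcomes above: realloc(NULL, n) acts like malloc(n) */
        OBJ *tmp = realloc(p, (c + 1) * sizeof(OBJ));
        if (tmp == NULL) {
            free(p);              /* the old block is still owned by us */
            return (size_t)-1;
        }
        p = tmp;
        p[c].id = (int)c;         /* only touch elements that were requested */
        c++;
    }

    assert(c == final_count);
    free(p);                      /* fine even if p is still NULL */
    return c;
}
```

Whichever result the zero-size calloc produces, the loop and the final free never need to know about it, which seems to be the point of the Rationale's remark that the paradigm is not precluded.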
Because allocating 0 bytes can actually make sense, for example when you are allocating an array with an unknown number of items. UB would allow the program to crash, whereas with the current behaviour you can safely allocate numberOfItems * itemSize bytes.
The logic is as follows: if you ask for 0 bytes, you get a pointer back (possibly a null one). Of course, you must not dereference it, as that would access the 0-th byte (which you haven't allocated). But you can safely free the memory afterwards, since free accepts a null pointer as well. So you don't need to make 0 a special case.
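A small sketch of what this looks like in practice. The function name sum_of_indices and the choice of long as the item type are just assumptions for the example, and it assumes numberOfItems * itemSize does not overflow. The only place where zero shows up is the error check, because a null result for a zero-byte request is not a failure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Fills a heap array with 0..numberOfItems-1 and returns the sum,
   or -1 if the allocation genuinely failed. */
long sum_of_indices(size_t numberOfItems)
{
    size_t itemSize = sizeof(long);
    long *items = malloc(numberOfItems * itemSize);

    /* NULL is only an error if bytes were actually requested */
    if (items == NULL && numberOfItems != 0)
        return -1;

    /* from here on: either there are no items, or items is usable */
    assert(items != NULL || numberOfItems == 0);

    long total = 0;
    for (size_t i = 0; i < numberOfItems; i++) {  /* body never runs for 0 items */
        items[i] = (long)i;
        total += items[i];
    }

    free(items);   /* valid for NULL and for a zero-size block alike */
    return total;
}
```

With zero items the pointer is never dereferenced, and free gets called exactly once on whatever malloc returned, so the same code path serves empty and non-empty arrays.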
This was about why malloc(0) is not defined as UB. As for the decision not to define the result strictly (NULL vs. a unique pointer to empty space), see James' answer. (In short: both approaches have their advantages and disadvantages. The idea of returning a unique non-null pointer is more compelling, but requires more conceptual work and puts more burden on implementors.)
Making malloc(0) result in UB would be much worse. As it stands, you don't have to care what happens when the size is zero, as long as you're consistent about calling free.
The problem is that some existing implementations allocate a pointer for malloc(0), some return null pointers, and almost all of them are adamant about sticking with their behavior because the same people who wrote the implementations wrote lots of bad, non-portable software that makes use of their chosen behavior (GNU being among the worst offenders in this area). Thus the standard got stuck allowing both behaviors to keep them all happy.
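If your own code really wants one of the two behaviors, you can get it portably instead of depending on the implementation. A minimal sketch, assuming you want the "always a distinct pointer" flavor; the wrapper name malloc_never_empty is made up for this example and is not a standard function:

```c
#include <stdlib.h>

/* Hypothetical wrapper: a zero-byte request is rounded up to one byte,
   so a null return always means the allocation failed. */
void *malloc_never_empty(size_t size)
{
    return malloc(size != 0 ? size : 1);
}
```

The caller still must not use the result of a zero-size request to access an object, and still passes it to free when done. The wrapper only removes the ambiguity of a null return.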