I came across a situation where it would be useful to have unnecessary calls to realloc optimized out. However, it seems that neither Clang nor GCC does such a thing (Compiler Explorer (godbolt.org)), although I do see optimizations being made with multiple calls to malloc.
The example:
#include <stdlib.h>

void *myfunc() {
    void *data;
    data = malloc(100);
    data = realloc(data, 200);
    return data;
}
I expected it to be optimized to something like the following:
void *myfunc() {
    return malloc(200);
}
Why does neither Clang nor GCC optimize it out? Are they not allowed to do so?
Maybe they are allowed to, but the optimization may have been skipped in this case because of corner-case functional differences.
If 150 bytes of allocatable memory remain:

data = malloc(100); data = realloc(data, 200);

returns NULL, with 100 bytes consumed (and leaked) and 50 remaining.

data = malloc(200);

returns NULL, with 0 bytes consumed (none leaked) and 150 remaining.

Different functionality in this narrow case may prevent the optimization.
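A minimal sketch of that behavioral difference, with sizes mirroring the scenario above (the function names here are purely illustrative):

#include <stdlib.h>

/* Two-step form: a failing realloc() returns NULL but leaves the
   original 100-byte block allocated, so the caller must free it. */
void *two_step(void) {
    void *data = malloc(100);
    void *bigger = realloc(data, 200);
    if (bigger == NULL) {
        free(data);             /* without this, the 100 bytes leak */
        return NULL;
    }
    return bigger;
}

/* One-step form: on failure nothing was consumed, so nothing leaks. */
void *one_step(void) {
    return malloc(200);
}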
Perhaps; I would expect that it is allowed. Yet it may not be worth the effort to enhance the compiler to determine when it can.
A successful malloc(n); ... realloc(p, 2*n) differs from malloc(2*n); when the code in the ... may have set some of the memory. It might be beyond the compiler's design to ensure that the ..., even if it is empty code, did not set any memory.
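A minimal sketch of such in-between code, assuming memset stands in for the elided "..." and that n is known up front:

#include <stdlib.h>
#include <string.h>

void *grow(size_t n) {
    char *p = malloc(n);
    if (p == NULL)
        return NULL;
    memset(p, 'x', n);             /* the "..." may have set some of the memory */
    char *q = realloc(p, 2 * n);   /* must preserve those first n bytes, so it  */
    if (q == NULL) {               /* cannot simply be folded into malloc(2*n)  */
        free(p);
        return NULL;
    }
    return q;
}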
The compiler is allowed to optimize out multiple calls to functions which are considered pure functions, i.e., functions that do not have any side effects.
So the question is whether realloc() is a pure function or not. The C11 Standard Committee Draft N1570 states this about the realloc function (7.22.3.5):

"The realloc function returns a pointer to the new object (which may have the same value as a pointer to the old object), or a null pointer if the new object has not been allocated."

Notice that the compiler cannot predict at compile time the value of the pointer that will be returned from each call.
This means that realloc() cannot be considered a pure function, and multiple calls to it cannot be optimized out by the compiler.
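For contrast, a sketch using the GCC/Clang pure attribute (lookup is a hypothetical external function):

/* A `pure` function has no side effects, so the compiler may fold
   repeated calls with the same argument into a single call. realloc()
   cannot be declared this way: it allocates and frees storage, and its
   return value cannot be predicted at compile time. */
__attribute__((pure)) extern int lookup(int key);

int twice(int key) {
    return lookup(key) + lookup(key);   /* may be compiled as 2 * lookup(key) */
}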
A compiler which bundles its own self-contained versions of malloc/calloc/free/realloc could legitimately perform the indicated optimization if the authors thought doing so was worth the effort. A compiler that chains to externally-supplied functions could still perform such optimizations if it documented that it did not regard the precise sequence of calls to such functions as an observable side effect, but such treatment could be a bit more tenuous.
If no storage is allocated or deallocated between the malloc() and realloc(), the size of the realloc() is known when the malloc() is performed, and the realloc() size is larger than the malloc() size, then it may make sense to consolidate the malloc() and realloc() operations into a single larger allocation. If the state of memory could change in the interim, however, then such an optimization might cause the failure of operations that should have succeeded. For example, given a sequence along these lines (a sketch; the sizes are illustrative):
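void *p1 = malloc(1000000000);
void *p2 = malloc(100);           /* small block that will be grown below */
free(p1);
p2 = realloc(p2, 2000000000);     /* relies on p1's space having been freed */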
a system might not have 2000000000 bytes available for p2 until after p1 is freed. If the compiler were to change the code to something like:
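void *p1 = malloc(1000000000);
void *p2 = malloc(2000000000);    /* consolidated request, made while p1 is still live */
free(p1);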
that would result in the allocation of p2 failing. Because the Standard never guarantees that allocation requests will succeed, such behavior would not be non-conforming. On the other hand, the following would also be a "conforming" implementation:
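A sketch of such an implementation, in which every allocation request simply fails (which the Standard permits):

#include <stdlib.h>

void *malloc(size_t size) { return 0; }
void *calloc(size_t nmemb, size_t size) { return 0; }
void *realloc(void *ptr, size_t size) { return 0; }
void free(void *ptr) { }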
Such an implementation might arguably be regarded as more "efficient" than most others, but one would have to be rather obtuse to regard it as being very useful except, perhaps, in rare situations where the above functions are called on code paths that are never executed.
I think the Standard would clearly allow the optimization, at least in cases that are as simple as those in the original question. Even in cases where it might cause operations to fail that could otherwise have succeeded, the Standard would still allow it. Most likely, the reason that many compilers don't perform the optimization is that the authors didn't think the benefits would be sufficient to justify the effort required to identify cases where it would be safe and useful.
But you're not checking the return value of the first malloc(), which you're then using in the second realloc(). It could just as well be NULL.
How could the compiler optimize the two calls into a single one without making unwarranted assumptions about the return value of the first?
Then there is another possible scenario. FreeBSD used to have a realloc() which was basically malloc + memcpy + free of the old pointer. Suppose that there are only 230 bytes of free memory left. In that implementation, ptr = malloc(100) followed by realloc(ptr, 200) will fail, because the new 200-byte block must be allocated while the old 100-byte block is still live; but a single malloc(200) will succeed.
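A sketch of that style of realloc() (the old_size parameter is an assumption here; the real implementation tracked block sizes internally):

#include <stdlib.h>
#include <string.h>

void *naive_realloc(void *ptr, size_t old_size, size_t new_size) {
    void *fresh = malloc(new_size);   /* the old block is still allocated here,  */
    if (fresh == NULL)                /* so both blocks must fit in memory at once */
        return NULL;
    if (ptr != NULL) {
        memcpy(fresh, ptr, old_size < new_size ? old_size : new_size);
        free(ptr);
    }
    return fresh;
}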
My understanding is that such an optimization might be forbidden (notably for the, indeed unlikely, case where the malloc succeeds but the following realloc fails).

You could suppose that malloc and realloc always succeed (that is against the C11 standard, n1570; look also into my joke implementation of malloc). Under that hypothesis (strictly speaking wrong, but some Linux systems use memory overcommitment to give that illusion), if you use GCC, you might write your own GCC plugin to make such an optimization.

I am not sure it is worth spending a few weeks or months coding such a GCC plugin (in practice, you probably want it to also handle some code between the malloc and the realloc, and then it is not that simple, since you have to characterize and detect what kind of in-between code is acceptable), but that choice is yours.