Are compilers allowed to optimize out realloc?

2020-05-19 19:16发布

I came across a situation where it would be useful to have unnecessary calls to realloc being optimized out. However, it seems like neither Clang nor GCC do such a thing (Compiler Explorer (godbolt.org)) - although I see optimizations being made with multiple calls to malloc.

The example:

void *myfunc() {
    void *data;
    data = malloc(100);
    data = realloc(data, 200);
    return data;
}

I expected it to be optimized to something like the following:

void *myfunc() {
    return malloc(200);
}

Why is neither Clang nor GCC optimizing it out? - Are they not allowed to do so?

5条回答
We Are One
2楼-- · 2020-05-19 19:35

Are they not allowed to do so?

Maybe, but optimization not done in this case may be due to corner functional differences.


If 150 bytes of allocatable memory remain,
data = malloc(100); data = realloc(data, 200); returns NULL with 100 bytes consumed (and leaked) and 50 remain.

data = malloc(200); returns NULL with 0 bytes consumed (none leaked) and 150 remain.

Different functionality in this narrow case may prevent optimization.


Are compilers allowed to optimize-out realloc?

Perhaps - I would expect it is allowed. Yet it may not be worth the effect to enhance the compiler to determine when it can.

Successful malloc(n); ... realloc(p, 2*n) differs from malloc(2*n); when ... may have set some of the memory.

It might be beyond that compiler's design to ensure ..., even if empty code, did not set any memory.

查看更多
家丑人穷心不美
3楼-- · 2020-05-19 19:58

The compiler is allowed to optimize out multiple calls to functions which are considered pure functions, i.e., functions that do not have any side-effects.

So the question is whether realloc() is a pure function or not.

The C11 Standard Committee Draft N1570 states this about the realloc function:

7.22.3.5 The realloc function ... 2. The realloc function deallocates the old object pointed to by ptr and returns a pointer to a new object that has the size specified by size. The contents of the new object shall be the same as that of the old object prior to deallocation, up to the lesser of the new and old sizes. Any bytes in the new object beyond the size of the old object have indeterminate values.

Returns 4. The realloc function returns a pointer to the new object (which may have the same value as a pointer to the old object), or a null pointer if the new object could not be allocated.

Notice that the compiler cannot predict the value of the pointer at compile time that will be returned from each call.

This means that realloc() cannot be considered a pure function, and multiple calls to it cannot be optimized out by the compiler.

查看更多
欢心
4楼-- · 2020-05-19 19:59

A compiler which bundles its own self-contained versions of malloc/calloc/free/realloc could legitimately perform the indicated optimization if the authors thought doing so was worth the effort. A compiler that chains to externally-supplied functions could still perform such optimizations if it documented that it did not regard the precise sequence of calls to such functions as an observable side-effect, but such treatment could be a bit more tenuous.

If no storage is allocated or deallocated between the malloc() and realloc(), the size of the realloc() is known when the malloc() is performed, and the realloc() size is larger than the malloc() size, then it may make sense to consolidate the malloc() and realloc() operations into a single larger allocation. If the state of memory could change in the interim, however, then such an optimization might cause the failure of operations that should have succeeded. For example, given the sequence:

void *p1 = malloc(2000000000);
void *p2 = malloc(2);
free(p1);
p2 = realloc(p2, 2000000000);

a system might not have 2000000000 bytes available for p2 until after p1 is freed. If it were to change the code to:

void *p1 = malloc(2000000000);
void *p2 = malloc(2000000000);
free(p1);

that would result in the allocation of p2 failing. Because the Standard never guarantees that allocation requests will succeed, such behavior would not be non-conforming. On the other hand, the following would also be a "conforming" implementation:

void *malloc(size_t size) { return 0; }
void *calloc(size_t size, size_t count) { return 0; }
void free(void *p) {  }
void *realloc(void *p, size_t size) { return 0; }

Such an implementation might arguably be regarded as more "efficient" than most others, but one would have to be rather obtuse to regard it as being very useful except, perhaps, in rare situations where the above functions are are called on code paths that are never executed.

I think the Standard would clearly allow the optimization, at least in cases that are as simple as those in the original question. Even in cases where it might cause operations to fail that could otherwise have succeeded, the Standard would still allow it. Most likely, the reason that many compilers don't perform the optimization is that the authors didn't think the benefits would be sufficient to justify the effort required to identify cases where it would be safe and useful.

查看更多
The star\"
5楼-- · 2020-05-19 19:59

But you're not checking the return value of the first malloc() which you're then using in the second realloc(). It could just as well be NULL.

How could the compiler optimize the two calls into a single one without making unwarranted assumptions about the return value of the first?

Then there is another possible scenario. FreeBSD used to have a realloc() which was basically malloc + memcpy + free the old pointer.

Suppose that there are only 230 bytes left of free memory. In that implementation, ptr = malloc(100) followed by realloc(ptr, 200) will fail, but a single malloc(200) will succeed.

查看更多
狗以群分
6楼-- · 2020-05-19 19:59

My understanding is that such an optimization might be forbidden (notably for the -indeed unlikely- case where the malloc succeeds but the following  realloc fails).

You could suppose that malloc and realloc always succeed (that is against the C11 standard, n1570; look also into my joke-implementation of malloc). In that hypothesis (stricto sensu wrong, but some Linux systems have memory overcommitment to give that illusion), if you use GCC, you might write your own GCC plugin to make such an optimization.

I am not sure it is worth spending a few weeks or months to code such a GCC plugin (in practice, you probably want it to handle sometimes some code between malloc and realloc, and then it is not that simple, since you have to characterize and detect what such in-between code is acceptable), but that choice is yours.

查看更多
登录 后发表回答