realloc() dangling pointers and undefined behavior

2019-01-20 16:09发布

When you free memory, what happens to pointers that point into that memory? Do they become invalid immediately? What happens if they later become valid again?

Certainly, the usual case of a pointer going invalid then becoming "valid" again would be some other object getting allocated into what happens to be the memory that was used before, and if you use the pointer to access memory, that's obviously undefined behavior. Dangling pointer memory overwrite lesson 1, pretty much.

But what if the memory becomes valid again for the same allocation? There's only one Standard way for that to happen: realloc(). If you have a pointer to somewhere within a malloc()'d memory block at offset > 1, then use realloc() to shrink the block to less than your offset, your pointer becomes invalid, obviously. If you then use realloc() again grow the block back to at least cover the object type pointed to by the dangling pointer, and in neither case did realloc() move the memory block, is the dangling pointer valid again?

This is such a corner case that I don't really know how to interpret the C or C++ standards to figure it out. The below is a program that shows it.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    static const char s_message[] = "hello there";
    static const char s_kitty[] = "kitty";

    char *string = malloc(sizeof(s_message));
    if (!string)
    {
        fprintf(stderr, "malloc failed\n");
        return 1;
    }

    memcpy(string, s_message, sizeof(s_message));
    printf("%p %s\n", string, string);

    char *overwrite = string + 6;
    *overwrite = '\0';
    printf("%p %s\n", string, string);

    string[4] = '\0';
    char *new_string = realloc(string, 5);
    if (new_string != string)
    {
        fprintf(stderr, "realloc #1 failed or moved the string\n");
        free(new_string ? new_string : string);
        return 1;
    }
    string = new_string;
    printf("%p %s\n", string, string);

    new_string = realloc(string, 6 + sizeof(s_kitty));
    if (new_string != string)
    {
        fprintf(stderr, "realloc #2 failed or moved the string\n");
        free(new_string ? new_string : string);
        return 1;
    }

    // Is this defined behavior, even though at one point,
    // "overwrite" was a dangling pointer?
    memcpy(overwrite, s_kitty, sizeof(s_kitty));
    string[4] = s_message[4];
    printf("%p %s\n", string, string);
    free(string);
    return 0;
}

3条回答
来,给爷笑一个
2楼-- · 2019-01-20 16:38

If you then use realloc() again grow the block back to at least cover the object type pointed to by the dangling pointer, and in neither case did realloc() move the memory block, is the dangling pointer valid again?

No. Unless realloc() returns a null pointer, the call terminates the lifetime of the allocated object, implying that all pointers pointing into it become invalid. If realloc() succeeds, it returns the address of a new object.

Of course, it just might happen that it's the same address as the old one. In that case, using an invalid pointer to the old object to access the new one will generally work in non-optimizing implementations of the C language.

It would still be undefined behaviour, though, and might actually fail with aggressively optimizing compilers.

The C language is unsound, and it's generally up to the programmer to uphold its invariants. Failing to do so will break the implicit contract with the compiler and may result in incorrect code being generated.

查看更多
成全新的幸福
3楼-- · 2019-01-20 16:39

It depends on your definition of "valid". You've perfectly described the situation. If you want to consider that "valid", then it's valid. If you don't want to consider that "valid", then it's invalid.

查看更多
Bombasti
4楼-- · 2019-01-20 16:40

When you free memory, what happens to pointers that point into that memory? Do they become invalid immediately?

Yes, definitely. From section 6.2.4 of the C standard:

The lifetime of an object is the portion of program execution during which storage is guaranteed to be reserved for it. An object exists, has a constant address, and retains its last-stored value throughout its lifetime. If an object is referred to outside of its lifetime, the behavior is undefined. The value of a pointer becomes indeterminate when the object it points to (or just past) reaches the end of its lifetime.

And from section 7.22.3.5:

The realloc function deallocates the old object pointed to by ptr and returns a pointer to a new object that has the size specified by size. The contents of the new object shall be the same as that of the old object prior to deallocation, up to the lesser of the new and old sizes. Any bytes in the new object beyond the size of the old object have indeterminate values.

Note the reference to old object and new object ... by the standard, what you get back from realloc is a different object than what you had before; it's no different from doing a free and then a malloc, and there is no guarantee that the two objects have the same address, even if the new size is <= the old size ... and in real implementations they often won't because objects of different sizes are drawn from different free lists.

What happens if they later become valid again?

There's no such animal. Validity isn't some event that takes place, it's an abstract condition placed by the C standard. Your pointers might happen to work in some implementation, but all bets are off once you free the memory they point into.

But what if the memory becomes valid again for the same allocation? There's only one Standard way for that to happen: realloc()

Sorry, no, the C Standard does not contain any language to that effect.

If you then use realloc() again grow the block back to at least cover the object type pointed to by the dangling pointer, and in neither case did realloc() move the memory block

You can't know whether it will ... the standard does not guarantee any such thing. And notably, when you realloc to a smaller size, most implementations modify the memory immediately following the shortened block; reallocing back to the original size will have some garbage in the added part, it won't be what it was before it was shrunk. In some implementations, some block sizes are kept on lists for that block size; reallocating to a different size will give you totally different memory. And in a program with multiple threads, any freed memory can be allocated in a different thread between the two reallocs, in which case the realloc for a larger size will be forced to move the object to a different location.

is the dangling pointer valid again?

See above; invalid is invalid; there's no going back.

This is such a corner case that I don't really know how to interpret the C or C++ standards to figure it out.

It's not any sort of corner case and I don't know what you're seeing in the standard, which is quite clear that freed memory has indeteterminate content and that the values of any pointers to or into it are also indeterminate, and makes no claim that they are magically restored by a later realloc.

Note that modern optimizing compilers are written to know about undefined behavior and take advantage of it. As soon as you realloc string, overwrite is invalid, and the compiler is free to trash it ... e.g., it might be in a register that the compiler reallocates for temporaries or parameter passing. Whether any compiler does this, it can, precisely because the standard is quite clear about pointers into objects becoming invalid when the object's lifetime ends.

查看更多
登录 后发表回答