malloc-free-malloc and strict-aliasing

2019-02-23 00:19发布

问题:

I've been trying to understand a particular aspect of strict aliasing recently, and I think I have made the smallest possible interesting piece of code. (Interesting for me, that is!)

Update: Based on the answers so far, it's clear I need to clarify the question. The first listing here is "obviously" defined behaviour, from a certain point of view. The real issue is to follow this logic through to custom allocators and custom memory pools. If I malloc a large block of memory at the start, and then write my own my_malloc and my_free that uses that single large block, then is it UB on the grounds that it doesn't use the official free?

I'll stick with C, somewhat arbitrarily. I get the impression it is easier to talk about, that the C standard is a bit clearer.

int main() {
    uint32_t *p32 = malloc(4);
    *p32 = 0;
    free(p32);

    uint16_t *p16 = malloc(4);
    p16[0] = 7;
    p16[1] = 7;
    free(p16);
}

It is possible that the second malloc will return the same address as the first malloc (because it was freed in between). That means that it is accessing the same memory with two different types, which violates strict aliasing. So surely the above is undefined behaviour (UB)?

(For simplicity, let's assume the malloc always succeeds. I could add in checks for the return value of malloc, but that would just clutter the question)

If it's not UB, why? Is there an explicit exception in the standard, which says that malloc and free (and calloc/realloc/...) are allowed to "delete" the type associated with a particular address, allowing further accesses to "imprint" a new type on the address?

If malloc/free are special, then does that mean I cannot legally write my own allocator which clones the behaviour of malloc? I'm sure there are plenty of projects out there with custom allocators - are they all UB?

Custom allocators

If we decide, therefore, that such custom allocators must be defined behaviour, then it means the strict aliasing rule is essentially "incorrect". I would update it to say that it is possible to write (not read) through a pointer of a different ('new') type as long as you don't use pointers of the old type any more. This wording could be quietly-ish changed if it was confirmed that all compilers have essentially obeyed this new rule anyway.

I get the impression that gcc and clang essentially respect my (aggressive) reinterpretation. If so, perhaps the standards should be edited accordingly? My 'evidence' regarding gcc and clang is difficult to describe, it uses memmove with an identical source and destination (which is therefore optimized out) in such a way that it blocks any undesirable optimizations because it tells the compiler that future reads through the destination pointer will alias the bit pattern that was previously written through the source pointer. I was able to block the undesirable interpretations accordingly. But I guess this isn't really 'evidence', maybe I was just lucky. UB clearly means that the compiler is also allowed to give me misleading results!


( ... unless, of course, there is another rule that makes memcpy and memmove special in the same way that malloc may be special. That they are allowed to change the type to the type of the destination pointer. That would be consistent with my 'evidence'. )


Anyway, I'm rambling. I guess a very short answer would be: "Yes, malloc (and friends) are special. Custom allocators are not special and are therefore UB, unless they maintain separate memory pools for each type. And, further, see example X for an extreme piece of code where compiler Y does undesirable stuff precisely because compiler Y is very strict in this regard and is contradicting this reinterpretation."


Follow up: what about non-malloced memory? Does the same thing apply. (Local variables, static variables, ...)

回答1:

Note: This only answers initial question, not the part about custom allocators.


No, it's not UB, because p16 now holds different object and the former is gone after you invoked free(p32).

Note that malloc() returns pointer that is pre-aligned for every object, thus this avoids breaking of strict aliasing in practical terms. From C11 (N1570) 7.22.3/p1 Memory management functions (emphasis mine):

The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object with a fundamental alignment requirement and then used to access such an object or an array of such objects in the space allocated (until the space is explicitly deallocated). The lifetime of an allocated object extends from the allocation until the deallocation.



回答2:

Here are the C99 strict aliasing rules in (what I hope is) their entirety:

6.5
(6) The effective type of an object for an access to its stored value is the declared type of the object, if any. If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value. If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one. For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access.

(7) An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
— a type compatible with the effective type of the object,
— a qualified version of a type compatible with the effective type of the object,
— a type that is the signed or unsigned type corresponding to the effective type of the object,
— a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
— an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
— a character type.

These two clauses together prohibit one specific case, storing a value via an lvalue of type X and then subsequently retrieving a value via an lvalue of type Y incompatible with X.

So, as I read the standard, even this usage is perfectly OK (assuming 4 bytes are enough to store either an uint32_t or two uint16_t).

int main() {
    uint32_t *p32 = malloc(4);
    *p32 = 0;
    /* do not do this: free(p32); */

    /* do not do this: uint16_t *p16 = malloc(4); */
    /* do this instead: */
    uint16_t *p16 = (uint16_t *)p32;

    p16[0] = 7;
    p16[1] = 7;
    free(p16);
}

There's no rule that prohibits storing an uint32_t and then subsequently storing an uint16_t at the same address, so we're perfectly OK.

Thus there's nothing that would prohibit writing a fully compliant pool allocator.



回答3:

Your code is correct C and does not invoke undefined behaviour (except that you do not test malloc return value) because :

  • you allocate a bloc of memory, use it and free it
  • you allocate another bloc of memory, use it and free it.

What is undefined is whether p16 will receive same value as p32 had at a different time

What would be undefined behaviour, even if value was the same would be to access p32 after it has been freed. Examples :

int main() {
    uint32_t *p32 = malloc(4);
    *p32 = 0;
    free(p32);

    uint16_t *p16 = malloc(4);
    p16[0] = 7;
    p16[1] = 7;
    if (p16 == p32) {         // whether p16 and p32 are equal is undefined
        uint32_t x = *p32;  // accessing *p32 is explicitely UB
    }
    free(p16);
}

It is UB because you try to access a memory block after it has been freed. And even when it does point to a memory block, that memory block has been initialized as an array of uint16_t, using it as a pointer to another type is formally undefined behaviour.


Custom allocation (assuming a C99 conformant compiler) :

So you have a big chunk of memory and want to write custom free and malloc functions without UB. It is possible. Here I will not go to far into the hard part of management of allocated and free blocs, and just give hints.

  1. you will need to know what it the strictest alignement for the implementation. stdlib malloc knows it because 7.20.3 §1 of C99 language specification (draft n1256) says : The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object. It is generally 4 on 32 bits systems and 8 on 64 bits systems, but might be greater or lesser ...
  2. you memory pool must be a char array because 6.3.2.3 §7 says : A pointer to an object or incomplete type may be converted to a pointer to a different object or incomplete type. If the resulting pointer is not correctly aligned for the pointed-to type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer. When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object. : that means that provided you can deal with the alignement, a character array of correct size can be converted to a pointer to an arbitrary type (and is the base of malloc implementation)
  3. You must make your memory pool start at an address compatible with the system alignement :

    intptr_t orig_addr = chunk;
    int delta = orig_addr % alignment;
    char *pool = chunk + alignement - delta; /* pool in now aligned */
    

You now only have to return from your own pool addresses of blocs got as pool + n * alignement and converted to void * : 6.3.2.3 §1 says : A pointer to void may be converted to or from a pointer to any incomplete or object type. A pointer to any incomplete or object type may be converted to a pointer to void and back again; the result shall compare equal to the original pointer.

It would be cleaner with C11, because C11 explicitely added _Alignas and alignof keywords to explictely deal with it and it would be better than the current hack. But it should work nonetheless

Limits :

I must admit that my interpretation of 6.3.2.3 §7 is that a pointer to a correctly aligned char array can be converted to a pointer of another type is not really neat and clear. Some may argue that what is said is just that if it originally pointed to the other type, it can be used as a char pointer. But as I start from a char pointer it is not explicitely allowed. That's true, but it is the best that can be done, it is not explicely marked as undefined behaviour ... and it is what malloc does under the hood.

As alignement is explicitely implementation dependant, you cannot create a general library usable on any implementation.



回答4:

The actual rules regarding aliasing are laid out in standard section 6.5, paragraph 7. Note the wording:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

(emphasis mine)

Aliasing includes the notion of objects, not just general memory. For malloc to have returned the same address on a second use requires the original object to have been deallocated. Even if it has the same address, it is not considered the same object. Any attempts to access the first object through dangling pointers leftover after free are UB for completely different reasons, so there's no aliasing because any continued use of the first pointer p32 is invalid anyway.