C memory allocator and strict aliasing

2019-01-18 06:23发布

even after reading quite a bit about the strict-aliasing rules I am still confused. As far as I have understood this, it is impossible to implement a sane memory allocator that follows these rules, because malloc can never reuse freed memory, as the memory could be used to store different types at each allocation.

Clearly this cannot be right. What am I missing? How do you implement an allocator (or a memory pool) that follows strict-aliasing?

Thanks.

Edit: Let me clarify my question with a stupid simple example:

// s == 0 frees the pool
void *my_custom_allocator(size_t s) {
    static void *pool = malloc(1000);
    static int in_use = FALSE;
    if( in_use || s > 1000 ) return NULL;
    if( s == 0 ) {
        in_use = FALSE;
        return NULL;
    }
    in_use = TRUE;
    return pool;
}

main() {
    int *i = my_custom_allocator(sizeof(int));
    //use int
    my_custom_allocator(0);
    float *f = my_custom_allocator(sizeof(float)); //not allowed...
}

4条回答
对你真心纯属浪费
2楼-- · 2019-01-18 06:39

Within the allocator itself, only refer to your memory buffers as (void *). when it is optimized, the strict-aliasing optimizations shouldn't be applied by the compiler (because that module has no idea what types are stored there). when that object gets linked into the rest of the system, it should be left well-enough alone.

Hope this helps!

查看更多
【Aperson】
3楼-- · 2019-01-18 06:43

I post this answer to test my understanding of strict aliasing:

Strict aliasing matters only when actual reads and writes happen. Just as using multiple members of different type of an union simultaneously is undefined behavior, the same applies to pointers as well: you cannot use pointers of different type to access the same memory for the same reason you cannot do it with an union.

If you consider only one of the pointers as live, then it's not a problem.

  • So if you write through an int* and read through an int*, it is OK.
  • If you write using an int* and read through an float*, it is bad.
  • If you write using an int* and later you write again using float*, then read it out using a float*, then it's OK.

In case of non-trivial allocators you have a large buffer, which you typically store it in a char*. Then you make some sort of pointer arithmetic to calculate the address you want to allocate and then dereference it through the allocator's header structs. It doesn't matter what pointers do you use to do the pointer arithmetic only the pointer you dereference the area through matters. Since in an allocator you always do that via the allocator's header struct, you won't trigger undefined behavior by that.

查看更多
狗以群分
4楼-- · 2019-01-18 06:44

I don't think you're right. Even the strictest of strict aliasing rules would only count when the memory is actually allocated for a purpose. Once an allocated block has been released back to the heap with free, there should be no references to it and it can be given out again by malloc.

And the void* returned by malloc is not subject to the strict aliasing rule since the standard explicitly states that a void pointer can be cast into any other sort of pointer (and back again). C99 section 7.20.3 states:

The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object and then used to access such an object or an array of such objects in the space allocated (until the space is explicitly deallocated).


In terms of your update (the example) where you don't actually return the memory back to the heap, I think your confusion arises because allocated object are treated specially. If you refer to 6.5/6 of C99, you see:

The effective type of an object for an access to its stored value is the declared type of the object, if any (footnote 75: Allocated objects have no declared type).

Re-read that footnote, it's important.

If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value.

If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one.

For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access.

In other words, the allocated block contents will become the type of the data item that you put in there.

If you put a float in there, you should only access it as a float (or compatible type). If you put in an int, you should only process it as an int (or compatible type).

The one thing you shouldn't do is to put a specific type of variable into that memory and then try to treat it as a different type - one reason for this being that objects are allowed to have trap representations (which cause undefined behaviour) and these representations may occur due to treating the same object as different types.

So, if you were to store an int in there before the deallocation in your code, then reallocate it as a float pointer, you should not try to use the float until you've actually put one in there. Up until that point, the type of the allocated is not yet float.

查看更多
姐就是有狂的资本
5楼-- · 2019-01-18 06:50

Standard C does not define any efficient means by which a user-written memory allocator can safely take a region of memory that has been used as one type and make it safely available as another. Structures in C are guaranteed not to trap representations--a guarantee which would have little purpose if it didn't make it safe to copy structures with fields containing Indeterminate Value.

The difficulty is that given a structure and function like:

struct someStruct {unsigned char count; unsigned char dat[7]; }
void useStruct(struct someStruct s); // Pass by value

it should be possible to invoke it like:

someStruct *p = malloc(sizeof *p);
p->count = 1;
p->dat[0] = 42;
useStruct(*p);

without having to write all of the fields of the allocated structure first. Although malloc will guarantee that the allocation block it returns may be used by any type, there is no way for user-written memory-management functions to enable such reuse of storage without either clearing it in bytewise fashion (using a loop or memset) or else using free() and malloc() to recycle the storage.

查看更多
登录 后发表回答