If free() knows the length of my array, why can

Posted 2019-01-13 15:28

Question:

I know that it's a common convention to pass the length of dynamically allocated arrays to functions that manipulate them:

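#include <stdio.h>
#include <stdlib.h>

void initializeAndFree(int* anArray, size_t length);

int main(void){
    size_t arrayLength = 0;
    if (scanf("%zu", &arrayLength) != 1) {   // %zu is the conversion for size_t
        return 1;
    }
    int* myArray = malloc(sizeof(int)*arrayLength);
    if (myArray == NULL) {
        return 1;
    }

    initializeAndFree(myArray, arrayLength);
    return 0;
}

void initializeAndFree(int* anArray, size_t length){
    size_t i = 0;   // same type as length, avoids a signed/unsigned comparison
    for (i = 0; i < length; i++) {
        anArray[i] = 0;
    }
    free(anArray);
}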

but if there's no way for me to get the length of the allocated memory from a pointer, how does free() "automagically" know what to deallocate when all I'm giving it is the very same pointer? Why can't I get in on the magic, as a C programmer?

Where does free() get its free (har-har) knowledge from?

Answer 1:

Besides Klatchko's correct point that the standard does not provide for it, real malloc/free implementations often allocate more space than you ask for. For example, if you ask for 12 bytes the allocator may hand you 16 (see A Memory Allocator, which notes that 16 is a common size). So free() doesn't need to know that you asked for 12 bytes, only that it gave you a 16-byte chunk.



Answer 2:

You can't get it because the C committee did not require that in the standard.

If you are willing to write some non-portable code, you may have luck with:

*((size_t *)ptr - 1)

or maybe:

*((size_t *)ptr - 2)

But whether that works will depend on exactly where the implementation of malloc you are using stores that data.



Answer 3:

While it is possible to read the metadata that the memory allocator places before the allocated block, this only works if the pointer really does point to the start of a dynamically allocated block. A function that relied on it would be far less useful, because every argument passed to it would have to be such a pointer rather than, say, a simple auto or static array.

The point is that there is no portable way to tell, by inspecting a pointer, what kind of memory it points to. So while it is an interesting idea, it is not a particularly safe proposition.

A method that is safe and portable would be to reserve the first word of the allocation to hold the length. GCC (and perhaps some other compilers) supports a non-portable way of implementing this, using a structure with a zero-length array, which simplifies the code somewhat compared to a portable solution:

typedef struct
{
    size_t length ;
    char   alloc[0] ;   // Compiler specific extension!!!
} tSizedAlloc ;

// Allocating a sized block (needs <stdlib.h>; length is the payload size in bytes)
tSizedAlloc* blk = malloc( sizeof(tSizedAlloc) + length ) ;
blk->length = length ;

// Accessing the size and data information of the block
size_t blk_length = blk->length ;
char*  data = blk->alloc ;


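Answer 4:

After reading Klatchko's answer, I tried it myself, and ((size_t *)ptr)[-1] does indeed hold the size of the block actually allocated (usually more than was requested, presumably to leave room for bookkeeping and alignment).

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
  char *a = malloc(1);
  if (a == NULL) return 1;
  /* Non-portable: reads the allocator's bookkeeping word just before the block */
  printf("%zu\n", ((size_t *)a)[-1]);   //prints 17 on my system
  free(a);
  return 0;
}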

Trying different sizes, the stored value on my system (compiled with GCC) changes as follows:

The smallest value stored is 17 bytes.
The stored value is at least 5 bytes more than the requested size; when the request no longer fits, it goes up by another 8 bytes.

  • If size is [0,12], memory allocated is 17.
  • If size is [13], memory allocated is 25.
  • If size is [20], memory allocated is 25.
  • If size is [21], memory allocated is 33.


Answer 5:

I know this thread is a little old, but I still have something to add. There is a function (or a macro, I haven't checked the library yet), malloc_usable_size(), which returns the size of a block of memory allocated from the heap. The man page states that it is intended only for debugging, since it reports not the number of bytes you asked for but the number actually allocated, which may be a little larger. Note that it is a GNU extension.

On the other hand, it may not even be needed, because I believe that to free a memory chunk you don't have to know its size. Just remove the handle/descriptor/structure that is in charge of the chunk.



Answer 6:

A non-standard way is to use _msize(), which is specific to Microsoft's C runtime. Using this function makes your code non-portable. Also, the documentation is not very clear on whether it returns the number passed to malloc() or the real block size (which might be greater).



Answer 7:

It's up to the malloc implementor how to store this data. Most often, the length is stored directly in front of the allocated memory (that is, if you ask for 7 bytes, 7+x bytes are allocated in reality, where the x additional bytes hold the metadata). Sometimes the metadata is stored both before and after the allocated memory, to check for heap corruption. But the implementor can just as well choose to use a separate data structure to store the metadata.



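Answer 8:

You can allocate extra memory to store the size:

#include <stdio.h>
#include <stdlib.h>

/* Allocates n elements of the given size and stores n in a hidden header.
   Note: the returned pointer is offset by sizeof(size_t), which may not be
   aligned strictly enough for every type. */
void *my_malloc(size_t n, size_t size)
{
    void *p = malloc( (n * size) + sizeof(size_t) );
    if( p == NULL ) return NULL;
    *( (size_t*)p ) = n;
    return (char*)p + sizeof(size_t);
}
/* Steps back over the hidden header before handing the block to free(). */
void my_free(void *p)
{
    if( p == NULL ) return;
    free( (char*)p - sizeof(size_t) );
}
void *my_realloc(void *oldp, size_t new_size)
{
     ...
}
int main(void)
{
    char *p = my_malloc( 20, 1 );
    if( p == NULL ) return 1;
    printf("%zu\n", ((size_t*)p)[-1] );   /* the stored count: 20 */
    my_free( p );
    return 0;
}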


Answer 9:

To answer the question about delete[], early versions of C++ actually required that you call delete[n] and tell the runtime the size, so it didn't have to store it. Sadly, this behaviour was removed as "too confusing".

(See D&E, Stroustrup's The Design and Evolution of C++, for details.)