How do free and malloc work in C?

Posted 2019-01-01 12:42

I'm trying to figure out what would happen if I try to free a pointer "from the middle". For example, look at the following code:

char *ptr = (char*)malloc(10*sizeof(char));

for (char i=0 ; i<10 ; ++i)
{
    ptr[i] = i+10;
}
++ptr;
++ptr;
++ptr;
++ptr;
free(ptr);

I get a crash with an "Unhandled exception" error message. I want to understand why, and how free works, so that I not only know how to use it but can also understand weird errors and exceptions and debug my code better.

Thanks a lot

8 answers
泛滥B
#2 · 2019-01-01 12:51

You're freeing the wrong address. By changing the value of ptr, you change the address. free has no way of knowing that it should try to free a block starting 4 bytes back. Keep the original pointer intact and free that instead of the manipulated one. As others pointed out, the results of doing what you're doing are "undefined"... hence the unhandled exception.
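A minimal sketch of "keep the original pointer intact", based on the code in the question. The function name `fill_and_read_fourth` and the names `base` and `cursor` are made up for illustration; the point is only that a copy of the pointer is advanced while the value returned by malloc is the one handed to free.

```c
#include <stdlib.h>

/* Hypothetical helper: fills a 10-byte buffer as in the question, walks it
   with a separate cursor, and frees the original pointer.
   Returns the byte found 4 positions in, or -1 if allocation fails. */
int fill_and_read_fourth(void)
{
    char *base = malloc(10 * sizeof *base);  /* the address free() must get */
    if (base == NULL)
        return -1;

    for (int i = 0; i < 10; ++i)
        base[i] = (char)(i + 10);

    char *cursor = base + 4;  /* advance a copy, not the original */
    int value = *cursor;      /* same byte as base[4] */

    free(base);               /* exactly the address malloc returned */
    return value;
}
```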

只靠听说
#3 · 2019-01-01 12:54

Most (if not all) implementations look up the amount of data to free a few bytes before the pointer you pass in. Freeing a wild pointer will lead to corruption of the allocator's memory map.

In your example, when you allocate 10 bytes of memory, the system actually reserves, let's say, 14. The first 4 contain the amount of data you requested (10), and the return value of malloc is a pointer to the first byte of unused data in the 14 allocated.

When you call free on this pointer, the system looks 4 bytes backwards, reads the stored size (10), and so knows that it originally allocated 14 bytes and how much to free. This scheme spares you from having to pass the amount of data to free as an extra parameter to free itself.

Of course, other implementations of malloc/free may achieve this in other ways. But they generally don't support calling free on a pointer other than the one returned by malloc or an equivalent function.
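The size-before-the-pointer scheme described above can be sketched as a toy wrapper around malloc. This is an illustration under assumptions, not how any particular C library does it: `toy_header`, `toy_malloc`, `toy_size` and `toy_free` are hypothetical names, the header is padded with C11 `max_align_t` rather than being exactly 4 bytes, and real allocators store more than just the size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical header stored in front of every block. The union pads it to
   the strictest fundamental alignment so the user pointer stays aligned. */
typedef union {
    size_t size;
    max_align_t align;
} toy_header;

/* Reserves header + n bytes, records n, returns the address after the header. */
void *toy_malloc(size_t n)
{
    if (n > SIZE_MAX - sizeof(toy_header))
        return NULL;
    toy_header *h = malloc(sizeof(toy_header) + n);
    if (h == NULL)
        return NULL;
    h->size = n;
    return h + 1;
}

/* Steps back one header to read the recorded size. Only valid for a pointer
   returned by toy_malloc; an adjusted pointer would read garbage here. */
size_t toy_size(const void *p)
{
    assert(p != NULL);
    return ((const toy_header *)p - 1)->size;
}

/* Steps back one header and releases the whole block. */
void toy_free(void *p)
{
    if (p == NULL)
        return;
    free((toy_header *)p - 1);
}
```

With this layout, passing `p + 4` to `toy_free` would make it step back to an address that malloc never returned, which is the same kind of failure the question runs into.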

闭嘴吧你
#4 · 2019-01-01 12:59

When you malloc a block, the allocator actually reserves a bit more memory than you asked for. This extra memory is used to store information such as the size of the allocated block, a link to the next free/used block in a chain of blocks, and sometimes some "guard data" that helps the system detect if you write past the end of your allocated block. Also, most allocators round up the total size and/or the start of your part of the memory to a multiple of some number of bytes (e.g. on a 64-bit system the data may be aligned to a multiple of 64 bits (8 bytes), as accessing data at non-aligned addresses can be more difficult and inefficient for the processor/bus), so you may also end up with some "padding" (unused bytes).
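The rounding mentioned above is simple arithmetic. A sketch, assuming a power-of-two alignment; `round_up` and `block_footprint` are hypothetical names, and the header size and alignment used by a real allocator are implementation details that vary.

```c
#include <assert.h>
#include <stddef.h>

/* Rounds n up to the next multiple of align; align must be a power of two. */
size_t round_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Hypothetical footprint of one block: bookkeeping header plus the requested
   bytes, rounded up to the alignment. The difference is padding. */
size_t block_footprint(size_t request, size_t header, size_t align)
{
    return round_up(header + request, align);
}
```

For instance, a 10-byte request with an assumed 8-byte header and 16-byte alignment occupies 32 bytes, 14 of which are padding.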

When you free your pointer, the allocator uses that address to find the special information it added (usually) at the beginning of your allocated block. If you pass in a different address, it will access memory that contains garbage, and hence its behaviour is undefined (but most frequently it will result in a crash).

Later, if you free() the block but don't "forget" your pointer, you may accidentally try to access data through that pointer in the future, and the behaviour is undefined. Any of the following situations might occur:

  • the memory might be put in a list of free blocks, so when you access it, it still happens to contain the data you left there, and your code runs normally.
  • the memory allocator may have given (part of) the memory to another part of your program, and that will presumably have then overwritten (some of) your old data, so when you read it, you'll get garbage which might cause unexpected behaviour or crashes from your code. Or you will write over the other data, causing the other part of your program to behave strangely at some point in the future.
  • the memory could have been returned to the operating system (a "page" of memory that you're no longer using can be removed from your address space, so there is no longer any memory available at that address - essentially an unused "hole" in your application's memory). When your application tries to access the data a hard memory fault will occur and kill your process.

This is why it is important to make sure you don't use a pointer after freeing the memory it points at - the best practice for this is to set the pointer to NULL after freeing the memory, because you can easily test for NULL, and attempting to access memory via a NULL pointer will cause a bad but consistent behaviour, which is much easier to debug.
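One way to make the "set it to NULL" habit hard to forget is a small macro. `FREE_AND_NULL` is a hypothetical name, not a standard facility. Note that it only resets the one pointer variable it is given; any other copies of the same address still dangle.

```c
#include <stdlib.h>

/* Hypothetical convenience macro: releases the block and clears the pointer
   variable in one step. free(NULL) is a no-op, so using it twice on the
   same variable is harmless. */
#define FREE_AND_NULL(p) \
    do {                 \
        free(p);         \
        (p) = NULL;      \
    } while (0)
```

After `FREE_AND_NULL(buf);`, a check such as `if (buf != NULL)` reliably tells you the buffer is gone.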

何处买醉
#5 · 2019-01-01 12:59

From http://opengroup.org/onlinepubs/007908775/xsh/free.html

The free() function causes the space pointed to by ptr to be deallocated; that is, made available for further allocation. If ptr is a null pointer, no action occurs. Otherwise, if the argument does not match a pointer earlier returned by the calloc(), malloc(), realloc() or valloc() function, or if the space is deallocated by a call to free() or realloc(), the behaviour is undefined. Any use of a pointer that refers to freed space causes undefined behaviour.

余生请多指教
#6 · 2019-01-01 13:02

That's undefined behaviour, so don't do it. Only free() pointers obtained from malloc() (or calloc()/realloc()), and never adjust them before that.

The problem is free() must be very fast, so it doesn't try to find the allocation your adjusted address belongs to, but instead tries to return the block at exactly the adjusted address to the heap. That leads to undefined behaviour - usually heap corruption or crashing the program.

闭嘴吧你
#7 · 2019-01-01 13:10

You probably know that you are supposed to pass back exactly the pointer you received.

Because free() does not at first know how big your block is, it needs auxiliary information in order to identify the original block from its address and then return it to a free list. It will also try to merge small freed blocks with neighbors in order to produce a more valuable large free block.

Ultimately, the allocator must have metadata about your block; at a minimum, it needs to have stored the length somewhere.

I will describe three ways to do this.

  • One obvious place would be to store it just before the returned pointer. It could allocate a block that is a few bytes larger than requested, store the size in the first word, then return to you a pointer to the second word.

  • Another way would be to keep a separate map describing at least the length of allocated blocks, using the address as a key.

  • An implementation could derive some information from the address and some from a map. The 4.3BSD kernel allocator (called, I think, the "McKusick-Karels allocator") makes power-of-two allocations for objects of less than page size and keeps only a per-page size, making all allocations from a given page of a single size.

It would be possible with some variants of the second type of allocator, and probably with any variant of the third, to actually detect that you have advanced the pointer and do the right thing, although I doubt any implementation would burn the runtime to do so.
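To make the second approach concrete, here is a toy sketch of a separate map keyed by address that can recognise an advanced pointer. Everything in it is hypothetical (`block_record`, `map_malloc`, `map_free`, the fixed 64-entry table), and it simply wraps malloc/free. The linear scan on every free is exactly the runtime cost mentioned above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_BLOCKS 64  /* arbitrary capacity for this sketch */

/* Side table: start address and length of every live block. */
typedef struct {
    char *start;
    size_t size;
} block_record;

static block_record table[MAX_BLOCKS];

/* Allocates n bytes (at least 1) and records the block in the table. */
void *map_malloc(size_t n)
{
    size_t size = n ? n : 1;
    for (size_t i = 0; i < MAX_BLOCKS; ++i) {
        if (table[i].start == NULL) {
            char *p = malloc(size);
            if (p == NULL)
                return NULL;
            table[i].start = p;
            table[i].size = size;
            return p;
        }
    }
    return NULL;  /* table full */
}

/* Frees the tracked block that contains p, even if p points into its middle.
   Returns 1 if a block was released, 0 if p is not inside any tracked block. */
int map_free(void *p)
{
    uintptr_t addr = (uintptr_t)p;
    for (size_t i = 0; i < MAX_BLOCKS; ++i) {
        uintptr_t lo = (uintptr_t)table[i].start;
        if (table[i].start != NULL && addr >= lo && addr - lo < table[i].size) {
            free(table[i].start);  /* release from the recorded start */
            table[i].start = NULL;
            table[i].size = 0;
            return 1;
        }
    }
    return 0;
}
```

With this scheme the question's code (advance the pointer by 4, then free) would succeed, at the price of a search on every call.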
