I read somewhere that it is disastrous to use free
to get rid of an object not created by calling malloc
, is this true? why?
问题:
回答1:
That's undefined behavior - never try it.
Let's see what happens when you try to free()
an automatic variable. The heap manager will have to deduce how to take ownership of the memory block. To do so it will either have to use some separate structure that lists all allocated blocks and that is very slow an rarely used or hope that the necessary data is located near the beginning of the block.
The latter is used quite often and here's how i is supposed to work. When you call malloc() the heap manager allocates a slightly bigger block, stores service data at the beginning and returns an offset pointer. Smth like:
void* malloc( size_t size )
{
void* block = tryAlloc( size + sizeof( size_t) );
if( block == 0 ) {
return 0;
}
// the following is for illustration, more service data is usually written
*((size_t*)block) = size;
return (size_t*)block + 1;
}
then free()
will try to access that data by offsetting the passed pointer but if the pointer is to an automatic variable whatever data will be located where it expects to find service data. Hence undefined behavior. Many times service data is modified by free()
for heap manager to take ownership of the block - so if the pointer passed is to an automatic variable some unrelated memory will be modified and read from.
Implementations may vary but you should never make any specific assumptions. Only call free()
on addresses returned by malloc()
family functions.
回答2:
By the standard, it's "undefined behavior" - i.e. "anything can happen". That's usually bad things, though.
In practice: free
'ing a pointer means modifying the heap. C runtime does virtually never validate if the pointer passed comes from the heap - that would be to costly in either time or memory. Combine these two factoids, and you get "free(non-malloced-ptr) will write something somewhere" - the resutl may be some of "your" data modified behind your back, an access violation, or trashing vital runtime structures, such as a return address on the stack.
Example: A disastrous scenario:
Your heap is implemented as a simple list of free blocks. malloc means removing a suitable block from the list, free means adding it to the list again. (a typical if trivial implementation)
You free() a pointer to a local variable on the stack. You are "lucky" because the modification goes into irrelevant stack space. However, part of the stack is now on your free list.
Because of the allocator design and your allocation patterns, malloc is unlikely to return this block. Later, in an completely unrelated part of the program, you actually do get this block as malloc result, writing to it trashes some local variables up the stack, and when returning some vital pointer contains garbage and your app crashes. Symptoms, repro and location are completely unrelated to the actual cause.
Debug that.
回答3:
It is undefined behaviour. And logically, if behaviour is undefined, you cannot be sure what has happened, and if the program is still operating properly.
回答4:
Some people have pointed out here that this is "undefined behavior". I'm going to go farther and say that on some implementations, this will either crash your program or cause data corruption. It has to do with how "malloc" and "free" are implemented.
One possible way to implement malloc/free is to put a small header before each allocated region. On a malloc'd region, that header would contain the size of the region. When the region is freed, that header is checked and the region is added to the appropriate freelist. If this happens to you, this is bad news. For example, if you free an object allocated on the stack, suddenly part of the stack is in the freelist. Then malloc might return that region in response to a future call, and you'll scribble data all over your stack. Another possibility is that you free a string constant. If that string constant is in read-only memory (it often is), this hypothetical implementation would cause a segfault and crash either after a later malloc or when free adds the object to its freelist.
This is a hypothetical implementation I am talking about, but you can use your imagination to see how it could go very, very wrong. Some implementations are very robust and are not vulnerable to this precise type of user error. Some implementations even allow you to set environment variables to diagnose these types of errors. Valgrind and other tools will also detect these errors.
回答5:
Strictly speaking, this is not true. calloc() and realloc() are valid object sources for free(), too. ;)
回答6:
It would certainly be possible for an implementation of malloc
/free
to keep a list of the memory blocks thats been allocated and in the case the user tries to free a block that isn't in this list do nothing.
However since the standard says that this isn't a requirement most implementation will treat all pointers coming into free as valid.
回答7:
Please have a look at what undefined behavior means. malloc()
and free()
on a conforming hosted C implementation are built to standards. The standards say the behavior of calling free()
on a heap block that was not returned by malloc()
(or something wrapping it, e.g. calloc()
) is undefined.
This means, it can do whatever you want it to do, provided that you make the necessary modifications to free()
on your own. You won't break the standard by making the behavior of free()
on blocks not allocated by malloc()
consistent and even possibly useful.
In fact, there could be platforms that (themselves) define this behavior. I don't know of any, but there could be some. There are several garbage collecting / logging malloc() implementations that might let it fail more gracefully while logging the event. But thats implementation , not standards defined behavior.
Undefined simply means don't count on any kind of consistent behavior unless you implement it yourself without breaking any defined behavior. Finally, implementation defined does not always mean defined by the host system. Many programs link against (and ship) uclibc. In that case, the implementation is self contained, consistent and portable.