I can't sleep! :)
I have a reasonably large project on Windows and encountered some heap corruption issues. I have read everything I could find on SO, including this nice topic: How to debug heap corruption errors?, but nothing helped me out of the box. Debug CRT and BoundsChecker detected heap corruptions, but the addresses were always different and the detection points were always far away from the actual memory overwrites. I stayed up until the middle of the night and crafted the following hack:
    #include <windows.h>

    DWORD PageSize = 0;

    inline void SetPageSize()
    {
        if ( !PageSize )
        {
            SYSTEM_INFO sysInfo;
            GetSystemInfo(&sysInfo);
            PageSize = sysInfo.dwPageSize;
        }
    }

    void* operator new (size_t nSize)
    {
        SetPageSize();
        // round the request up to whole pages
        size_t Extra = nSize % PageSize;
        nSize = nSize + ( PageSize - Extra );
        return VirtualAlloc( 0, nSize, MEM_COMMIT, PAGE_READWRITE);
    }

    void operator delete (void* pPtr)
    {
        MEMORY_BASIC_INFORMATION mbi;
        VirtualQuery(pPtr, &mbi, sizeof(mbi));
        // leave pages in reserved state, but free the physical memory
        VirtualFree(pPtr, 0, MEM_DECOMMIT);
        DWORD OldProtect;
        // protect the address space, so no one can access those pages
        VirtualProtect(pPtr, mbi.RegionSize, PAGE_NOACCESS, &OldProtect);
    }
Some heap corruption errors became obvious and I was able to fix them. There were no more Debug CRT warnings on exit. However, I have some questions regarding this hack:
1. Can it produce any false positives?
2. Can it miss some of the heap corruptions? (even if we replace malloc/realloc/free?)
3. It fails on 32-bit with OUT_OF_MEMORY and runs only on 64-bit. Am I right that we simply run out of virtual address space on 32-bit?
This won't catch:
Ideally, you should write a well-known bit pattern before and after your allocated blocks, so that operator delete can check whether they were overwritten (indicating a buffer over- or under-run).
Currently this would be allowed silently in your scheme, and switching back to malloc etc. would allow it to silently damage the heap and show up as an error later on (e.g. when freeing the block after the overrun one).
You can't catch everything, though: note for example that if the underlying problem is a (valid) pointer somewhere getting overwritten with garbage, you can't detect this until the damaged pointer is dereferenced.
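The bit-pattern idea can be sketched roughly as follows. This is a minimal illustration layered on malloc/free rather than on the question's VirtualAlloc scheme; the names guarded_alloc, canaries_intact and guarded_free, the 0xA5 pattern and the 16-byte canary size are all assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical layout: [header: size][front canary][payload][back canary]
constexpr std::size_t kHeaderSize = 16;       // holds the payload size, keeps alignment
constexpr std::size_t kCanarySize = 16;       // bytes of pattern on each side
constexpr unsigned char kCanaryByte = 0xA5;   // arbitrary, easy to spot in a debugger
static_assert(sizeof(std::size_t) <= kHeaderSize, "header must hold the size");

void* guarded_alloc(std::size_t size) {
    const std::size_t overhead = kHeaderSize + 2 * kCanarySize;
    if (size > SIZE_MAX - overhead) return nullptr;
    unsigned char* raw = static_cast<unsigned char*>(std::malloc(size + overhead));
    if (!raw) return nullptr;
    std::memcpy(raw, &size, sizeof size);                      // remember payload size
    std::memset(raw + kHeaderSize, kCanaryByte, kCanarySize);  // front canary
    unsigned char* user = raw + kHeaderSize + kCanarySize;
    std::memset(user + size, kCanaryByte, kCanarySize);        // back canary
    return user;
}

// Returns false if either pattern was overwritten (under- or overrun).
bool canaries_intact(const void* p) {
    const unsigned char* user = static_cast<const unsigned char*>(p);
    const unsigned char* front = user - kCanarySize;
    std::size_t size = 0;
    std::memcpy(&size, front - kHeaderSize, sizeof size);
    const unsigned char* back = user + size;
    for (std::size_t i = 0; i < kCanarySize; ++i) {
        if (front[i] != kCanaryByte || back[i] != kCanaryByte) return false;
    }
    return true;
}

// Checks the canaries, then releases the block; the result reports the check.
bool guarded_free(void* p) {
    if (!p) return true;
    const bool ok = canaries_intact(p);
    std::free(static_cast<unsigned char*>(p) - kHeaderSize - kCanarySize);
    return ok;
}
```

A real operator delete would more likely trigger a debug break than return a flag, and a write that skips past the canary entirely would still go unnoticed.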
So, this will only catch bugs of the class "use after free()". For that purpose, I think, it's reasonably good.
If you try to delete something that wasn't new'ed, that's a different type of bug. In delete you should first check whether the memory was indeed allocated. You shouldn't be blindly freeing the memory and marking it as inaccessible. I'd try to avoid that and report (by, say, doing a debug break) when there's an attempt to delete something that shouldn't be deleted because it was never new'ed.
Obviously, this won't catch all corruptions of heap data between new and the respective delete. It will only catch those attempted after delete. E.g.:
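As a hypothetical illustration of such a miss: the question's operator new rounds every request up to whole pages, so the bytes between the end of the object and the end of its last page stay committed and writable. The helper name writable_slack is an assumption; it only mirrors the rounding in the question's code.

```cpp
#include <cstddef>

// Bytes past the end of the object that remain writable without a fault,
// using the same rounding as the question's operator new
// (note: a full extra page when size is already a multiple of the page size).
constexpr std::size_t writable_slack(std::size_t size, std::size_t page) {
    return page - size % page;
}

// With a 4096-byte page, a 100-byte allocation is followed by 3996 writable bytes:
//
//   char* buf = new char[100];
//   buf[100] = 'x';   // overrun before delete, same committed page: not detected
//   delete[] buf;
//   buf[0] = 'y';     // access after delete: this is what the hack does catch
```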
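Typically you have about 2 GB of virtual address space available on 32-bit Windows. That's good for at most ~524288 new's like in the provided code (2 GB divided into 4 KB pages). In practice the limit is likely much lower, because VirtualAlloc places each new region at a multiple of the allocation granularity (typically 64 KB), which would leave room for only about 32768 regions. With objects bigger than 4 KB, you'll be able to successfully allocate fewer instances than that. And then address space fragmentation will reduce that number further.
It's a perfectly expected outcome if you create many object instances during the life cycle of your program.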
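The arithmetic behind that estimate, as a small sketch. The 2 GB figure and the 4 KB page size come from the text above; the 64 KB stride is an assumption based on the usual Windows allocation granularity (the value GetSystemInfo reports in dwAllocationGranularity), and max_allocations is a hypothetical helper.

```cpp
#include <cstdint>

// User-mode address space assumed for a 32-bit process: 2 GB.
constexpr std::uint64_t kUserAddressSpace = 2ull << 30;

// Upper bound on the number of never-released regions when each one
// consumes `stride` bytes of address space.
constexpr std::uint64_t max_allocations(std::uint64_t space, std::uint64_t stride) {
    return space / stride;
}

// One 4 KB page per allocation: the figure quoted in the text.
static_assert(max_allocations(kUserAddressSpace, 4096) == 524288, "4 KB stride");

// One 64 KB granule per allocation (assumed typical granularity).
static_assert(max_allocations(kUserAddressSpace, 65536) == 32768, "64 KB stride");
```

Since the question's operator delete only decommits and never releases the address range, every allocation ever made counts against this bound, not just the live ones.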
Yes, your current answer can miss heap corruptions caused by buffer under- and overruns.
Your delete() function is pretty good!
I implemented a new() function in a similar manner that adds guard pages for both under- and overruns.
From the GFlags documentation I conclude that it protects only against overruns.
Note that if new() simply returns a pointer right after the underrun guard page, the overrun guard page is likely to sit some distance past the end of the allocated object, so the bytes immediately after the object are NOT guarded.
To compensate for this, one would need to return a pointer such that the object ends immediately before the overrun guard page (in which case an underrun is, in turn, less likely to be detected).
The code below alternates between the two placements on each call of new(). Alternatively, one might modify it to choose with a thread-safe random generator, to avoid any interference with the call pattern of the code calling new().
Considering all this, one should be aware that detecting under- and overruns with the code below is still probabilistic to a degree. This is especially relevant when some objects are allocated only once for the entire duration of the program.
NB! Because new() returns a modified address, the delete() function also had to be adjusted a bit, so it now uses mbi.AllocationBase instead of ptr for VirtualFree() and VirtualProtect().
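The two placements can be sketched as plain layout arithmetic. This is not the allocator itself, only a hypothetical helper (GuardedLayout and guarded_layout are names invented here) that computes where the user pointer would sit inside a region shaped like guard page, payload pages, guard page. The alignment handling is an added assumption: keeping the returned pointer aligned can leave a few unguarded bytes before the trailing guard page.

```cpp
#include <cstddef>

struct GuardedLayout {
    std::size_t total_bytes;     // leading guard + payload pages + trailing guard
    std::size_t payload_offset;  // start of the first accessible page
    std::size_t user_offset;     // offset of the pointer handed to the caller
};

constexpr std::size_t round_up(std::size_t n, std::size_t multiple) {
    return (n + multiple - 1) / multiple * multiple;
}

// against_overrun == false: object starts right after the leading guard page.
// against_overrun == true:  object ends at (or just before) the trailing guard page.
constexpr GuardedLayout guarded_layout(std::size_t size, std::size_t page,
                                       bool against_overrun,
                                       std::size_t align = alignof(std::max_align_t)) {
    const std::size_t payload = round_up(size == 0 ? 1 : size, page);
    std::size_t user = page;
    if (against_overrun) {
        user = page + payload - size;  // push the object up against the guard
        user -= user % align;          // keep alignment; may leave a gap < align
    }
    return GuardedLayout{page + payload + page, page, user};
}
```

An allocator along these lines would alternate the against_overrun flag per call, and its delete would have to recover the region base (for instance through mbi.AllocationBase, as noted above), because user_offset is no longer the start of the region.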
PS. Driver Verifier's Special Pool uses similar tricks.