I am debugging a (native) multi-threaded C++ application under Visual Studio 2008. On seemingly random occasions, I get a "Windows has triggered a break point..." error with a note that this might be due to a corruption in the heap. These errors won't always crash the application right away, although it is likely to crash short after.
The big problem with these errors is that they pop up only after the corruption has actually taken place, which makes them very hard to track and debug, especially on a multi-threaded application.
What sort of things can cause these errors?
How do I debug them?
Tips, tools, methods, enlightments... are welcome.
You can use VC CRT Heap-Check macros for _CrtSetDbgFlag: _CRTDBG_CHECK_ALWAYS_DF or _CRTDBG_CHECK_EVERY_16_DF.._CRTDBG_CHECK_EVERY_1024_DF.
In addition to looking for tools, consider looking for a likely culprit. Is there any component you're using, perhaps not written by you, which may not have been designed and tested to run in a multithreaded environment? Or simply one which you do not know has run in such an environment.
The last time it happened to me, it was a native package which had been successfully used from batch jobs for years. But it was the first time at this company that it had been used from a .NET web service (which is multithreaded). That was it - they had lied about the code being thread safe.
I had a similar problem - and it popped up quite randomly. Perhaps something was corrupt in the build files, but I ended up fixing it by cleaning the project first then rebuilding.
So in addition to the other responses given:
What sort of things can cause these errors? Something corrupt in the build file.
How do I debug them? Cleaning the project and rebuilding. If it's fixed, this was likely the problem.
One quick tip, that I got from Detecting access to freed memory is this:
Application Verifier combined with Debugging Tools for Windows is an amazing setup. You can get both as a part of the Windows Driver Kit or the lighter Windows SDK. (Found out about Application Verifier when researching an earlier question about a heap corruption issue.) I've used BoundsChecker and Insure++ (mentioned in other answers) in the past too, although I was surprised how much functionality was in Application Verifier.
Electric Fence (aka "efence"), dmalloc, valgrind, and so forth are all worth mentioning, but most of these are much easier to get running under *nix than Windows. Valgrind is ridiculously flexible: I've debugged large server software with many heap issues using it.
When all else fails, you can provide your own global operator new/delete and malloc/calloc/realloc overloads -- how to do so will vary a bit depending on compiler and platform -- and this will be a bit of an investment -- but it may pay off over the long run. The desirable feature list should look familiar from dmalloc and electricfence, and the surprisingly excellent book Writing Solid Code:
Note that in our local homebrew system (for an embedded target) we keep the tracking separate from most of the other stuff, because the run-time overhead is much higher.
If you're interested in more reasons to overload these allocation functions/operators, take a look at my answer to "Any reason to overload global operator new and delete?"; shameless self-promotion aside, it lists other techniques that are helpful in tracking heap corruption errors, as well as other applicable tools.
To really slow things down and perform a lot of runtime checking, try adding the following at the top of your
main()
or equivalent in Microsoft Visual Studio C++