I have a multithreaded C program, which consistently generates a segmentation fault at a specific point in the program. When I run it with gdb, no fault is shown. Can you think of any reason why the fault might occur only when not using the debugger? It's pretty annoying not being able to use it to find the problem!
问题:
回答1:
Classic Heisenbug. From Wikipedia:
Time can also be a factor in heisenbugs. Executing a program under control of a debugger can change the execution timing of the program as compared to normal execution. Time-sensitive bugs such as race conditions may not reproduce when the program is slowed down by single-stepping source lines in the debugger. This is particularly true when the behavior involves interaction with an entity not under the control of a debugger, such as when debugging network packet processing between two machines and only one is under debugger control.
The debugger may be changing timing, and hiding a race condition.
On Linux, GDB also disables address space randomization, and your crash may be specific to address space layout. Try (gdb) set disable-randomization off
.
Finally, ulimit -c unlimited
and post-mortem debugging (already suggested by Robie) may work.
回答2:
Perhaps when using gdb
memory is mapped in a location which your over/under flow doesn't trample on memory that causes a crash. Or it could be a race condition that is no longer getting tripped. Although it sounds unintuitive, you should be happy your program was nice enough to crash on you.
Some suggestions
- Try a static code analyzer such as the free cppcheck
- Try a malloc() debugger like libefence
- Try running it through valgrind
回答3:
By debugging it you are changing the environment that it is running in. It sounds like you are dealing with some sort of race condition, and by debugging it things are scheduled slightly differently so you don't encounter the issue. That, or things are being stored in a slightly different way so it doesn't occur. Are you able to put some debugging output in the code to assist in figuring out the problem? That may have less of an impact and allow you to find your issue.
回答4:
I have totally had this problem before! It was a race condition, and when I was stepping though the code with a debugger the thread i was in was slow enough to not trigger the race condition. Pretty awful.