I have a multi-threaded C++ program that deadlocks in rare cases. The problem is hard to reproduce, and I can only reproduce it on a remote machine. The approach I want to use to debug it is:
- run the program
- wait for deadlock
- send it an abort signal (SIGABRT) to generate a core dump
- copy the dump back to my local machine
- use gdb to debug it
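Roughly, the steps above look like the sketch below. The names `myprog`, `user@remote-host` and `core` are placeholders, not my real setup, and the actual core file name depends on the remote machine's `core_pattern` setting.

```shell
#!/bin/sh
# Sketch only: PROG, REMOTE and CORE are placeholder names.
PROG=./myprog            # hypothetical statically linked binary
REMOTE=user@remote-host  # hypothetical remote login
CORE=core                # assumed core file name; depends on core_pattern

# Remote machine: allow core dumps, then start the program in the background.
start_program() {
    ulimit -c unlimited
    "$PROG" &
    PROG_PID=$!
}

# Remote machine: once the program looks deadlocked, send SIGABRT
# so the kernel terminates it and writes a core dump.
abort_process() {
    kill -s ABRT "$1"
}

# Local machine: copy the dump back and print every thread's backtrace.
fetch_and_debug() {
    scp "$REMOTE:$CORE" "$CORE"
    gdb -batch -ex "thread apply all bt" "$PROG" "$CORE"
}
```

On the remote machine this would be `start_program`, then `abort_process "$PROG_PID"` after the hang; `fetch_and_debug` runs locally against the same binary.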
I do not have gdb on the remote machine and cannot install anything on it. The problem is that when I debug the core dump (obtained from either a deadlocked or a normally running process on the remote machine), the backtrace of most of the threads shows only:
(gdb) bt
#0  pthread_cond_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:261
#1  0x0000000000000000 in ?? ()
I am using a statically linked binary compiled with "-g -O1". When I abort a process of the same binary on my local machine, gdb can extract the entire stack from the core dump and there is no such problem (although I cannot reproduce the deadlock there). My remote machine runs SLES and my local machine runs Ubuntu.
Any ideas?
Edit:
Found someone else with the same problem, but still with no solution: http://groups.google.com/group/google-coredumper/browse_thread/thread/2ca9bcf9465d1050 (I am not using Google coredumper, but it seems to fail with the same error, which suggests that the problem may lie with SLES 11.)