Linux: How to debug a SIGSEGV? How do I trace the

Posted 2020-06-03 06:33

Question:

Firefox started crashing today. I haven't changed anything on the system or in the Firefox configuration.

I used
strace -ff -o dumpfile.txt firefox
to trace the problem, but it wasn't much help.

I can see the segfault in two of the generated per-process dump files, but how can I trace it back to its cause?

Firefox runs for about 10 seconds before crashing, and in that time strace generates 22 MB of data.

This is a snippet of the output, with the actual SIGSEGV in the middle:

read(19, "\372", 1)                     = 1
gettimeofday({1245590019, 542231}, NULL) = 0
read(3, "\6\0[Qmy\26\0\3\1\0\0Y\0\200\2\0\0\0\0\323\3A\0\323\3(\0\20\0\1\0", 4096) = 32
read(3, 0xf5c55058, 4096)               = -1 EAGAIN (Resource temporarily unavailable)
gettimeofday({1245590019, 542813}, NULL) = 0
poll([{fd=4, events=POLLIN}, {fd=3, events=POLLIN}, {fd=8, events=POLLIN|POLLPRI}, {fd=12, events=POLLIN|POLLPRI}, {fd=13, events=POLLIN|POLLPRI}, {fd=14, events=POL
read(3, 0xf5c55058, 4096)               = -1 EAGAIN (Resource temporarily unavailable)
gettimeofday({1245590019, 543161}, NULL) = 0
gettimeofday({1245590019, 546672}, NULL) = 0
gettimeofday({1245590019, 546761}, NULL) = 0
read(3, 0xf5c55058, 4096)               = -1 EAGAIN (Resource temporarily unavailable)
gettimeofday({1245590019, 546936}, NULL) = 0
poll([{fd=4, events=POLLIN}, {fd=3, events=POLLIN}, {fd=8, events=POLLIN|POLLPRI}, {fd=12, events=POLLIN|POLLPRI}, {fd=13, events=POLLIN|POLLPRI}, {fd=14, events=POL
poll([{fd=3, events=POLLIN|POLLOUT}], 1, 4294967295) = 1 ([{fd=3, revents=POLLOUT}])
writev(3, [{"5\30\4\0006\21\200\2\266\n\200\2\17\0]\3\230\4\5\0007\21\200\0026\21\200\2\317\0\0\0"..., 1624}, {NULL, 0}, {"", 0}], 3) = 1624
poll([{fd=3, events=POLLIN}], 1, 4294967295) = 1 ([{fd=3, revents=POLLIN}])
read(3, "\1\30\224Q\17\17\0\0\0\0\0\0\0\0\0\0000\235\273\0\0\0\0\0o\264Q\0\0\0\0\0"..., 4096) = 4096
read(3, "\375\240f\0\376\242j\0\377\261\200\0\271a+\0\271a+\0\377\261\200\0\376\252w\0\376\250s\0"..., 11356) = 11356
read(3, 0xf5c55058, 4096)               = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=3, events=POLLIN|POLLOUT}], 1, 4294967295) = 1 ([{fd=3, revents=POLLOUT}])
writev(3, [{"\230\32\7\0\1\21\200\2?\21\200\2\377\377\377\377\377\377\377\377\0\0\0\0\17\0\1\0015\10\4\0"..., 956}, {NULL, 0}, {"", 0}], 3) = 956
poll([{fd=3, events=POLLIN}], 1, 4294967295) = 1 ([{fd=3, revents=POLLIN}])
read(3, "\1\30\256Q\17\17\0\0\0\0\0\0\0\0\0\0000\235\273\0\0\0\0\0o\264Q\0\0\0\0\0"..., 4096) = 4096
read(3, "\375\240f\0\376\242j\0\377\261\200\0\271a+\0\271a+\0\377\261\200\0\376\252w\0\376\250s\0"..., 11356) = 11356
read(3, 0xf5c55058, 4096)               = -1 EAGAIN (Resource temporarily unavailable)
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
unlink("/home/userrrr/.mozilla/firefox/mvbnkitl.default/lock") = 0
rt_sigaction(SIGSEGV, {SIG_DFL, ~[HUP INT QUIT ABRT BUS FPE KILL PIPE CHLD CONT TTOU URG XCPU WINCH RT_1 RT_2 RT_3 RT_4 RT_8 RT_11 RT_14 RT_17 RT_22], SA_NOCLDSTOP},
rt_sigprocmask(SIG_BLOCK, ~[ILL ABRT BUS FPE SEGV RTMIN RT_1], ~[KILL STOP RTMIN RT_1], 8) = 0
open("/home/userrrr/.mozilla/firefox/mvbnkitl.default/minidumps/56b30367-5ee2-0495-32646b7f-59dc87e9.dmp", O_WRONLY|O_CREAT|O_EXCL, 0600) = 63
clone(child_stack=0xf5bfffe4, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_UNTRACED) = 18929
waitpid(18929, NULL, __WALL) = 18929
open("/proc/18913/task", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY|O_CLOEXEC) = 64
fstat64(64, {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
getdents64(64, /* 12 entries */, 1024)  = 368
ptrace(PTRACE_DETACH, 18913, 0, SIG_0)  = -1 ESRCH (No such process)
close(64)                               = 0
ftruncate(63, 91256)                    = 0
close(63)                               = 0
rt_sigprocmask(SIG_SETMASK, ~[KILL STOP RTMIN RT_1], ~[KILL STOP RTMIN RT_1], 8) = 0
time(NULL)                              = 1245590020
open("/home/userrrr/.mozilla/firefox/Crash Reports/LastCrash", O_WRONLY|O_CREAT|O_TRUNC, 0600) = 63
write(63, "1245590020", 10)             = 10

Answer 1:

Ivan, your real question is "how do I debug a SIGSEGV?"

strace is rarely much help here. SIGSEGV means that the application tried to dereference (access) a memory location that has not been allocated (or that it is not allowed to access for some other reason). Chances are high that the crash is unrelated to the system call activity that strace captures. To find the cause of your crash, start by working out which address is being dereferenced and which function is doing it. A debugger is the right tool for this task.
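To make the definition above concrete, here is a minimal sketch (not part of the original answer) that triggers a SIGSEGV on purpose and shows how the parent process sees it. It assumes Linux and a standard CPython with ctypes. The name crash_child is made up for illustration.

```python
import signal
import subprocess
import sys

# Child program: read a C string starting at address 0. That address is
# normally not mapped in a Linux process, so the kernel delivers SIGSEGV.
CHILD_CODE = "import ctypes; ctypes.string_at(0)"


def crash_child():
    """Run the faulting child and return its exit status as seen by the parent."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


if __name__ == "__main__":
    status = crash_child()
    # subprocess reports death by signal as the negated signal number.
    if status < 0:
        print("child killed by", signal.Signals(-status).name)
    else:
        print("child exited normally with status", status)
```

Note that the child makes no system call at the moment of the fault, which is why a system call trace shows the signal arriving but not what caused it.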

Here's what you need to do:

 gdb <your_app_name> <your_coredump_file>

Once in gdb, examine the last executed instruction and use "info registers" to see the address in question. The "bt" command shows the call stack. Walking up the call stack shows how the incorrect address was calculated. One of the steps in that calculation is the cause of your problem.
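The gdb steps above can also be run non-interactively. The following is a rough sketch that assumes gdb is installed and that a core file was actually written, which usually requires the core size limit to be raised first (for example with "ulimit -c unlimited" in the launching shell). The function name gdb_batch_command and the script name in the usage text are hypothetical. The gdb options -batch and -ex and the commands bt, info registers and x/i $pc are standard gdb.

```python
import subprocess
import sys


def gdb_batch_command(program, core_file):
    """Build a gdb command line that prints crash details from a core dump, then exits."""
    return [
        "gdb", "-batch",
        "-ex", "bt",              # call stack at the moment of the crash
        "-ex", "info registers",  # register contents at the moment of the crash
        "-ex", "x/i $pc",         # disassemble the instruction at the program counter
        program, core_file,
    ]


if __name__ == "__main__":
    if len(sys.argv) == 3:
        # Run gdb on the given program and core file and show its output.
        subprocess.run(gdb_batch_command(sys.argv[1], sys.argv[2]), check=False)
    else:
        print("usage: python crash_report.py <program> <core_file>")
```

Without debug symbols the backtrace will contain mostly raw addresses, so installing the matching debug symbol packages first, where your distribution provides them, makes the output much more useful.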

Debugging is fun and this is a good opportunity to delve into it. A good book or some online articles can help you there. Google away and good luck!



Answer 2:

You can start Firefox in debug mode with: firefox -d gdb

This will start Firefox inside gdb.

You can then issue the gdb command 'run' and get a traceback when Firefox crashes. The traceback may be hard to read, as Firefox ships with stripped libraries, so it shows only which library and offset the code is in, not the function names.

Another alternative is to start Firefox in safe mode (firefox -safe-mode) and turn off any plugins you have installed until it stops crashing.

The last alternative is to enable Firefox's developer mode and allow it to send the crash session to the Mozilla server. You can then go to the Mozilla site and see the detailed traceback of your failed Firefox session.