Handle SIGSEGV in Linux?

Posted 2019-05-31 19:19

Question:

I need to handle SIGSEGV in my Linux app. The reason is that some cleanup (in a third-party library) must be done before the core dump is generated. What is more, the cleanup must be performed in the context of the calling thread; it cannot be done in the signal handler. So I plan to have the signal handler pass control to the calling thread and, after the cleanup has finished, use raise(SIGSEGV) to generate the core dump.

The real problem seems to be that the signal handler cannot pass control to the calling thread, whether I post a semaphore or use some other mechanism. Any ideas on how to handle this case? Is it possible to hijack SIGSEGV and then, in the SIGSEGV handler, return to another thread to perform the cleanup?

signal(SIGSEGV, signal_handler);

void signal_handler(int sig) { ... sem_post(&sem); ... }

void *calling_thread(void *arg) { sem_wait(&sem); clean_up(); ... }

Answer 1:

You want to clean up after a SIGSEGV (i.e. a serious error)... I find this a little weird because 1) if you are debugging the application, you should leave everything intact to be stored in the core file so you can accurately identify what happened, and 2) if this is a release build for a customer (let's say), well... it shouldn't SIGSEGV :) (not my problem anyway, just saying).

On topic,

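I think you could try blocking SIGSEGV in all threads except the one in which you want to do the cleanup; this should make the OS deliver the signal to that specific thread. (Caveat: this only helps for a SIGSEGV sent to the process as a whole, e.g. with kill(); a SIGSEGV caused by an invalid memory access is generated for the faulting thread itself and is not rerouted by other threads' signal masks.) The other solution I can think of is something along the lines of setjmp() / longjmp() (I haven't tested either of these, though).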

Be aware that once your program has received a SEGV, you're on shaky ground (i.e. your cleanup might fail as well and generate another SEGV, and so on), so you should consider just crashing with a core.



Answer 2:

It's not possible to reliably run any code after you have encountered a SIGSEGV. You might get away with it sometimes, but you can't trust your program to work as intended afterwards. For example, if the SIGSEGV was caused by a corrupted heap, you will have problems if your third-party library cleans up any memory.

For a reliable solution, I would consider whether you really need to run that cleanup code or whether there is another way to handle the situation (checking for an unclean shutdown on the next startup, ...).
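
One common way to detect an unclean shutdown is a marker file that exists only while the program is running. The sketch below is just one possible shape for that: begin_session and end_session are hypothetical names, the marker path is up to the caller, and it assumes only one instance of the program runs at a time.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Call at startup. Returns 0 on a fresh start (marker created now),
 * 1 if a marker from a previous run is still there (that run never
 * reached end_session(), so it did not shut down cleanly), -1 on error.
 */
int begin_session(const char *marker_path)
{
    assert(marker_path != NULL);
    /* O_EXCL makes "check and create" a single atomic step. */
    int fd = open(marker_path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd >= 0) {
        close(fd);
        return 0;
    }
    if (errno == EEXIST)
        return 1;
    return -1;
}

/* Call on orderly shutdown only; a crash skips this and leaves the marker. */
int end_session(const char *marker_path)
{
    assert(marker_path != NULL);
    return unlink(marker_path);
}
```

When begin_session() returns 1, the program can run the third-party cleanup in a healthy process before continuing; the marker stays in place until the next orderly end_session().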



Answer 3:

I think you should not try to run anything that cleans up in-memory state, especially anything that writes it to disk, because if it succeeds, you may cause data corruption.

Recording some state information, if possible, may aid debugging, but you should not rely on being able to.

Instead, the program should, after logging state information, either return to the default handler (and dump core etc), or call _exit and quit without any cleanup whatsoever.

If you need to do cleanup work to start again after a crash, do it on the next startup instead.