Segmentation fault handling

2019-01-07 15:20发布

问题:

I have an application which I use to catch any segmentation fault or ctrl-c. Using the below code, I am able to catch the segmentation fault but the handler is being called again and again. How can I stop them. For your information, I don't want to exit my application. I just can take care to free all the corrupted buffers.

Is it possible?

void SignalInit(void )
{

struct sigaction sigIntHandler;

sigIntHandler.sa_handler = mysighandler;
sigemptyset(&sigIntHandler.sa_mask);
sigIntHandler.sa_flags = 0;
sigaction(SIGINT, &sigIntHandler, NULL);
sigaction(SIGSEGV, &sigIntHandler, NULL);

}

and handler goes like this.

void mysighandler()
{
MyfreeBuffers(); /*related to my applciation*/
}

Here for Segmentation fault signal, handler is being called multiple times and as obvious MyfreeBuffers() gives me errors for freeing already freed memory. I just want to free only once but still dont want to exit application.

Please help.

回答1:

The default action for things like SIGSEGV is to terminate your process but as you've installed a handler for it, it'll call your handler overriding the default behavior. But the problem is segfaulting instruction may be retried after your handler finishes and if you haven't taken measures to fix the first seg fault, the retried instruction will again fault and it goes on and on.

So first spot the instruction that resulted in SIGSEGV and try to fix it (you can call something like backtrace() in the handler and see for yourself what went wrong)

Also, the POSIX standard says that,

The behavior of a process is undefined after it returns normally from a signal-catching function for a [XSI] SIGBUS, SIGFPE, SIGILL, or SIGSEGV signal that was not generated by kill(), [RTS] sigqueue(), or raise().

So, the ideal thing to do is to fix your segfault in the first place. Handler for segfault is not meant to bypass the underlying error condition

So the best suggestion would be- Don't catch the SIGSEGV. Let it dump core. Analyze the core. Fix the invalid memory reference and there you go!



回答2:

If the SIGSEGV fires again, the obvious conclusion is that the call to MyfreeBuffers(); has not fixed the underlying problem (and if that function really does only free() some allocated memory, I'm not sure why you would think it would).

Roughly, a SIGSEGV fires when an attempt is made to access an inaccessible memory address. If you are not going to exit the application, you need to either make that memory address accessible, or change the execution path with longjmp().



回答3:

I do not agree at all with the statement "Don't catch the SIGSEGV".

That's a pretty good pratice to deal with unexpected conditions. And that's much cleaner to cope with NULL pointers (as given by malloc failures) with signal mechanism associated to setjmp/longjmp, than to distribute error condition management all along your code.

Note however that if you use ''sigaction'' on SEGV, you must not forget to say SA_NODEFER in sa_flags - or find another way to deal with the fact SEGV will trigger your handler just once.

#include <setjmp.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>

static void do_segv()
{
  int *segv;

  segv = 0; /* malloc(a_huge_amount); */

  *segv = 1;
}

sigjmp_buf point;

static void handler(int sig, siginfo_t *dont_care, void *dont_care_either)
{
   longjmp(point, 1);
}

int main()
{
  struct sigaction sa;

  memset(&sa, 0, sizeof(sigaction));
  sigemptyset(&sa.sa_mask);

  sa.sa_flags     = SA_NODEFER;
  sa.sa_sigaction = handler;

  sigaction(SIGSEGV, &sa, NULL); /* ignore whether it works or not */ 

  if (setjmp(point) == 0)
   do_segv();

  else
    fprintf(stderr, "rather unexpected error\n");

  return 0;
}


回答4:

You shouldn't try to continue after SIG_SEGV. It basically means that the environment of your application is corrupted in some way. It could be that you have just dereferenced a null pointer, or it could be that some bug has caused your program to corrupt its stack or the heap or some pointer variable, you just don't know. The only safe thing to do is terminate the program.

It's perfectly legitimate to handle control-C. Lots of applications do it, but you have to be really careful exactly what you do in your signal handler. You can't call any function that's not re-entrant. So that means if your MyFreeBuffers() calls the stdlib free() function, you are probably screwed. If the user hits control-C while the program is in the middle of malloc() or free() and thus half way through manipulating the data structures they use to track heap allocations, you will almost certainly corrupt the heap if you call malloc() or free() in the signal handler.

About the only safe thing you can do in a signal handler is set a flag to say you caught the signal. Your app can then poll the flag at intervals to decide if it needs to perform some action.



回答5:

Well you could set a state variable and only free memory if its not set. The signal handler will be called everytime, you can't control that AFAIK.



回答6:

I can see at case for recovering from a SIG_SEGV, if your handling events in a loop and one of these events causes a Segmentation Violation then you would only want to skip over this event, continue processing the remaining events. In my eyes SIG_SEGV is similar to the NullPointerException in Java. Yes the state will be inconsistent and unknown after either of these, however in some cases you would like to handle the situation and carry on. For instance in Algo trading you would pause the execution of an order and allow a trader to manually take over, with out crashing the entire system and ruining all other orders.



回答7:

Looks like at least under Linux using the trick with -fnon-call-exceptions option can be the solution. It will give an ability to convert the signal to general C++ exception and handle it by general way. Look the linux3/gcc46: "-fnon-call-exceptions", which signals are trapping instructions? for example.