I am trying to handle a SIGFPE signal but my program just crashes or runs forever. I HAVE to use signal() and not the other ones like sigaction().
So in my code I have:
#include <stdio.h>
#include <signal.h>

void handler(int signum)
{
    // Do stuff here then return to execution below
}

int main()
{
    signal(SIGFPE, handler);
    int i, j;
    for(i = 0; i < 10; i++)
    {
        // Call signal handler for SIGFPE
        j = i / 0;
    }
    printf("After for loop");
    return 0;
}
Basically, I want to go into the handler every time there is a division by 0. It should do whatever it needs to inside the handler() function then continue the next iteration of the loop.
This should also work for other signals that need to be handled. Any help would be appreciated.
If you have to use signal() to handle SIGFPE, or any other signal that your own code triggers directly by executing a faulting CPU instruction, the behavior is only defined if you either exit the program from the signal handler or use longjmp to get out.
Also note the exact placement of the restore calls: at the end of the computation branch but at the start of the handler branch.
Unfortunately, you can't use signal() like this at all; the second time the signal is delivered, the program falls over. You must use sigaction if you intend to handle the signal more than once.
Caveat: Sorry to rain on the parade, but you really don't want to do this.
It is perfectly valid to trap [externally generated] signals like SIGINT, SIGTERM, SIGHUP etc. to allow graceful cleanup and termination of a program that may have files open that are partially written to.
However, internally generated signals, such as SIGILL, SIGBUS, SIGSEGV and SIGFPE, are very hard to recover from meaningfully. The first three are bugs, pure and simple. And, IMO, SIGFPE is a hard bug as well.
After such a signal, your program is in an unsafe and indeterminate state. Even trapping the signal and doing longjmp/siglongjmp doesn't fix this.
And, there is no way to tell exactly how bad the damage is. Or, how bad the damage will become if the program tries to proceed.
If you get SIGFPE, was it for a floating point calculation [which you might be able to smooth over]? Or, was it for integer divide-by-zero? What calculation was being done? And, where? You don't know.
Trying to continue can sometimes cause 10x the damage because now the program is out of control. After recovery, the program may be okay, but it may not be. So, the reliability of the program after the event cannot be determined with any degree of certainty.
What were the events (i.e. calculations) that led up to the SIGFPE? Maybe it's not merely a single divide, but the chain of calculations that led up to the value being zero. Where did these values go? Will these now suspect values be used by code after the recovery operation has taken place?
For example, the program might overwrite the wrong file because the failed calculation was somehow involved in selecting the file descriptor that a caller is going to use.
Or, you leak memory. Or, corrupt the heap. Or, was the error within the heap allocation code itself?
Consider the following function:
Even with a signal handler that does siglongjmp, the file that myfunc was writing to is now corrupted/truncated. And, the file descriptor won't be closed.
Or, what if myfunc was reading from the file and saving the data to some array? That array is only partially filled. Now, you get SIGFPE. This is intercepted by the signal handler which does siglongjmp.
One of the callers of myfunc does the sigsetjmp to "catch" this. But, what can it do? The caller has no idea how bad things are. It might assume that the buffer myfunc was reading into is fully formed and write it out to a different file. That other file has now become corrupted.
UPDATE:
Oops, forgot to mention undefined behavior ...
Normally, we associate UB, such as writing past the end of an array, with a segfault [SIGSEGV]. But, what if it causes SIGFPE instead?
It's no longer just a "bad calculation": we're trapping [and ignoring] UB at the earliest detection point. If we do recovery, the next usage could be worse.
Here's an example: