I am currently working on a unit testing framework where users can create test cases and register them with the framework.
I would also like to ensure that if any of the user test code causes a crash, it does not crash the entire framework but is flagged as failed instead. To make this work, I wrote the following code so that I can run the user's code within the SandBox function:
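#include <stdbool.h>
#ifdef WIN32
#include <windows.h>   /* EXCEPTION_EXECUTE_HANDLER */
#endif

bool SandBox(void *(*fn)(void *), void *arg, void **rc)
{
#ifdef WIN32
    __try
    {
        void *result = fn(arg);
        if (rc)
            *rc = result;   /* store through rc; assigning to the parameter itself is lost */
        return true;
    }
    __except (EXCEPTION_EXECUTE_HANDLER)
    {
        return false;
    }
#else
    /* POSIX equivalent still missing; this is what the question is about */
    return false;
#endif
}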
This works perfectly on Windows, but I would like my framework to be portable, so I need similar functionality for POSIX environments.
I know C signal handlers can intercept an OS signal, but translating the signal handling mechanism into an SEH-like framework poses challenges that I am unable to solve:
- How to continue execution even when my program receives a signal?
- How to Jump the control of execution from the failed location to a block (similar to except) that I can use for error handling?
- How can I clean-up the resources?
Another possibility I was thinking of is running the user test code on a separate thread with its own signal handler and terminating the thread from the signal handler, but again I am not sure whether this can work.
So before I go any further, I would like to ask the community whether there is a better solution to this problem.
As you said, you could catch SIGSEGV via signal() or sigaction().
Continuing is not really advisable, as this would be undefined behaviour, i.e. your memory might be corrupted, which might let other test cases fail as well (or even terminate your whole process prematurely).
Would it be possible to run the test cases one by one as a sub process? This way, you could check the exit status and will detect if it terminated cleanly, with an error or due to a signal.
Running the test cases in a separate thread will have the same problem: you do not have memory protection between your test cases and the code driving the test cases.
The suggested approach would be:
- fork() to create a child process.
- In the child process, you execve() your test case. This could be the same binary with different arguments to select a certain test case.
- In the parent process, you call waitpid() to wait for the termination of the test case. You receive the pid from the fork() call in the parent process.
- Evaluate the sub-process status with the WIFEXITED, WEXITSTATUS, WIFSIGNALED, WTERMSIG macros.
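The steps above can be sketched roughly as follows. This is a minimal illustration, not a complete runner: the `test_fn` signature, `run_isolated`, and the two sample tests are hypothetical names invented for the example, the child calls the test function directly instead of using execve(), and `raise(SIGSEGV)` stands in for a real crash.

```c
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical test signature: return 0 on success, non-zero on failure. */
typedef int (*test_fn)(void);

/* Runs one test in a child process; true only if the child exited with 0. */
bool run_isolated(test_fn fn)
{
    fflush(NULL);                    /* do not duplicate buffered output in the child */
    pid_t pid = fork();
    if (pid < 0)
        return false;                /* fork failed: count it as a failed test */
    if (pid == 0)
        _exit(fn() == 0 ? 0 : 1);    /* child: run the test, report via exit status */

    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return false;            /* unexpected waitpid error */
    }
    if (WIFSIGNALED(status)) {
        fprintf(stderr, "test killed by signal %d\n", WTERMSIG(status));
        return false;
    }
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}

/* Sample tests (hypothetical). */
int passing_test(void)  { return 0; }
int failing_test(void)  { return 1; }
int crashing_test(void) { raise(SIGSEGV); return 0; } /* stand-in for a crash */
```

Because the crash happens in the child's own address space, the parent only ever sees an exit status, so a corrupted heap in one test cannot affect the next one.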
If you need timeouts for your test cases, you can also install a handler for SIGCHLD. If the timeout elapses first, kill() the child process. Be aware that you may only call certain functions from signal handlers.
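As a hedged sketch of the timeout idea, the variant below deliberately avoids a SIGCHLD handler and instead polls waitpid() with WNOHANG, which sidesteps the restrictions on what may be called from a handler at the cost of a small polling delay. The names `run_with_timeout`, `quick_test`, and `hanging_test`, and the 10 ms polling interval, are assumptions made for the example.

```c
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical test signature: return 0 on success, non-zero on failure. */
typedef int (*test_fn)(void);

/* Runs fn in a child; kills it and reports failure after timeout_ms. */
bool run_with_timeout(test_fn fn, int timeout_ms)
{
    pid_t pid = fork();
    if (pid < 0)
        return false;
    if (pid == 0)
        _exit(fn() == 0 ? 0 : 1);

    const struct timespec tick = { 0, 10 * 1000 * 1000 };  /* 10 ms */
    int status;
    for (int waited = 0; ; waited += 10) {
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid)
            break;                       /* child has terminated */
        if (r < 0 && errno != EINTR)
            return false;                /* unexpected waitpid error */
        if (waited >= timeout_ms) {
            kill(pid, SIGKILL);          /* timeout elapsed first */
            waitpid(pid, &status, 0);    /* reap the child, avoid a zombie */
            return false;
        }
        nanosleep(&tick, NULL);
    }
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}

/* Sample tests (hypothetical). */
int quick_test(void)   { return 0; }
int hanging_test(void) { for (;;) pause(); }
```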
Just a further note: execve() is not really required. You can just proceed and call your specified test case directly.
To complement sstn's answer, on Linux, you could have processor and system specific C code which:
- installs a signal handler using sigaction(2) with SA_SIGINFO
- uses the third argument to that signal handler, which is a (machine specific) ucontext_t* pointer
- analyzes the machine specific context state (i.e. the machine registers in the mcontext_t of that ucontext_t*), see getcontext(3) for details; by "disassembling" the code pointer you will be able to know which operation failed, and you can get the faulting address
- modifies and repairs that machine state, which means changing the process address space by calling mmap(2) and/or modifying some machine registers through that mcontext_t
- returns from your signal handler into a "repaired" state, perhaps at a different instruction address.
This of course is non portable and painful to code and debug. You may need to disable some compiler optimizations, use asm instructions or volatile pointers, etc...
On Debian or Ubuntu see the /usr/include/x86_64-linux-gnu/sys/ucontext.h header file.
IIRC some old version of SML/NJ played such tricks.
Read signal(7) very carefully and study the ABI specification for your processor, e.g. the x86-64 ABI specification.
In practice, you might also use (more easily) siglongjmp(3) from the signal handler. You might also deliberately violate the signal(7) rules. You could use Ian Taylor's (working on GCC at Google) libbacktrace library; it works better if your application and its libraries have debug info (e.g. are compiled with g++ -O1 -g2). See also GNU libc backtrace(3) and dladdr(3).
Handling SIGSEGV is rumored to be not very efficient on Linux. On GNU/Hurd you would use its external pager mechanism.
Another possibility is to run the tested program from the gdb debugger. Recent versions of gdb can be scripted in Python, so you could automate a lot of things. This might be practically the most portable approach (since recent gdb has been ported to many systems).
addenda
Recent (June 2016) 4.6, future, or patched kernels might be able to handle page faults in user space, notably via userfaultfd, but I don't know the details well. See also this question.