This is a continuation of How to prevent SIGINT in child process from propagating to and killing parent process?
In the above question, I learned that SIGINT wasn't being bubbled up from child to parent, but rather, is issued to the entire foreground process group, meaning I needed to write a signal handler to prevent the parent from exiting when I hit CTRL + C.
I tried to implement this, but here's the problem. Regarding specifically the kill syscall I invoke to terminate the child, if I pass in SIGKILL, everything works as expected, but if I pass in SIGTERM, it also terminates the parent process, showing Terminated: 15 in the shell prompt later.
Even though SIGKILL works, I want to use SIGTERM because, from what I've read, it is the better idea in general: it gives the process being signaled a chance to clean up after itself.
The below code is a stripped down example of what I came up with:
#include <stdio.h>
#include <signal.h>
#include <stdlib.h>
#include <unistd.h>

pid_t CHILD = 0;

void handle_sigint(int s) {
    (void)s;
    if (CHILD != 0) {
        kill(CHILD, SIGTERM); // <-- SIGKILL works, but SIGTERM kills parent
        CHILD = 0;
    }
}

int main() {
    // Set up signal handling
    char str[2];
    struct sigaction sa = {
        .sa_flags = SA_RESTART,
        .sa_handler = handle_sigint
    };
    sigaction(SIGINT, &sa, NULL);

    for (;;) {
        printf("1) Open SQLite\n"
               "2) Quit\n"
               "-> "
        );
        scanf("%1s", str);
        if (str[0] == '1') {
            CHILD = fork();
            if (CHILD == 0) {
                execlp("sqlite3", "sqlite3", NULL);
                printf("exec failed\n");
            } else {
                wait(NULL);
                printf("Hi\n");
            }
        } else if (str[0] == '2') {
            break;
        } else {
            printf("Invalid!\n");
        }
    }
}
My educated guess as to why this is happening is that something intercepts the SIGTERM and kills the entire process group, whereas SIGKILL can't be intercepted, so my kill call works as expected. That's just a stab in the dark though.
Could someone explain why this is happening?
As a side note, I'm not thrilled with my handle_sigint function. Is there a more standard way of killing an interactive child process?
You have too many bugs in your code (from not clearing the signal mask on the struct sigaction) for anyone to explain the effects you are seeing.

Instead, consider the following working example code, say example.c:

Compile it using e.g.

and run using e.g.
You'll notice that Ctrl+C does not interrupt sqlite3 (but then again, it does not even if you were to run sqlite3 directly); instead, you just see ^C on screen. This is because sqlite3 sets up the terminal in such a way that Ctrl+C does not cause a signal, and is just interpreted as normal input.

You can exit from sqlite3 using the .quit command, or by pressing Ctrl+D at the start of a line.

You'll see that the original program will output a Command ... [] line afterwards, before returning you to the command line. Thus, the parent process is not killed/harmed/bothered by the signals.

You can use ps f to look at a tree of your terminal processes, and that way find out the PIDs of the parent and child processes, and send signals to either one to observe what happens.

Note that because the SIGSTOP signal cannot be caught, blocked, or ignored, it would be nontrivial to reflect the job control signals (as in when you use Ctrl+Z). For proper job control, the parent process would need to set up a new session and a process group, and temporarily detach from the terminal. That too is quite possible, but a bit beyond the scope here, as it involves quite detailed behaviour of sessions, process groups, and terminals, to manage correctly.

Let's deconstruct the above example program.
The example program itself first installs some signal reflectors, then forks a child process, and that child process executes the command sqlite3. (You can specify any executable and any parameter strings to the program.)

The internal_child_pid variable, and the set_child_pid() and get_child_pid() functions, are used to manage the child process atomically. The __atomic_store_n() and __atomic_load_n() are compiler-provided built-ins; for GCC, see the GCC manual for details. They avoid the problem of a signal occurring while the child pid is only partially assigned. On some common architectures this cannot occur, but this is intended as a careful example, so atomic accesses are used to ensure only a complete (old or new) value is ever seen. We could avoid using these completely, if we blocked the related signals temporarily during the transition instead. Again, I decided the atomic accesses are simpler, and might be interesting to see in practice.

The forward_handler() function obtains the child process PID atomically, then verifies it is nonzero (that we know we have a child process), and that we are not forwarding a signal sent by the child process (just to ensure we don't cause a signal storm, the two bombarding each other with signals). The various fields in the siginfo_t structure are listed in the man 2 sigaction man page.

The forward_signal() function installs the above handler for the specified signal signum. Note that we first use memset() to clear the entire structure to zeros. Clearing it this way ensures future compatibility, if some of the padding in the structure is converted to data fields.

The .sa_mask field in the struct sigaction is an unordered set of signals. The signals set in the mask are blocked from delivery in the thread that is executing the signal handler. (For the above example program, we can safely say that these signals are blocked while the signal handler is run; it's just that in multithreaded programs, the signals are only blocked in the specific thread that is used to run the handler.)

It is important to use
sigemptyset(&act.sa_mask) to clear the signal mask. Simply setting the structure to zero does not suffice, even if it works (probably) in practice on many machines. (I don't know; I haven't even checked. I prefer robust and reliable over lazy and fragile any day!)

The flags used include SA_SIGINFO because the handler uses the three-argument form (and uses the si_pid field of the siginfo_t). The SA_RESTART flag is only there because the OP wished to use it; it simply means that if possible, the C library and the kernel try to avoid returning an errno == EINTR error if a signal is delivered using a thread currently blocking in a syscall (like wait()). You can remove the SA_RESTART flag, and add a debugging fprintf(stderr, "Hey!\n"); in a suitable place in the loop in the parent process, to see what happens then.

The sigaction() function will return 0 if there is no error, or -1 with errno set otherwise. The forward_signal() function returns 0 if the forward_handler was assigned successfully, but a nonzero errno number otherwise. Some do not like this kind of return value (they prefer just returning -1 for an error, rather than the errno value itself), but I've for some unreasonable reason gotten fond of this idiom. Change it if you want, by all means.

Now we get to
main().

If you run the program without parameters, or with a single -h or --help parameter, it'll print a usage summary. Again, doing it this way is just something I'm fond of -- getopt() and getopt_long() are more commonly used to parse command-line options. For this kind of trivial program, I just hardcoded the parameter checks.

In this case, I intentionally left the usage output very short. It would really be much better with an additional paragraph about exactly what the program does. These kinds of texts -- and especially comments in the code (explaining the intent, the idea of what the code should do, rather than describing what the code actually does) -- are very important. It's been well over two decades since the first time I got paid to write code, and I'm still learning how to comment -- describe the intent of -- my code better, so I think the sooner one starts working on that, the better.
The fork() part ought to be familiar. If it returns -1, the fork failed (probably due to limits or some such), and it is a very good idea to print out the errno message then. The return value will be 0 in the child, and the child process ID in the parent process.

The execvp() function takes two arguments: the name of the binary file (the directories specified in the PATH environment variable will be used to search for such a binary), as well as an array of pointers to the arguments to that binary. The first argument will be argv[0] in the new binary, i.e. the command name itself.

The execvp(argv[1], argv + 1); call is actually quite simple to parse, if you compare it to the above description. argv[1] names the binary to be executed. argv + 1 is basically equivalent to (char **)(&argv[1]), i.e. it is an array of pointers that starts with argv[1] instead of argv[0]. Once again, I'm simply fond of the execvp(argv[n], argv + n) idiom, because it allows one to execute another command specified on the command line without having to worry about parsing a command line, or executing it through a shell (which is sometimes downright undesirable).

The
man 7 signal man page explains what happens to signal handlers at fork() and exec(). In short, the signal handlers are inherited over a fork(), but reset to defaults at exec(). Which is, fortunately, exactly what we want, here.

If we were to fork first, and then install the signal handlers, we'd have a window during which the child process already exists, but the parent still has default dispositions (mostly termination) for the signals.
Instead, we could just block these signals using e.g. sigprocmask() in the parent process before forking. Blocking a signal means it is made to "wait"; it will not be delivered until the signal is unblocked. In the child process, the original mask should be restored before the exec(): the signal dispositions are reset to defaults over an exec(), but the set of blocked signals is inherited by the new program. In the parent process, we could then -- or before forking, it does not matter -- install the signal handlers, and finally unblock the signals. This way we would not need the atomic stuff, nor even check if the child pid is zero, since the child pid will be set to its actual value well before any signal can be delivered!

The while loop is basically just a loop around the waitpid() call, until the exact child process we started exits, or something funny happens (the child process vanishes somehow). This loop contains pretty careful error checking, as well as the correct EINTR handling if the signal handlers were to be installed without the SA_RESTART flags.

If the child process we forked exits, we check the exit status and/or reason it died, and print a diagnostic message to standard error.
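The blocking alternative and the wait loop described above can be sketched together as follows. The function names and the choice of SIGINT, SIGTERM and SIGHUP as the forwarded signals are assumptions made for illustration. Because the signal mask is inherited across exec, the child would call sigprocmask(SIG_SETMASK, &old, NULL) with the saved mask before the exec, and the parent would do the same once the child PID is recorded and the handlers are installed.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Block the forwarded signals (an illustrative set), saving the previous
   mask in *old. Returns 0 on success, or an errno value on failure. */
int block_forwarded(sigset_t *old)
{
    sigset_t set;

    sigemptyset(&set);
    sigaddset(&set, SIGINT);
    sigaddset(&set, SIGTERM);
    sigaddset(&set, SIGHUP);

    return (sigprocmask(SIG_BLOCK, &set, old) == -1) ? errno : 0;
}

/* Wait until exactly `child` terminates. Returns 0 with *status filled in,
   or an errno value if waitpid() fails for a reason other than EINTR. */
int wait_for_child(pid_t child, int *status)
{
    for (;;) {
        const pid_t p = waitpid(child, status, 0);
        if (p == child)
            return 0;
        if (p == -1 && errno != EINTR)
            return errno;               /* e.g. ECHILD: the child vanished */
        /* EINTR: a handler installed without SA_RESTART ran; retry. */
    }
}

/* Print a diagnostic to standard error describing how the child ended. */
void report_status(const char *name, int status)
{
    if (WIFEXITED(status))
        fprintf(stderr, "%s: exited with status %d.\n", name, WEXITSTATUS(status));
    else if (WIFSIGNALED(status))
        fprintf(stderr, "%s: killed by signal %d.\n", name, WTERMSIG(status));
    else
        fprintf(stderr, "%s: died of unknown causes.\n", name);
}
```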
Finally, the program ends with a horrible hack: instead of returning EXIT_SUCCESS or EXIT_FAILURE, we return the entire status word we obtained with waitpid() when the child process exited. The reason I left this in is that it is sometimes used in practice, when you want to return the same, or as similar as possible, exit status code as the child process returned. So, it's for illustration. If you ever find yourself in a situation where your program should return the same exit status as a child process it forked and executed, this is still better than setting up machinery to have the process kill itself with the same signal that killed the child process. Just put a prominent comment there if you ever need to use this, and a note in the installation instructions so that those who compile the program on architectures where that might be unwanted can fix it.
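Where returning the raw status word is unwanted, one alternative is to translate it into a small exit code first. The sketch below is not part of the original example: the function name is hypothetical, and the 128 + signal number convention is borrowed from common shells as an assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/wait.h>

/* Map a waitpid() status word to an exit code suitable for returning
   from main(). The 128 + signal convention mirrors what common shells
   report and is an assumption of this sketch. */
int exit_code_from_status(int status)
{
    if (WIFEXITED(status))
        return WEXITSTATUS(status);     /* child's own exit status */
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);  /* child was killed by a signal */
    return EXIT_FAILURE;                /* anything else: generic failure */
}
```

This loses the distinction between a child that exited with, say, status 137 and one that was killed by signal 9, which is the trade-off against returning the status word unchanged.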