I have a multithreaded application that installs a handler for SIGCHLD that logs and reaps the child processes.
The problem I see starts when I'm doing a call to system()
. system()
needs to wait for the child process to end and reaps him itself since it needs the exit code. This is why it calls sigprocmask()
to block SIGCHLD. But in my multithreaded application, the SIGCHLD is still called in a different thread and the child is reaped before system()
has a chance to do so.
Is this a known problem in POSIX?
One way around this I thought of is to block SIGCHLD in all other threads but this is not really realistic in my case since not all threads are directly created by my code.
What other options do I have?
Yes, it's a known (or at least strongly intimated) problem.
Blocking SIGCHLD while waiting for the child to terminate prevents the application from catching the signal and obtaining status from system()'s child process before system() can get the status itself.
....
Note that if the application is catching SIGCHLD signals, it will receive such a signal before a successful system() call returns.
(From the documentation for system()
, emphasis added.)
So, POSIXly you are out of luck, unless your implementation happens to queue SIGCHLD. If it does, you can of course keep a record of pids you forked, and then only reap the ones you were expecting.
Linuxly, too, you are out of luck, as signalfd appears also to collapse multiple SIGCHLDs.
UNIXly, however, you have lots of clever and too-clever techniques available to manage your own children and ignore those of third-party routines. I/O multiplexing of inherited pipes is one alternative to SIGCHLD catching, as is using a small, dedicated "spawn-helper" to do your forking and reaping in a separate process.
Since you have threads you cannot control, I recommend you write a preloaded library to interpose the system()
call (and perhaps also popen()
etc.) with your own implementation. I'd also include your SIGCHLD
handler in the library, too.
If you don't want to run your program via env LD_PRELOAD=libwhatever.so yourprogram
, you can add something like
const char *libs;
libs = getenv("LD_PRELOAD");
if (!libs || !*libs) {
setenv("LD_PRELOAD", "libwhatever.so", 1);
execv(argv[0], argv);
_exit(127);
}
at the start of your program, to have it re-execute itself with LD_PRELOAD appropriately set. (Note that there are quirks to consider if your program is setuid or setgid; see man ld.so
for details. In particular, if libwhatever.so
is not installed in a system library directory, you must specify a full path.)
One possible approach would be to use a lockless array (using atomic built-ins provided by the C compiler) of pending children. Instead of waitpid()
, your system()
implementation allocates one of the entries, sticks the child PID in there, and waits on a semaphore for the child to exit instead of calling waitpid()
.
Here is an example implementation:
#define _GNU_SOURCE
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <signal.h>
#include <semaphore.h>
#include <dlfcn.h>
#include <errno.h>
/* Maximum number of concurrent children waited for.
*/
#define MAX_CHILDS 256
/* Lockless array of child processes waited for.
*/
static pid_t child_pid[MAX_CHILDS] = { 0 }; /* 0 is not a valid PID */
static sem_t child_sem[MAX_CHILDS];
static int child_status[MAX_CHILDS];
/* Helper function: allocate a child process.
* Returns the index, or -1 if all in use.
*/
static inline int child_get(const pid_t pid)
{
int i = MAX_CHILDS;
while (i-->0)
if (__sync_bool_compare_and_swap(&child_pid[i], (pid_t)0, pid)) {
sem_init(&child_sem[i], 0, 0);
return i;
}
return -1;
}
/* Helper function: release a child descriptor.
*/
static inline void child_put(const int i)
{
sem_destroy(&child_sem[i]);
__sync_fetch_and_and(&child_pid[i], (pid_t)0);
}
/* SIGCHLD signal handler.
* Note: Both waitpid() and sem_post() are async-signal safe.
*/
static void sigchld_handler(int signum __attribute__((unused)),
siginfo_t *info __attribute__((unused)),
void *context __attribute__((unused)))
{
pid_t p;
int status, i;
while (1) {
p = waitpid((pid_t)-1, &status, WNOHANG);
if (p == (pid_t)0 || p == (pid_t)-1)
break;
i = MAX_CHILDS;
while (i-->0)
if (p == __sync_fetch_and_or(&child_pid[i], (pid_t)0)) {
child_status[i] = status;
sem_post(&child_sem[i]);
break;
}
/* Log p and status? */
}
}
/* Helper function: close descriptor, without affecting errno.
*/
static inline int closefd(const int fd)
{
int result, saved_errno;
if (fd == -1)
return EINVAL;
saved_errno = errno;
do {
result = close(fd);
} while (result == -1 && errno == EINTR);
if (result == -1)
result = errno;
else
result = 0;
errno = saved_errno;
return result;
}
/* Helper function: Create a close-on-exec socket pair.
*/
static int commsocket(int fd[2])
{
int result;
if (socketpair(AF_UNIX, SOCK_STREAM, 0, fd)) {
fd[0] = -1;
fd[1] = -1;
return errno;
}
do {
result = fcntl(fd[0], F_SETFD, FD_CLOEXEC);
} while (result == -1 && errno == EINTR);
if (result == -1) {
closefd(fd[0]);
closefd(fd[1]);
return errno;
}
do {
result = fcntl(fd[1], F_SETFD, FD_CLOEXEC);
} while (result == -1 && errno == EINTR);
if (result == -1) {
closefd(fd[0]);
closefd(fd[1]);
return errno;
}
return 0;
}
/* New system() implementation.
*/
int system(const char *command)
{
pid_t child;
int i, status, commfd[2];
ssize_t n;
/* Allocate the child process. */
i = child_get((pid_t)-1);
if (i < 0) {
/* "fork failed" */
errno = EAGAIN;
return -1;
}
/* Create a close-on-exec socket pair. */
if (commsocket(commfd)) {
child_put(i);
/* "fork failed" */
errno = EAGAIN;
return -1;
}
/* Create the child process. */
child = fork();
if (child == (pid_t)-1)
return -1;
/* Child process? */
if (!child) {
char *args[4] = { "sh", "-c", (char *)command, NULL };
/* If command is NULL, return 7 if sh is available. */
if (!command)
args[2] = "exit 7";
/* Close parent end of comms socket. */
closefd(commfd[0]);
/* Receive one char before continuing. */
do {
n = read(commfd[1], &status, 1);
} while (n == (ssize_t)-1 && errno == EINTR);
if (n != 1) {
closefd(commfd[1]);
_exit(127);
}
/* We won't receive anything else. */
shutdown(commfd[1], SHUT_RD);
/* Execute the command. If successful, this closes the comms socket. */
execv("/bin/sh", args);
/* Failed. Return the errno to the parent. */
status = errno;
{
const char *p = (const char *)&status;
const char *const q = (const char *)&status + sizeof status;
while (p < q) {
n = write(commfd[1], p, (size_t)(q - p));
if (n > (ssize_t)0)
p += n;
else
if (n != (ssize_t)-1)
break;
else
if (errno != EINTR)
break;
}
}
/* Explicitly close the socket pair. */
shutdown(commfd[1], SHUT_RDWR);
closefd(commfd[1]);
_exit(127);
}
/* Parent process. Close the child end of the comms socket. */
closefd(commfd[1]);
/* Update the child PID in the array. */
__sync_bool_compare_and_swap(&child_pid[i], (pid_t)-1, child);
/* Let the child proceed, by sending a char via the socket. */
status = 0;
do {
n = write(commfd[0], &status, 1);
} while (n == (ssize_t)-1 && errno == EINTR);
if (n != 1) {
/* Release the child entry. */
child_put(i);
closefd(commfd[0]);
/* Kill the child. */
kill(child, SIGKILL);
/* "fork failed". */
errno = EAGAIN;
return -1;
}
/* Won't send anything else over the comms socket. */
shutdown(commfd[0], SHUT_WR);
/* Try reading an int from the comms socket. */
{
char *p = (char *)&status;
char *const q = (char *)&status + sizeof status;
while (p < q) {
n = read(commfd[0], p, (size_t)(q - p));
if (n > (ssize_t)0)
p += n;
else
if (n != (ssize_t)-1)
break;
else
if (errno != EINTR)
break;
}
/* Socket closed with nothing read? */
if (n == (ssize_t)0 && p == (char *)&status)
status = 0;
else
if (p != q)
status = EAGAIN; /* Incomplete error code, use EAGAIN. */
/* Close the comms socket. */
shutdown(commfd[0], SHUT_RDWR);
closefd(commfd[0]);
}
/* Wait for the command to complete. */
sem_wait(&child_sem[i]);
/* Did the command execution fail? */
if (status) {
child_put(i);
errno = status;
return -1;
}
/* Command was executed. Return the exit status. */
status = child_status[i];
child_put(i);
/* If command is NULL, then the return value is nonzero
* iff the exit status was 7. */
if (!command) {
if (WIFEXITED(status) && WEXITSTATUS(status) == 7)
status = 1;
else
status = 0;
}
return status;
}
/* Library initialization.
* Sets the sigchld handler,
* makes sure pthread library is loaded, and
* unsets the LD_PRELOAD environment variable.
*/
static void init(void) __attribute__((constructor));
static void init(void)
{
struct sigaction act;
int saved_errno;
saved_errno = errno;
sigemptyset(&act.sa_mask);
act.sa_sigaction = sigchld_handler;
act.sa_flags = SA_NOCLDSTOP | SA_RESTART | SA_SIGINFO;
sigaction(SIGCHLD, &act, NULL);
(void)dlopen("libpthread.so.0", RTLD_NOW | RTLD_GLOBAL);
unsetenv("LD_PRELOAD");
errno = saved_errno;
}
If you save the above as say child.c
, you can compile it into libchild.so
using
gcc -W -Wall -O3 -fpic -fPIC -c child.c -lpthread
gcc -W -Wall -O3 -shared -Wl,-soname,libchild.so child.o -ldl -lpthread -o libchild.so
If you have a test program that does system()
calls in various threads, you can run it with system()
interposed (and children automatically reaped) using
env LD_PRELOAD=/path/to/libchild.so test-program
Note that depending on exactly what those threads that are not under your control do, you may need to interpose further functions, including signal()
, sigaction()
, sigprocmask()
, pthread_sigmask()
, and so on, to make sure those threads do not change the disposition of your SIGCHLD
handler (after installed by the libchild.so
library).
If those out-of-control threads use popen()
, you can interpose that (and pclose()
) with very similar code to system()
above, just split into two parts.
(If you are wondering why my system()
code bothers to report the exec()
failure to the parent process, it's because I normally use a variant of this code that takes the command as an array of strings; this way it correctly reports if the command was not found, or could not be executed due to insufficient privileges, etc. In this particular case the command is always /bin/sh
. However, since the communications socket is needed anyway to avoid racing between child exit and having up-to-date PID in the *child_pid[]* array, I decided to leave the "extra" code in.)
For those who are still looking for the answer, there is an easier way to solve this problem:
Rewrite SIGCHLD handler to use waitid call with flags WNOHANG|WNOWAIT to check child's PID before reaping them. You can optionally check /proc/PID/stat (or similar OS interface) for command name.