Here is my code to examine this:
void handler(int n) {
    printf("handler %d\n", n);
    int status;
    if (wait(&status) < 0)
        printf("%s\n", strerror(errno));
}

int main() {
    struct sigaction sig;
    sigemptyset(&sig.sa_mask);
    sig.sa_handler = handler;
    sig.sa_flags = 0;
    sig.sa_restorer = NULL;
    struct sigaction sigold;
    sigaction(SIGCHLD, &sig, &sigold);
    pid_t pid;
    int status;
    printf("before fork\n");
    if ((pid = fork()) == 0) {
        _exit(127);
    } else if (pid > 0) {
        printf("before waitpid\n");
        if (waitpid(pid, &status, 0) < 0)
            printf("%s\n", strerror(errno));
        printf("after waitpid\n");
    }
    printf("after fork\n");
    return 0;
}
The output is:
before fork
before waitpid
handler 17
No child processes
after waitpid
after fork
So, I think waitpid() blocks SIGCHLD and waits for the child to terminate; once the child terminates, it does something and then unblocks SIGCHLD before it returns. That would explain why we see the "No child processes" error and why "after waitpid" comes after "handler 17". Am I right? If not, what is actually happening, and how can the output sequence be explained? Is there a specification for Linux, or something similar, that I can check?
The exit information for a process can only be collected once. Your output shows the signal handler being called while your code is in waitpid(), but the handler calls wait() and that collects the information of the child (which you throw away without reporting). Then when you get back to waitpid(), the child exit status has been collected, so there's nothing left for waitpid() to report on, hence the "No child processes" error.
Here's an adaptation of your program. It abuses things by using printf() inside the signal handler function (printf() is not async-signal-safe), but it seems to work despite that, testing on a Mac running macOS Sierra 10.12.4 (compiling with GCC 7.1.0).
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

static void handler(int n)
{
    printf("handler %d\n", n);
    int status;
    int corpse;
    /* Try to collect a child from inside the handler and report the result. */
    if ((corpse = wait(&status)) < 0)
        printf("%s: %s\n", __func__, strerror(errno));
    else
        printf("%s: child %d exited with status 0x%.4X\n", __func__, corpse, status);
}

int main(void)
{
    struct sigaction sig = { 0 };
    sigemptyset(&sig.sa_mask);
    sig.sa_handler = handler;
    sig.sa_flags = 0;       /* no SA_RESTART, so waitpid() can fail with EINTR */
    sigaction(SIGCHLD, &sig, NULL);
    pid_t pid;
    printf("before fork\n");
    if ((pid = fork()) == 0)
    {
        _exit(127);
    }
    else if (pid > 0)
    {
        printf("before waitpid\n");
        int status;
        int corpse;
        /* Keep calling waitpid() while it succeeds or is interrupted by a
         * signal (EINTR); the loop ends on any other failure, typically ECHILD. */
        while ((corpse = waitpid(pid, &status, 0)) > 0 || errno == EINTR)
        {
            if (corpse < 0)
                printf("loop: %s\n", strerror(errno));
            else
                printf("%s: child %d exited with status 0x%.4X\n", __func__, corpse, status);
        }
        if (corpse < 0)
            printf("%s: %s\n", __func__, strerror(errno));
        printf("after waitpid loop\n");
    }
    printf("after fork\n");
    return 0;
}
Sample output:
before fork
before waitpid
handler 20
handler: child 29481 exited with status 0x7F00
loop: Interrupted system call
main: No child processes
after waitpid loop
after fork
The status value 0x7F00 is the normal encoding for _exit(127). The signal number for SIGCHLD differs between macOS (20) and Linux (17); that's perfectly permissible.
To get the code to compile on Linux (CentOS 7 and Ubuntu 16.04 LTS were used for the test), with GCC 4.8.5 (almost antediluvian; the current version is GCC 7.1.0) and GCC 5.4.0 respectively, using the command line:
$ gcc -O3 -g -std=c11 -Wall -Wextra -Werror -Wmissing-prototypes \
> -Wstrict-prototypes -Wold-style-definition sg59.c -o sg59
$
I added #define _XOPEN_SOURCE 800 before the first header, and used:
struct sigaction sig;
memset(&sig, '\0', sizeof(sig));
to initialize the structure with GCC 4.8.5. That sort of shenanigan is occasionally a painful necessity to avoid compiler warnings. I note that although the #define was necessary to expose POSIX symbols, the initializer (struct sigaction sig = { 0 };) was accepted by GCC 5.4.0 without problems.
When I then run the program, I get very similar output to what cong reports getting in a comment:
before fork
before waitpid
handler 17
handler: No child processes
main: child 101681 exited with status 0x7F00
main: No child processes
after waitpid loop
after fork
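It is curious indeed that on Linux, the process is sent a SIGCHLD signal and yet wait() cannot wait for it in the signal handler. That is at least counter-intuitive. A plausible explanation is that the kernel has already collected the child on behalf of the pending waitpid() call by the time the handler runs, so the handler finds nothing to collect, and waitpid() then returns the status it had already obtained.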
We can debate how much it matters that the first argument to waitpid() is pid rather than 0; the error is inevitable on the second iteration of the loop since the first collected the information from the child. In practice, it doesn't matter here. In general, it would be better to be using waitpid(0, &status, WNOHANG) or thereabouts; depending on context, 0 instead of WNOHANG might be better.