This is an example from <Advanced Linux Programming>, chapter 3.4.4. The program fork()s and exec()s a child process. Instead of waiting for the process to terminate, I want the parent process to clean up the child process asynchronously (otherwise the child process will become a zombie process). This can be done using the signal SIGCHLD. By setting up a signal handler, we can have the clean-up work done when the child process ends. The code is the following:
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>
#include <sys/wait.h>
#include <signal.h>
#include <string.h>
int spawn(char *program, char **arg_list){
    pid_t child_pid;
    child_pid = fork();
    if(child_pid == 0){ // it is the child process
        execvp(program, arg_list);
        // execvp() returns only if an error occurred
        fprintf(stderr, "An error occurred in execvp\n");
        abort();
    }
    else{
        return child_pid;
    }
}
int child_exit_status;
void clean_up_child_process (int signal_number){
    int status;
    wait(&status);
    child_exit_status = status; // store the exit status in a global variable
    printf("Cleaning child process is taken care of by SIGCHLD.\n");
}
int main()
{
    /* Handle SIGCHLD by calling clean_up_child_process. */
    struct sigaction sigchld_action;
    memset(&sigchld_action, 0, sizeof(sigchld_action));
    sigchld_action.sa_handler = &clean_up_child_process;
    sigaction(SIGCHLD, &sigchld_action, NULL);
    int child_status;
    char *arg_list[] = { //deprecated conversion from string constant to char*
        "ls",
        "-la",
        ".",
        NULL
    };
    spawn("ls", arg_list);
    return 0;
}
However, when I run the program in the terminal, the parent process never ends. It also seems that the function clean_up_child_process is never executed (since it doesn't print "Cleaning child process is taken care of by SIGCHLD."). What's the problem with this snippet of code?
for GNU/Linux users
I have already read this book. It introduces this mechanism as follows (quote from section 3.4.4, page 59 of the book):
A more elegant solution is to notify the parent process when a child terminates.
but it only says that you can use sigaction to handle this situation.
Here is a complete example of how to handle processes in this way.
First, why would we ever use this mechanism? Because we do not want to synchronize all the processes with each other.
real example
Imagine that you have 10 .mp4 files and you want to convert them to .mp3 files. A junior user does this:
ffmpeg -i 01.mp4 01.mp3
and repeats this command 10 times. A slightly more experienced user does this:
ls *.mp4 | xargs -I xxx ffmpeg -i xxx xxx.mp3
This time, the command pipes the names of all 10 mp4 files, one per line, to xargs, and they are converted to mp3 one by one.
But a senior user does this:
ls *.mp4 | xargs -I xxx -P 0 ffmpeg -i xxx xxx.mp3
This means: if I have 10 files, create 10 processes and run them simultaneously. And there is a BIG difference. With the two previous commands, only one ffmpeg process existed at a time; it was created, it terminated, and only then did the next one start. But with the help of the -P 0 option, we create 10 processes at the same time, so 10 ffmpeg commands are in fact running.
Now the purpose of cleaning up children asynchronously becomes clearer. We want to run some new processes, but their order, and maybe their exit status, does not matter to us. In this way we can run them as fast as possible and reduce the total time.
First, you can see man sigaction for any more details you want.
Second, you can look up the signal number with:
T ❱ kill -l | grep SIGCHLD
16) SIGSTKFLT 17) SIGCHLD 18) SIGCONT 19) SIGSTOP 20) SIGTSTP
sample code
objective: using SIGCHLD to clean up child processes
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>
// counts how many SIGCHLD signals have been handled
volatile sig_atomic_t signal_counter;
void signal_handler( int signal_number )
{
    ++signal_counter;
    int wait_status;
    pid_t return_pid = wait( &wait_status );
    if( return_pid == -1 )
    {
        perror( "wait()" );
        return; // wait_status is not valid in this case
    }
    if( WIFEXITED( wait_status ) )
    {
        // note: printf() is not async-signal-safe; it is used here only for demonstration
        printf ( "job [ %d ] | pid: %d | exit status: %d\n", (int) signal_counter, (int) return_pid, WEXITSTATUS( wait_status ) );
    }
    else
    {
        printf( "exit abnormally\n" );
    }
    fprintf( stderr, "the signal %d was received\n", signal_number );
}
int main()
{
    // now instead of signal function we want to use sigaction
    struct sigaction siac;
    // zero it
    memset( &siac, 0, sizeof( struct sigaction ) );
    siac.sa_handler = signal_handler;
    sigaction( SIGCHLD, &siac, NULL );
    pid_t child_pid;
    char* sleep_argument[ 5 ] = { "3", "4", "5", "7", "9" };
    int counter = 0;
    while( counter <= 5 )
    {
        if( counter == 5 )
        {
            // all 5 children are running: pause once per child
            while( counter-- )
            {
                pause();
            }
            break;
        }
        child_pid = fork();
        // on failure fork() returns -1
        if( child_pid == -1 )
        {
            perror( "fork()" );
            exit( 1 );
        }
        // for child process fork() returns 0
        if( child_pid == 0 ){
            execlp( "sleep", "sleep", sleep_argument[ counter ], NULL );
            // execlp() returns only on failure
            perror( "execlp()" );
            _exit( 1 );
        }
        ++counter;
    }
    fprintf( stderr, "signal counter %d\n", (int) signal_counter );
    // the main return value
    return 0;
}
This is what the sample code does:
- creates 5 child processes
- then enters the inner while loop and pauses until a signal is received (see man pause)
- when a child terminates, the parent process wakes up and calls the signal_handler function
- continues up to the last one: sleep 9
output (17 means SIGCHLD):
ALP ❱ ./a.out
job [ 1 ] | pid: 14864 | exit status: 0
the signal 17 was received
job [ 2 ] | pid: 14865 | exit status: 0
the signal 17 was received
job [ 3 ] | pid: 14866 | exit status: 0
the signal 17 was received
job [ 4 ] | pid: 14867 | exit status: 0
the signal 17 was received
job [ 5 ] | pid: 14868 | exit status: 0
the signal 17 was received
signal counter 5
While this sample code is running, try this in another terminal:
ALP ❱ ps -o time,pid,ppid,cmd --forest -g $(pgrep -x bash)
TIME PID PPID CMD
00:00:00 5204 2738 /bin/bash
00:00:00 2742 2738 /bin/bash
00:00:00 4696 2742 \_ redshift
00:00:00 14863 2742 \_ ./a.out
00:00:00 14864 14863 \_ sleep 3
00:00:00 14865 14863 \_ sleep 4
00:00:00 14866 14863 \_ sleep 5
00:00:00 14867 14863 \_ sleep 7
00:00:00 14868 14863 \_ sleep 9
As you can see, the a.out process has 5 children, and they are running simultaneously. Whenever one of them terminates, the kernel sends the signal SIGCHLD to its parent, that is, a.out.
NOTE
If we do not use pause or some other mechanism that lets the parent wait for its children, then we abandon the created processes, and init (or upstart on Ubuntu) becomes their parent. You can try it by removing pause().
The parent process returns from main() immediately after the child pid is returned from fork(), so it never has the opportunity to wait for the child to terminate.
I'm using a Mac, so my answer may not be entirely relevant, but still. I compile without any options, so the executable name is a.out.
I have the same experience with the console (the process doesn't seem to terminate), but I noticed that it's just a terminal glitch: you can simply press Enter and your command line will be back, and ps executed from another terminal window shows neither a.out nor the ls that it launched.
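Also, if I run ./a.out >/dev/null it finishes immediately.
So the point of the above is that everything actually terminates; the terminal only appears to freeze (most likely because the shell prints its prompt as soon as the parent exits, and the output of ls is then written after the prompt).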
Next, why does it never print "Cleaning child process is taken care of by SIGCHLD."? Simply because the parent process terminates before the child. The SIGCHLD signal can't be delivered to an already terminated process, so the handler is never invoked.
The book says that the parent process continues to do some other things, and if it really does, then everything works fine, for example if you add sleep(1) after spawn().