How can I wait in a bash script for several subprocesses spawned from that script to finish, and return an exit code != 0 when any of the subprocesses ends with a code != 0?
Simple script:
#!/bin/bash
for i in $(seq 0 9); do
    doCalculations "$i" &
done
wait
The above script will wait for all 10 spawned subprocesses, but it will always give exit status 0 (see help wait). How can I modify this script so that it discovers the exit statuses of the spawned subprocesses and returns exit code 1 when any of them ends with a code != 0?
Is there any better solution for that than collecting the PIDs of the subprocesses, waiting for them in order, and summing the exit statuses?
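For reference, the approach mentioned in the question might look like the sketch below. The doCalculations function here is only a stand-in (an assumption, it fails for inputs 3 and 7), and the names pids, failed and rc are made up for illustration.

```bash
#!/bin/bash
# Stand-in for the real doCalculations (assumption): fails for 3 and 7.
doCalculations() {
    [ "$1" -ne 3 ] && [ "$1" -ne 7 ]
}

pids=()
for i in $(seq 0 9); do
    doCalculations "$i" &
    pids+=("$!")            # remember the PID of each background job
done

failed=0
for pid in "${pids[@]}"; do
    # wait PID returns the exit status of that particular child.
    wait "$pid" || failed=$((failed + 1))
done

# Overall result: 1 if any job failed, 0 otherwise.
rc=$(( failed > 0 ))
echo "failed=$failed rc=$rc"
# A real script would finish with: exit "$rc"
```

Waiting in PID order does not slow anything down: the jobs still run in parallel, and the loop only finishes once the slowest one has ended.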
There can be a case where the process completes before we start waiting for it. If we call wait for a process whose status is no longer available to the shell, it may fail with an error like "pid is not a child of this shell". To avoid such cases, the following function can be used to find whether the process is complete or not:
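One possible shape for such a check, as a sketch only: is_running is a hypothetical helper name, and it assumes that kill -0 (which sends no signal and only tests whether the PID can be signalled) is good enough for your case. Keep in mind that PIDs can be reused by the system, so this is a heuristic.

```bash
#!/bin/bash
# Hypothetical helper (an assumption, not a bash builtin): succeeds while
# a process with the given PID still exists.
is_running() {
    kill -0 "$1" 2>/dev/null
}

sleep 30 &
pid=$!

if is_running "$pid"; then before=running; else before=finished; fi

kill "$pid"                # stop the job early, just for the demonstration
wait "$pid" 2>/dev/null    # collect its status (143 = 128 + SIGTERM)
status=$?

if is_running "$pid"; then after=running; else after=finished; fi
echo "before=$before status=$status after=$after"
```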
If you have GNU Parallel installed you can do:
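A minimal sketch of such a call, assuming doCalculations is a bash function that is exported so the shells started by GNU Parallel can see it. The function body is a hypothetical stand-in that fails for inputs 3 and 7, and the version check is only there so the sketch does nothing on machines without GNU Parallel.

```bash
#!/bin/bash
# Hypothetical stand-in for doCalculations: fails for inputs 3 and 7.
doCalculations() {
    [ "$1" -ne 3 ] && [ "$1" -ne 7 ]
}
export -f doCalculations    # make the function visible to the job shells

# Only run if the installed parallel really is GNU Parallel.
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    seq 0 9 | parallel doCalculations {}
    status=$?               # number of failed jobs
    echo "parallel exit status: $status"
else
    status=skipped
    echo "GNU Parallel not found" >&2
fi
```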
GNU Parallel will give you exit code:
0 - All jobs ran without error.
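1-253 - Some of the jobs failed. The exit status gives the number of failed jobs (newer releases document this range as 1-100).
254 - More than 253 jobs failed (newer releases use 101 for more than 100 failed jobs).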
255 - Other error.
Watch the intro videos to learn more: http://pi.dk/1
I don't believe it's possible with Bash's builtin functionality.
You can get notification when a child exits:
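A sketch of such a notification, assuming a CHLD trap with job control switched on via set -o monitor. The counter name exited is made up for illustration.

```bash
#!/bin/bash
set -o monitor                        # enable job control inside the script
exited=0
trap 'exited=$((exited + 1))' CHLD    # runs when a child process terminates

true &
false &
wait

echo "CHLD trap ran $exited time(s)"
```

The trap fires for each child that exits, but nothing in the handler says which child it was or how it ended.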
However there's no apparent way to get the child's exit status in the signal handler.
Getting that child status is usually the job of the wait family of functions in the lower-level POSIX APIs. Unfortunately, Bash's support for that is limited: you can wait for one specific child process (and get its exit status), or you can wait for all of them and always get a 0 result.
What appears to be impossible is the equivalent of waitpid(-1), which blocks until any child process returns.
Here is a simple example using wait.
Run some processes:
Then wait for them with the wait command:
Or just wait (without arguments) for all. This will wait until all background jobs are completed.
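To make the difference concrete, here is a small sketch (the variable names are made up): wait with a PID hands back that child's exit status, while wait without arguments returns 0 even if a job failed.

```bash
#!/bin/bash
true &                 # a job that succeeds
pid_ok=$!
( exit 3 ) &           # a job that fails with status 3
pid_bad=$!

wait "$pid_ok";  status_ok=$?
wait "$pid_bad"; status_bad=$?

false &                # another failing job
wait                   # no arguments: waits for everything that is left
status_all=$?

echo "ok=$status_ok bad=$status_bad all=$status_all"
```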
If the -n option is supplied, wait waits for the next job to terminate and returns its exit status. See help wait and help jobs for syntax.
However, the downside is that wait with several IDs returns only the status of the last ID, so you need to check the status of each subprocess and store it in a variable.
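As a sketch of the -n option, assuming bash 4.3 or newer (older versions do not have it): the first job below finishes first and fails, so wait -n returns its status and the script can stop the slower job instead of waiting for it. The job commands and variable names are placeholders.

```bash
#!/bin/bash
# wait -n needs bash 4.3 or newer.
( sleep 1; exit 3 ) &   # finishes first, with status 3
sleep 5 &               # still running when the first job ends
slow_pid=$!

wait -n                 # returns as soon as any one job terminates
first_status=$?

if [ "$first_status" -ne 0 ]; then
    # Fail fast: stop the remaining job instead of waiting for it.
    kill "$slow_pid" 2>/dev/null
fi
wait                    # reap whatever is left

echo "first job to finish exited with $first_status"
```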
Or make your calculation function create a file on failure (empty, or containing a failure log), then check whether that file exists, e.g.
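A rough sketch of the marker-file idea, assuming a scratch directory from mktemp -d. The doCalculations stand-in (fails for 3 and 7) and the file naming scheme fail.N are assumptions for illustration.

```bash
#!/bin/bash
faildir=$(mktemp -d)    # scratch directory for failure markers

# Stand-in for the real doCalculations (assumption): fails for 3 and 7
# and leaves a marker file behind when it does.
doCalculations() {
    if [ "$1" -eq 3 ] || [ "$1" -eq 7 ]; then
        echo "job $1 failed" > "$faildir/fail.$1"
        return 1
    fi
}

for i in $(seq 0 9); do
    doCalculations "$i" &
done
wait

# Any marker file means at least one job failed.
if ls "$faildir"/fail.* >/dev/null 2>&1; then
    rc=1
else
    rc=0
fi
rm -rf "$faildir"
echo "rc=$rc"
```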
Just store the results out of the shell, e.g. in a file.
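For example, each job could write its own exit status to a file that the parent reads after a plain wait. This is only a sketch: the directory layout, the N.status file names and the doCalculations stand-in (fails for 3 and 7) are assumptions.

```bash
#!/bin/bash
statusdir=$(mktemp -d)  # one status file per job goes here

# Stand-in for the real doCalculations (assumption): fails for 3 and 7.
doCalculations() {
    [ "$1" -ne 3 ] && [ "$1" -ne 7 ]
}

for i in $(seq 0 9); do
    # The subshell runs the job, then records its exit status in a file.
    ( doCalculations "$i"; echo "$?" > "$statusdir/$i.status" ) &
done
wait

rc=0
for f in "$statusdir"/*.status; do
    [ "$(cat "$f")" -eq 0 ] || rc=1
done
rm -rf "$statusdir"
echo "rc=$rc"
```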
I see lots of good examples listed on here, wanted to throw mine in as well.
I use something very similar to start/stop servers/services in parallel and check each exit status. Works great for me. Hope this helps someone out!