How to wait in bash for several subprocesses to finish

Posted 2018-12-31 08:42

How do I wait, in a bash script, for several subprocesses spawned from that script to finish, and then return exit code != 0 when any of the subprocesses ends with code != 0?

Simple script:

#!/bin/bash
for i in `seq 0 9`; do
  doCalculations $i &
done
wait

The above script will wait for all 10 spawned subprocesses, but it will always give exit status 0 (see help wait). How can I modify this script so that it discovers the exit statuses of the spawned subprocesses and returns exit code 1 when any of the subprocesses ends with code != 0?

Is there any better solution than collecting the PIDs of the subprocesses, waiting for them in order, and summing the exit statuses?
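
For reference, the PID-collecting approach I mean could look like this sketch (doCalculations stands in for the real worker):

#!/bin/bash
# Sketch of the PID-collecting approach: collect PIDs, wait on each,
# and fail if any child failed.
pids=()
for i in `seq 0 9`; do
  doCalculations $i &
  pids+=($!)              # $! is the PID of the most recent background job
done

status=0
for pid in "${pids[@]}"; do
  wait $pid || status=1   # wait returns that child's exit status
done
exit $status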

27 Answers
人间绝色
#2 · 2018-12-31 09:20

I've just been modifying a script to background and parallelise a process.

I did some experimenting (on Solaris, with both bash and ksh) and discovered that 'wait' outputs the exit status if it's not zero, or, when no PID argument is provided, a list of the jobs that returned non-zero exit status. E.g.

Bash:

$ sleep 20 && exit 1 &
$ sleep 10 && exit 2 &
$ wait
[1]-  Exit 1                  sleep 20 && exit 1
[2]+  Exit 2                  sleep 10 && exit 2

Ksh:

$ sleep 20 && exit 1 &
$ sleep 10 && exit 2 &
$ wait
[1]+  Done(1)                  sleep 20 && exit 1
[2]+  Done(2)                  sleep 10 && exit 2

This output is written to stderr, so a simple solution to the OP's example could be:

#!/bin/bash

trap "rm -f /tmp/x.$$" EXIT

for i in `seq 0 9`; do
  doCalculations $i &
done

wait 2> /tmp/x.$$
if [ `wc -l < /tmp/x.$$` -gt 0 ] ; then
  exit 1
fi

While this:

wait 2> >(wc -l)

will also return a count, but without the tmp file. It might also be used this way, for example:

wait 2> >(if [ `wc -l` -gt 0 ] ; then echo "ERROR"; fi)

But this isn't much more useful than the tmp file, IMO. I couldn't find a useful way to avoid the tmp file while also avoiding running "wait" in a subshell, which won't work at all.

余生无你
#3 · 2018-12-31 09:22

Here's what I've come up with so far. I would like to see how to interrupt the sleep command if a child terminates, so that one would not have to tune WAITALL_DELAY to one's usage.

waitall() { # PID...
  ## Wait for children to exit and indicate whether all exited with 0 status.
  local errors=0
  while :; do
    debug "Processes remaining: $*"
    for pid in "$@"; do
      shift
      if kill -0 "$pid" 2>/dev/null; then
        debug "$pid is still alive."
        set -- "$@" "$pid"
      elif wait "$pid"; then
        debug "$pid exited with zero exit status."
      else
        debug "$pid exited with non-zero exit status."
        ((++errors))
      fi
    done
    (("$#" > 0)) || break
    # TODO: how to interrupt this sleep when a child terminates?
    sleep ${WAITALL_DELAY:-1}
  done
  ((errors == 0))
}

debug() { echo "DEBUG: $*" >&2; }

pids=""
for t in 3 5 4; do 
  sleep "$t" &
  pids="$pids $!"
done
waitall $pids
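
In bash 4.3 and later, `wait -n` waits for the next child to finish and returns that child's status, which removes the need for the polling sleep (and the WAITALL_DELAY tuning) entirely. A minimal sketch, assuming bash >= 4.3, using the same sleep jobs as above:

# Requires bash >= 4.3 for `wait -n`.
n=0
for t in 3 5 4; do
  sleep "$t" &
  n=$((n + 1))
done

errors=0
for ((i = 0; i < n; i++)); do
  wait -n || ((++errors))   # returns as soon as the *next* child exits
done
((errors == 0))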
春风洒进眼中
#4 · 2018-12-31 09:22

I needed this, but the target process wasn't a child of the current shell, in which case wait $PID doesn't work. I found the following alternative instead:

while [ -e /proc/$PID ]; do sleep 0.1 ; done

That relies on the presence of procfs, which may not be available (macOS doesn't provide it, for example). So for portability, you could use this instead:

while ps -p $PID >/dev/null ; do sleep 0.1 ; done
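
Wrapped up as a helper, the portable variant might look like this sketch (wait_for_pid is a name of my own; note that, unlike wait, it cannot recover the target's exit status):

# Sketch: poll until a (possibly non-child) process exits.
# Unlike `wait`, this cannot report the process's exit status.
wait_for_pid() {
  local pid=$1
  while ps -p "$pid" > /dev/null 2>&1; do
    sleep 0.1
  done
}

wait_for_pid "$PID"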
君临天下
#5 · 2018-12-31 09:24

trap is your friend. You can trap on ERR in many shells. You can also trap on EXIT, or on DEBUG to execute a piece of code after every command.

This is in addition to all the standard signals.
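
One possible way to wire this into the OP's example, as a sketch (doCalculations is the OP's placeholder; the flag file is my own device): each subshell sets an EXIT trap that records a failure, and the parent checks the flag after wait.

#!/bin/bash
flag=$(mktemp)
trap 'rm -f "$flag"' EXIT            # parent cleans up on exit

for i in `seq 0 9`; do
  (
    # subshells start with default traps, so set our own EXIT trap here
    trap '[ $? -ne 0 ] && echo "$i" >> "$flag"' EXIT
    doCalculations $i
  ) &
done
wait

if [ -s "$flag" ]; then              # non-empty flag file => some job failed
  echo "failed jobs: $(cat "$flag")" >&2
  exit 1
fi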

孤独寂梦人
#6 · 2018-12-31 09:25

To parallelize this...

for i in $(whatever_list) ; do
   do_something $i
done

Translate it to this...

for i in $(whatever_list) ; do echo $i ; done | ## execute in parallel...
   (
   export -f do_something ## export functions (if needed)
   export PATH ## export any variables that are required
   xargs -I{} --max-procs 0 bash -c ' ## process in batches...
      {
      echo "processing {}" ## optional
      do_something {}
      }' 
   )
  • If an error occurs in one process, it won't interrupt the other processes, but it will result in a non-zero exit code from the sequence as a whole.
  • Exporting functions and variables may or may not be necessary, in any particular case.
  • You can set --max-procs based on how much parallelism you want (0 means "all at once").
  • GNU Parallel offers some additional features when used in place of xargs -- but it isn't always installed by default.
  • The for loop isn't strictly necessary in this example since echo $i is basically just regenerating the output of $(whatever_list). I just think the use of the for keyword makes it a little easier to see what is going on.
  • Bash string handling can be confusing -- I have found that using single quotes works best for wrapping non-trivial scripts.
  • You can easily interrupt the entire operation (using ^C or similar), unlike the more direct approach to Bash parallelism.

Here's a simplified working example...

for i in {0..5} ; do echo $i ; done | xargs -I{} --max-procs 2 bash -c '
   {
   echo sleep {}
   sleep 2s
   }'
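
Note that the pipeline's overall exit status already reflects failures: GNU xargs exits with status 123 if any invocation exited with status 1-125. So the OP's "return non-zero if any job failed" could be sketched like this (doCalculations as in the question, exported first since it's a shell function):

export -f doCalculations
for i in $(seq 0 9) ; do echo $i ; done |
   xargs -I{} --max-procs 0 bash -c 'doCalculations {}'
[ $? -eq 0 ] || exit 1   # xargs returned 123 if any job failed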
深知你不懂我心
#7 · 2018-12-31 09:26

There are already a lot of answers here, but I am surprised no one seems to have suggested using arrays... So here's what I did - this might be useful to some in the future.

n=10 # run 10 jobs
c=0
PIDS=()

while (( c < n ))
do
    my_function_or_command &
    PID=$!
    echo "Launched job as PID=$PID"
    PIDS+=($PID)

    (( c+=1 ))
done


# collect launched jobs

for pid in "${PIDS[@]}"
do
    wait $pid || echo "failed job PID=$pid"
done
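
To also make the script itself return non-zero when any job failed (as the question asks), the collection loop could be extended like this sketch:

# Sketch: record failures and exit non-zero if any job failed
status=0
for pid in "${PIDS[@]}"
do
    wait $pid || { echo "failed job PID=$pid"; status=1; }
done
exit $status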