Bash script: kill background (grand)children on Ctrl+C

Posted 2019-02-17 18:18

Question:

I have a Bash script (Bash 3.2, Mac OS X 10.8) that invokes multiple Python scripts in parallel in order to better utilize multiple cores. Each Python script takes a really long time to complete.

The problem is that if I hit Ctrl+C in the middle of the Bash script, the Python scripts do not actually get killed. How can I write the Bash script so that killing it will also kill all its background children?

Here's my original "reduced test case". Unfortunately I seem to have reduced it so much that it no longer demonstrates the problem; my mistake.

set -e

cat >work.py <<EOF
import sys, time
for i in range(10):
    time.sleep(1)
    print "Tick from", sys.argv[1]
EOF

function process {
    python ./work.py $1 &
}

process one
process two
wait

Here's a complete test case, still highly reduced, but hopefully this one will demonstrate the problem. It reproduces on my machine... but then, two days ago I thought the old test case reproduced on my machine, and today it definitely doesn't.

#!/bin/bash -e
set -x

cat >work.sh <<EOF
for i in 0 1 2 3 4 5 6 7 8 9; do
    sleep 1; echo "still going"
done
EOF
chmod +x work.sh

function kill_all_jobs { jobs -p | xargs kill; }
trap kill_all_jobs SIGINT

function process {
    ./work.sh $1
}

process one &
wait $!
echo "All done!"

This code continues to print "still going" even after Ctrl+C. But if I move the & from outside process to inside (i.e., ./work.sh $1 &), then Ctrl+C works as expected. I don't understand this at all!

In my real script, process contains more than one command, and the commands are long-running and must run in sequence; so I don't know how to "move the & inside process" in that case. I'm sure it's possible, but it must be non-trivial.

$ bash --version
GNU bash, version 3.2.48(1)-release (x86_64-apple-darwin12)
Copyright (C) 2007 Free Software Foundation, Inc.

EDIT: Many thanks to @AlanCurry for teaching me some Bash stuff. Unfortunately I still don't understand exactly what's going on in my examples, but it's practically a moot point, as Alan also helpfully pointed out that for my real-world parallelization problem, Bash is the wrong tool and I ought to be using a simple makefile with make -j3! make runs things in parallel where possible, and also understands Ctrl+C perfectly; problem solved (even though question unanswered).

Answer 1:

I got it! All you have to do is get rid of Python's SIGINT handling by resetting the signal to its default action at the top of the script.

cat >work.py <<'EOF'
import sys, time, signal
signal.signal(signal.SIGINT, signal.SIG_DFL)
for i in range(10):
    time.sleep(1)
    print "Tick from", sys.argv[1]
EOF
chmod +x work.py

function process {
    python ./work.py $1
}

process one &
wait $!
echo "All done!"
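
A likely reason the reset is needed: when job control is off, as it is in a script, Bash starts every background command with SIGINT ignored, and the child inherits that setting. The sketch below is my own illustration, not part of the original answer; it uses sleep as a stand-in for the Python worker and assumes it is run as a script rather than typed into an interactive shell.

```bash
#!/bin/bash
# Illustration only: "sleep 5" stands in for the long-running worker.
# Run this as a script; an interactive shell has job control on and behaves differently.

sleep 5 &                 # background command, started with SIGINT ignored
pid=$!
sleep 1                   # let the child finish starting up

kill -INT "$pid"          # the same signal that Ctrl+C delivers
sleep 1

# kill -0 sends nothing; it only checks whether the process still exists.
if kill -0 "$pid" 2>/dev/null; then
    result="survived SIGINT"
else
    result="killed by SIGINT"
fi
echo "$result"

kill "$pid" 2>/dev/null   # SIGTERM is not ignored, so this ends the child
wait "$pid"
term_status=$?            # 128 + 15 when the child died from SIGTERM
echo "exit status after SIGTERM: $term_status"
```

This would also explain why a trap that sends plain kill (SIGTERM) to the jobs works where Ctrl+C alone does not.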


Answer 2:

Your trap looks good to me:

$ bash --version
GNU bash, version 3.2.48(1)-release (x86_64-apple-darwin11)
Copyright (C) 2007 Free Software Foundation, Inc.

$ cat ./thang 
#! /bin/bash
set -e

cat >work.py <<EOF
import sys, time
for i in range(10):
  time.sleep(1)
  print "Tick from", sys.argv[1]
EOF

function process {
  python ./work.py $1 &
}

function killstuff {
  jobs -p | xargs kill
}

trap killstuff SIGINT

process one
process two
wait

$ ./thang 
Tick from one
Tick from two
Tick from one
Tick from two
^C$ ps aux | grep python | grep -v grep
$
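
For the second test case, where the & sits outside process, the background job is a subshell running the function, and each command inside it is a grandchild of the script. As far as I can tell, jobs -p | xargs kill then terminates only that subshell and leaves the running command orphaned. A minimal sketch of one workaround follows, assuming the commands in process must still run one after another. The names run_step and step_pid and the sleep commands are hypothetical stand-ins, not part of the original script.

```bash
#!/bin/bash
# Sketch only: run_step, step_pid and the sleep commands are hypothetical stand-ins.

# Start one command in the background and wait for it. While the shell sits
# in the wait builtin, a trapped signal is handled immediately instead of
# after the command has finished.
run_step() {
    "$@" &
    step_pid=$!
    wait "$step_pid"
}

# Meant to be launched as "process NAME &", so the body runs in a subshell.
process() {
    # Forward SIGTERM to whichever step is currently running, then exit.
    trap 'kill "$step_pid" 2>/dev/null; wait "$step_pid" 2>/dev/null; exit 143' TERM
    run_step sleep 10     # stand-in for the first long-running command
    run_step sleep 10     # stand-in for the second; starts only after the first
}

# Same handler as in the question: send SIGTERM to every background job.
kill_all_jobs() { jobs -p | xargs kill; }
trap kill_all_jobs INT

# Demonstration without a keyboard: call the handler directly instead of
# pressing Ctrl+C.
process one &
process_pid=$!
sleep 1                   # give the subshell time to install its trap
kill_all_jobs
wait "$process_pid"
status=$?
echo "process exited with status $status"
```

The steps stay sequential because run_step does not return until its command ends, and the subshell's TERM trap makes sure the current command is stopped before the subshell exits. The status 143 is 128 plus the number of SIGTERM.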