How to kill a process in Unix that is often sleeping

Posted 2019-06-02 01:11

Question:

I have a Unix shell script that is as follows:

(
  # Trap the HUP signal so it doesn't kill me
  trap "" HUP
  # Redirect stderr to /dev/null
  exec 2>/dev/null
  # Redirect stdin from /dev/null
  exec 0</dev/null
  # Redirect stdout to logfile
  exec 1>${LOG}

  while [ 1 ]
  do
    ps -ewwo pcpu,pid,ppid,comm,time,etime,thcount,scount,fuser,args | grep -v "migration" | grep -v "watchdog" | grep -v "ksoftirqd"
    sleep 600
  done
) &

I want this to run in the background during some tests I am running to capture process information. It runs "ps" with some options and then sleeps for 10 minutes and I want it to run indefinitely (as my tests vary in length).

My problem is that I have no way to stop this: "ps -ef" only shows me the "sleep 600" command, and I have no way of breaking the loop. Is there a way to kill this script, or would it be better to write it so that it does not contain a never-ending loop? My only thought is to have the script end after a certain time limit (i.e. several sets of 10 minutes), but I would prefer not to have to edit the script before each test.

Answer 1:

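When job control is active (as in an interactive shell, or in a script after set -m), the shell subprocess invoked by the ( ... ) & construct, and all its children, will be in their own process group. Without job control, background jobs stay in the process group of the script that started them.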

$ ps -e -o pid,pgid,ppid,tty,comm
PID  PGID  PPID TT       COMMAND
...
2827  2827  2147 pts/1    bash
2832  2827  2827 pts/1    sleep
...

The entire process group can be killed in a single action by passing its ID to kill as a negative number. The signal must be given explicitly (or the argument preceded by --), because otherwise the leading minus sign would be parsed as a signal option.

$ kill -15 -2827
[2]+  Terminated              ( trap "" HUP; exec 2> /dev/null; ...

The PGID to kill is guaranteed to be equal to the PID of its process group leader, which in this case is the shell subprocess. So you can modify your code along the lines of

(
  # Trap the HUP signal so it doesn't kill me
  trap "" 1
  # ...
) &

# the special shell variable $! contains the PID of the most recently
# started background process
SUBPROCESS="$!"

# and later when you want to shut it all down again
kill -15 "-$SUBPROCESS"

# ensuring the subprocess is killed with the script would also be a good idea
trap "kill -15 '-$SUBPROCESS'" 0 1 2 3 15
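
The fragments above can be put together into a runnable sketch. This is only an illustration under a few assumptions: it is run with bash, set -m is added so that the background subshell gets its own process group even in a non-interactive script, and the loop body is reduced to a bare sleep. The kill -0 check and the fallback to the plain PID are additions for the case where no such group exists.

```shell
#!/bin/bash
# Sketch only: run a sleepy loop in the background, then stop the whole group.

# Assumption: enable job control so the background subshell gets its own
# process group even in a non-interactive script.
set -m

(
  trap "" 1                      # ignore HUP, as in the original script
  while true; do
    sleep 600                    # stand-in for the ps + sleep loop body
  done
) >/dev/null 2>&1 &
SUBPROCESS=$!

sleep 1                          # illustrative pause so the loop is running

# kill -0 sends no signal; it only checks that the target exists.
if kill -0 "-$SUBPROCESS" 2>/dev/null; then
  TARGET="-$SUBPROCESS"          # the whole process group
else
  TARGET="$SUBPROCESS"           # fallback: only the subshell itself
fi

kill -15 "$TARGET"
wait "$SUBPROCESS" 2>/dev/null
STATUS=$?                        # 143 = 128 + 15 for death by SIGTERM
echo "loop exited with status $STATUS"
```

Note that in the fallback case only the subshell is signalled, so a sleep that is already running would be left to finish on its own.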

(Note: kill -NAME and trap "..." NAME are not portable shell; however, the meanings of signal numbers 1 through 15 are portable all the way back to V7. If total portability is not an overriding concern, don't write a shell script; the moment you are tempted to reach for an unportable feature, instead stop and rewrite the entire thing in Perl, which is not only a superior programming language, it's more likely to be available on a randomly chosen Unix box than Bash is. Your future self will thank you.)

(Note to pedants: sadly, no readily available version of POSIX.1 can be taken as the reference for what is and is not portable shell, because several major proprietary-Unix vendors froze their shell environments in 1995 plus or minus two years. For complete portability, as e.g. required for autoconf scripting, I'm not aware of a reliable test other than "does this work with Solaris /bin/sh?" (Just be glad you no longer have to dig up access to HP-UX, IRIX, and AIX as well.) However, I am under the impression that you can code to POSIX.1-2001, although not -2008, if you're only interested in portability to the open-source BSDs, full-scale desktop or server Linux, and OSX. I am also under the impression that Android, busybox, and various other embedded environments do not provide all of -2001.)



Answer 2:

There are many ways to do this. One relatively easy one would be like this:

FILENAME=/tmp/some_unusual_name_that_would_normally_never_exist

touch "${FILENAME}"

while [[ -r ${FILENAME} ]]
do
  ps ....
  sleep ....
done

Then, when you want to stop your loop, just remove the file. The loop exits the next time it checks the condition, which can be up to one sleep interval later.
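
A complete version of this idea might look like the sketch below. The file names, the one-second INTERVAL and the use of date as a stand-in for the ps command are all assumptions made so the example finishes quickly; the original loop would sleep 600 seconds.

```shell
#!/bin/sh
# Sketch only: a loop that runs for as long as a sentinel file exists.

# Hypothetical paths; mktemp avoids colliding with an existing file.
FILENAME=$(mktemp /tmp/ps_monitor_run.XXXXXX)
LOG=$(mktemp /tmp/ps_monitor_log.XXXXXX)
INTERVAL=1                       # seconds; the original loop sleeps 600

(
  while [ -r "$FILENAME" ]; do
    date >>"$LOG"                # stand-in for the ps command
    sleep "$INTERVAL"
  done
) &
LOOP_PID=$!

sleep 2                          # let the loop take a sample or two
rm -f "$FILENAME"                # request shutdown
wait "$LOOP_PID"                 # returns after the loop's next check of the file
LOOP_STATUS=$?

SAMPLES=$(wc -l <"$LOG" | tr -d ' ')
rm -f "$LOG"
echo "loop exited with status $LOOP_STATUS after $SAMPLES samples"
```

The loop ends cleanly with its own exit status rather than being killed, at the cost of waiting for the current sleep to finish.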



Answer 3:

We're getting into bash golf here, but if you don't want to wait the 600 seconds for it to exit, you can have it listen on a named pipe (a.k.a. "fifo") and exit once you talk into the pipe:

# this will be our fifo
pipe=/tmp/testpipe

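# on exit or on a signal, remove the fifo and take care of our children;
# one handler does both because a second trap on EXIT (0) would replace the first,
# and it resets itself first so that its own kill cannot re-trigger it
# (assumes this script is its own process group leader, e.g. started from an interactive shell)
trap 'trap - 0 1 2 3 15; rm -f "$pipe"; kill -15 "-$$"' 0 1 2 3 15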

# create the pipe
if [[ ! -p $pipe ]]; then
    mkfifo $pipe
fi

# put it in a function so we can background it
function ps_and_sleep() {
    while [ 1 ]
    do
        ps -ewwo pcpu,pid,ppid,comm,time,etime,thcount,scount,fuser,args | grep -v "migration" | grep -v "watchdog" | grep -v "ksoftirqd"
        sleep 600
    done
}

#... and background it
ps_and_sleep &

# the moment somebody does this:
#     echo stopitnow > /tmp/testpipe
# then we'll get the message, kill the child, and exit
if read line <$pipe; then
    kill -15 -$$
    exit
fi

Props to http://www.linuxjournal.com/content/using-named-pipes-fifos-bash
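
If relying on process groups is not an option, one possible variant is to let the worker itself react to the signal. The sketch below is only an illustration: it runs the sleep in the background and waits for it, because a shell defers trap handlers until a foreground command finishes, whereas wait returns as soon as a trapped signal arrives. The temporary fifo name, the SLEEP_PID and WORKER variables, and the background "controller" that writes to the fifo after one second are all hypothetical stand-ins.

```shell
#!/bin/sh
# Sketch only: a worker that stops promptly when told to via a fifo.

# mktemp -u only generates an unused name; mkfifo then creates the fifo.
pipe=$(mktemp -u /tmp/testpipe.XXXXXX)
mkfifo "$pipe"

(
  # On TERM, stop the pending sleep and leave the loop.
  trap 'kill "$SLEEP_PID" 2>/dev/null; exit 0' TERM
  while true; do
    # the ps command would go here
    sleep 600 &
    SLEEP_PID=$!
    wait "$SLEEP_PID"            # unlike a foreground sleep, wait is interruptible
  done
) &
WORKER=$!

# Hypothetical controller: in practice this is typed in another terminal.
( sleep 1; echo stopitnow >"$pipe" ) &

# Blocks until something is written to the fifo.
if read -r line <"$pipe"; then
  kill -15 "$WORKER"
  wait "$WORKER"
  WORKER_STATUS=$?
fi
rm -f "$pipe"
echo "got '$line'; worker exited with status $WORKER_STATUS"
```

Because the worker cleans up its own sleep, a plain kill of the worker's PID is enough and no negative PID is needed.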