Question:
I have a problem with some zombie-like processes on a certain server that need to be killed every now and then. How can I best identify the ones that have been running for longer than an hour or so?
Answer 1:
If they just need to be killed:
if [[ "$(uname)" = "Linux" ]];then killall --older-than 1h someprocessname;fi
If you want to see what it's matching:
if [[ "$(uname)" = "Linux" ]];then killall -i --older-than 1h someprocessname;fi
The -i flag will prompt you with yes/no for each process match.
Answer 2:
Found an answer that works for me:
Warning: this will find and kill long-running processes.
ps -eo uid,pid,etime | egrep '^ *user-id' | egrep ' ([0-9]+-)?([0-9]{2}:?){3}' | awk '{print $2}' | xargs -I{} kill {}
(Where user-id is a specific user's ID with long-running processes.)
The second regular expression matches a time that has an optional days figure followed by hour, minute, and second components, so it only matches processes that have been running for at least one hour.
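If you would rather compare against an arbitrary threshold than rely on the shape of the etime string, one option is to convert etime to seconds first. This is only a sketch: etime_to_seconds is a made-up helper name, and it assumes the [[dd-]hh:]mm:ss layout that ps documents for etime.

```shell
#!/bin/bash
# Convert a ps "etime" value ([[dd-]hh:]mm:ss) to a number of seconds.
# etime_to_seconds is a hypothetical helper, not part of ps.
etime_to_seconds() {
    awk -v t="$1" 'BEGIN {
        days = 0
        # An optional "dd-" prefix carries the number of whole days.
        if (split(t, d, "-") == 2) { days = d[1]; t = d[2] }
        n = split(t, p, ":")
        # p[n] is seconds, p[n-1] is minutes, p[1] is hours when n == 3.
        secs = p[n] + 60 * p[n-1]
        if (n == 3) secs += 3600 * p[1]
        print days * 86400 + secs
    }'
}
```

For example, etime_to_seconds 1-02:03:04 prints 93784, which you can then compare against 3600 before deciding whether to kill anything.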
Answer 3:
For anything older than one day, ps aux will give you the answer, but it drops down to day precision, which might not be as useful.
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0   7200   308 ?        Ss   Jun22   0:02 init [5]
root         2  0.0  0.0      0     0 ?        S    Jun22   0:02 [migration/0]
root         3  0.0  0.0      0     0 ?        SN   Jun22   0:18 [ksoftirqd/0]
root         4  0.0  0.0      0     0 ?        S    Jun22   0:00 [watchdog/0]
In this example, you can only see that process 1 has been running since June 22, with no indication of the time it was started. If you're on Linux or another system with the /proc filesystem, stat /proc/<pid> will give you a more precise answer. For example, here's an exact timestamp for process 1, which ps shows only as Jun22:
ohm ~$ stat /proc/1
File: `/proc/1'
Size: 0 Blocks: 0 IO Block: 4096 directory
Device: 3h/3d Inode: 65538 Links: 5
Access: (0555/dr-xr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2008-06-22 15:37:44.347627750 -0700
Modify: 2008-06-22 15:37:44.347627750 -0700
Change: 2008-06-22 15:37:44.347627750 -0700
Answer 4:
In this way you can obtain the list of the ten oldest processes:
ps -elf | sort -r -k12 | head -n 10
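The start-time column that ps -elf prints is text, so sorting on it may not order processes by age in every case. A possible variation, assuming a procps-ng ps that supports the etimes field (elapsed seconds), is to sort numerically on that field instead. The oldest helper below is a hypothetical name for illustration.

```shell
#!/bin/bash
# Keep the N longest-running entries from "pid etimes args" lines on stdin.
# oldest is a hypothetical helper; N defaults to 10.
oldest() {
    # Field 2 is the elapsed time in seconds: sort it numerically, largest first.
    sort -k2,2nr | head -n "${1:-10}"
}

# Usage (assumes ps supports the etimes field):
#   ps -eo pid=,etimes=,args= | oldest 10
```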
Answer 5:
Perl's Proc::ProcessTable will do the trick: http://search.cpan.org/dist/Proc-ProcessTable/
You can install it on Debian or Ubuntu with sudo apt-get install libproc-processtable-perl
Here is a one-liner:
perl -MProc::ProcessTable -Mstrict -w -e 'my $anHourAgo = time-60*60; my $t = new Proc::ProcessTable;foreach my $p ( @{$t->table} ) { if ($p->start() < $anHourAgo) { print $p->pid, "\n" } }'
Or, more formatted, put this in a file called process.pl:
#!/usr/bin/perl -w
use strict;
use Proc::ProcessTable;
my $anHourAgo = time-60*60;
my $t = new Proc::ProcessTable;
foreach my $p ( @{$t->table} ) {
    if ($p->start() < $anHourAgo) {
        print $p->pid, "\n";
    }
}
Then run perl process.pl
This gives you more versatility and 1-second resolution on start time.
Answer 6:
Jodie C and others have pointed out that killall -i can be used, which is fine if you want to use the process name to kill. But if you want to kill by the same parameters as pgrep -f, you need to use something like the following, using pure bash and the /proc filesystem.
#!/bin/bash
max_age=120 # (seconds)
# pgrep -f may match several processes, so check each PID in turn
for naughty in $(pgrep -f offlineimap); do
    started=$(stat -c %X "/proc/$naughty" 2>/dev/null) || continue # process already gone
    age_in_seconds=$(( $(date +%s) - started ))
    if [[ "$age_in_seconds" -ge "$max_age" ]]; then # naughty is too old!
        kill -s 9 "$naughty"
    fi
done
This lets you find and kill processes older than max_age seconds using the full process name; i.e., the process named /usr/bin/python2 offlineimap can be killed by reference to "offlineimap", whereas the killall solutions presented here will only work on the string "python2".
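The same idea (match on the full command line, then filter by age) can also be sketched without touching /proc, by letting ps report the elapsed seconds. This assumes a ps that supports the etimes field; older_matching is a hypothetical helper name, and it only prints PIDs so you can review them before killing anything.

```shell
#!/bin/bash
# Print the PIDs of processes that are at least max_age seconds old and whose
# full command line matches an extended regular expression.
# Input on stdin: one "pid etimes args" line per process.
# older_matching is a hypothetical helper name.
older_matching() {   # usage: older_matching <max_age_seconds> <pattern>
    awk -v max="$1" -v pat="$2" '
        $2 + 0 >= max + 0 {
            cmd = $0
            # Drop the pid and etimes columns so only the command line is matched.
            sub(/^[ \t]*[0-9]+[ \t]+[0-9]+[ \t]+/, "", cmd)
            if (cmd ~ pat) print $1
        }'
}

# Usage (assumes ps supports the etimes field):
#   ps -eo pid=,etimes=,args= | older_matching 120 offlineimap
```

Because the pattern is tested against everything after the first two columns, "offlineimap" matches /usr/bin/python2 offlineimap even though the executable name is python2.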
Answer 7:
You can use bc to join the two commands in mob's answer and get how many seconds have elapsed since the process started:
echo `date +%s` - `stat -t /proc/<pid> | awk '{print $14}'` | bc
edit:
Out of boredom while waiting for long processes to run, this is what came out after a few minutes fiddling:
#file: sincetime
#!/bin/bash
init=$(stat -t "/proc/$1" | awk '{print $14}')
curr=$(date +%s)
seconds=$(echo "$curr - $init" | bc)
# cmdline is NUL-separated; turn the separators into spaces for display
name=$(tr '\0' ' ' < "/proc/$1/cmdline")
echo "$name" "$seconds"
If you put this on your path and call it like this:
sincetime <pid>
it will print the process cmdline and the seconds since it started. You can also put this on your path:
#file: greptime
#!/bin/bash
pidlist=$(ps ax | grep -i -E "$1" | grep -v grep | awk '{print $1}' | grep -v PID | xargs echo)
for pid in $pidlist; do
    sincetime "$pid"
done
And then, if you run:
greptime <pattern>
where pattern is a string or extended regular expression, it will print out all processes matching this pattern and the seconds since they started. :)
Answer 8:
Do a ps -aef. This will show you the time at which the process started. Then, using the date command, find the current time. Calculate the difference between the two to find the age of the process.
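That subtraction can be scripted. The sketch below assumes GNU date (for the -d option) and a ps that supports the lstart field for a full start timestamp; start_to_age is a hypothetical helper name.

```shell
#!/bin/bash
# Age in seconds of a process, from its start timestamp and the current time.
# start_to_age is a hypothetical helper: start_to_age <start-timestamp> [now-epoch]
start_to_age() {
    local started now
    # Assumes GNU date, which parses the timestamp given to -d.
    started=$(date -d "$1" +%s) || return 1
    # The optional second argument overrides "now" (seconds since the epoch).
    now=${2:-$(date +%s)}
    echo $(( now - started ))
}

# Usage (assumes ps supports the lstart field):
#   start_to_age "$(ps -o lstart= -p 1)"
```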
Answer 9:
I did something similar to the accepted answer, but slightly differently, since I want to match on the process name and on the bad process having run for more than 100 seconds:
kill $(ps -o pid=,etimes= -p "$(pgrep -d, bad_process)" | awk '$2 > 100 { print $1 }')
Answer 10:
Run
stat -t /proc/<pid> | awk '{print $14}'
to get the start time of the process in seconds since the epoch. Compare it with the current time (date +%s) to get the current age of the process.
Answer 11:
Using ps is the right way. I've done something similar before but don't have the source handy. Generally, ps has options to tell it which fields to show and which to sort by. You can sort the output by running time, grep for the process you want, and then kill it.
HTH
Answer 12:
In case anyone needs this in C, you can use readproc.h and libproc:
#include <time.h>
#include <proc/readproc.h>
#include <proc/sysinfo.h>

/* Returns the age of the process in days, or 0.0 if it cannot be read. */
float
pid_age(pid_t pid)
{
    proc_t proc_info;
    int seconds_since_boot = uptime(0,0);
    if (!get_proc_stats(pid, &proc_info)) {
        return 0.0;
    }
    // readproc.h comment lies about what proc_t.start_time is. It's
    // actually expressed in Hertz ticks since boot
    int seconds_since_1970 = time(NULL);
    int time_of_boot = seconds_since_1970 - seconds_since_boot;
    long t = seconds_since_boot - (unsigned long)(proc_info.start_time / Hertz);
    int delta = t;
    float days = ((float) delta / (float)(60*60*24));
    return days;
}
Answer 13:
I came across this somewhere and thought it was simple and useful.
You can use the command in crontab directly:
* * * * * ps -lf | grep "user" | perl -ane '($h,$m,$s) = split /:/,$F[13]; kill 9, $F[3] if ($h > 1);'
Or we can write it as a shell script:
#!/bin/sh
# longprockill.sh
ps -lf | grep "user" | perl -ane '($h,$m,$s) = split /:/,$F[13]; kill 9, $F[3] if ($h > 1);'
And call it from crontab like so:
* * * * * longprockill.sh
Answer 14:
My version of sincetime above by @Rafael S. Calsaverini:
#!/bin/bash
ps --no-headers -o etimes,args "$1"
This reverses the output fields: elapsed time first, full command including arguments second. This is preferred because the full command may contain spaces.
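To illustrate why that field order is convenient: with the elapsed time first, a plain read can take the first word as the number and keep the rest of the line, spaces included, as the command. A minimal sketch follows, where describe is a hypothetical helper name.

```shell
#!/bin/bash
# Read "etimes args" lines from stdin and print them in a labelled form.
# describe is a hypothetical helper name.
describe() {
    # The first word goes to "seconds"; everything after it, spaces and all,
    # ends up in "command".
    while read -r seconds command; do
        printf '%s seconds: %s\n' "$seconds" "$command"
    done
}

# Usage:
#   ps --no-headers -o etimes,args "$pid" | describe
```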