Question:
I recently had a Linux process that "leaked" file descriptors: it opened them and didn't properly close some of them.
Had I monitored this, I could have told in advance that the process was approaching its limit.
Is there a nice Bash/Python way to check the FD usage ratio for a given process on an Ubuntu Linux system?
EDIT:
I now know how to check how many open file descriptors there are; I only need to know how many file descriptors a process is allowed. Some systems (like Amazon EC2) don't have the /proc/pid/limits file.
Thanks,
Udi
Answer 1:
Count the entries in /proc/<pid>/fd/. The hard and soft limits applying to the process can be found in /proc/<pid>/limits.
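As a rough sketch of the approach above, the following Python reads both /proc entries and returns the open count alongside the soft limit. The function names `parse_nofile_limit` and `fd_usage` are made up for illustration, and the parser assumes the usual "Max open files" row layout of /proc/<pid>/limits.

```python
import os

def parse_nofile_limit(limits_text):
    """Return (soft, hard) from the text of /proc/<pid>/limits.

    A value of 'unlimited' is mapped to None.
    """
    def to_int(value):
        return None if value == "unlimited" else int(value)

    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Remaining columns: soft limit, hard limit, units
            soft, hard = line[len("Max open files"):].split()[:2]
            return to_int(soft), to_int(hard)
    raise ValueError("no 'Max open files' row found")

def fd_usage(pid):
    """Return (open_fds, soft_limit) for a process, read from /proc."""
    open_fds = len(os.listdir("/proc/{}/fd".format(pid)))
    with open("/proc/{}/limits".format(pid)) as f:
        soft, _hard = parse_nofile_limit(f.read())
    return open_fds, soft
```

Calling `fd_usage(os.getpid())` gives the pair for the current process; for another process, listing /proc/<pid>/fd generally requires running as the same user or as root.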
Answer 2:
The only interfaces provided by the Linux kernel to get resource limits are getrlimit() and /proc/pid/limits. getrlimit() can only get the resource limits of the calling process. /proc/pid/limits allows you to get the resource limits of any process with the same user id, and is available on RHEL 5.2, RHEL 4.7, Ubuntu 9.04, and any distribution with a 2.6.24 or later kernel.
If you need to support older Linux systems then you will have to get the process itself to call getrlimit(). Of course the easiest way to do that is by modifying the program, or a library that it uses. If you are running the program then you could use LD_PRELOAD to load your own code into the program. If none of those are possible then you could attach to the process with gdb and have it execute the call within the process. You could also do the same thing yourself using ptrace() to attach to the process, insert the call in its memory, etc.; however, this is very complicated to get right and is not recommended.
With appropriate privileges, the other ways to do this would involve looking through kernel memory, loading a kernel module, or otherwise modifying the kernel, but I am assuming that these are out of the question.
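One option this answer does not cover: kernels from 2.6.36 onward also provide prlimit(), which Python 3.4+ exposes as resource.prlimit(). A minimal sketch, assuming such a kernel and C library are present (the wrapper name `nofile_limits` is hypothetical):

```python
import os
import resource

def nofile_limits(pid):
    """Return the (soft, hard) RLIMIT_NOFILE pair of the given process.

    Calling resource.prlimit() without a new-limits argument only reads
    the current values; it does not change them.
    """
    return resource.prlimit(pid, resource.RLIMIT_NOFILE)

if __name__ == "__main__":
    # For our own PID this should match what getrlimit() reports.
    print(nofile_limits(os.getpid()))
```

Querying a process owned by another user is expected to raise PermissionError unless the caller has sufficient privileges, and a PID that no longer exists raises ProcessLookupError.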
Answer 3:
To see the top 20 processes by open file handle count:
for x in $(ps -e -o pid=); do echo "$(ls /proc/$x/fd 2>/dev/null | wc -l) $x $(cat /proc/$x/cmdline 2>/dev/null)"; done | sort -n -r | head -n 20
The output format is: file handle count, PID, command line of the process. Arguments may appear run together because /proc/<pid>/cmdline separates them with NUL bytes rather than spaces.
Example output:
701 1216 /sbin/rsyslogd-n-c5
169 11835 postgres: spaceuser spaceschema [local] idle
164 13621 postgres: spaceuser spaceschema [local] idle
161 13622 postgres: spaceuser spaceschema [local] idle
161 13618 postgres: spaceuser spaceschema [local] idle
Answer 4:
You can write a script that periodically calls lsof -p {PID} on the given PID.
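A periodic check along these lines could look like the sketch below. Note that it swaps lsof for a direct listing of /proc/<pid>/fd, which avoids parsing lsof output but is Linux-specific; the helper names `count_fds`, `watch_fds`, and `is_growing` are illustrative only.

```python
import os
import time

def count_fds(pid):
    """Number of descriptors currently open in the process."""
    return len(os.listdir("/proc/{}/fd".format(pid)))

def watch_fds(pid, interval=5.0, samples=3):
    """Sample the descriptor count `samples` times, `interval` seconds apart."""
    history = []
    for i in range(samples):
        history.append(count_fds(pid))
        if i < samples - 1:
            time.sleep(interval)
    return history

def is_growing(history):
    """True if there are at least two samples and each exceeds the previous one."""
    return len(history) > 1 and all(b > a for a, b in zip(history, history[1:]))
```

A count that rises on every sample is only a hint of a leak, not proof; a busy server can legitimately open more descriptors under load, so a longer sampling window is usually worth the wait.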
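Answer 5:
You asked for Bash/Python methods. ulimit would be the best Bash approach (short of munging through /proc/$pid/fd and the like by hand), though it reports the limits of the shell it runs in rather than those of an arbitrary process. For Python, you could use the resource module.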
import resource
print(resource.getrlimit(resource.RLIMIT_NOFILE))
$ python test.py
(1024, 65536)
resource.getrlimit corresponds to the getrlimit call in a C program. The results represent the current and maximum values for the requested resource. In the above example, the current (soft) limit is 1024 and the maximum (hard) limit is 65536. These values are typical defaults on Linux systems these days.
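Building on the resource module, a process can also check its own usage ratio from the inside. This is a minimal sketch that assumes a Linux /proc filesystem; `fd_usage_ratio` is a hypothetical helper name.

```python
import os
import resource

def fd_usage_ratio():
    """Fraction of the soft RLIMIT_NOFILE the current process is using."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY or soft <= 0:
        return 0.0  # no usable soft limit, so there is no meaningful ratio
    # /proc/self/fd lists this process's descriptors (Linux-specific).
    # The listing itself briefly holds one descriptor, which is counted.
    in_use = len(os.listdir("/proc/self/fd"))
    return in_use / soft
```

A long-running service might log a warning when this value crosses some threshold such as 0.8; the threshold is a judgment call, not something the resource module prescribes.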
Answer 6:
In CentOS 6 and below (anything using GCC 3), you may find that adjusting the kernel limits does not resolve the issue. This is because the FD_SETSIZE value, which bounds the descriptors usable with select(), is fixed at compile time. To get past it, you will need to increase the value and then recompile the program.
Also, you may find that you are leaking file descriptors due to known issues in libpthread if you are using that library. This call was integrated into GCC in GCC 4 / CentOS7 / RHEL 7 and this seems to have fixed the threading issues.
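Answer 7:
Python wrapper using the excellent psutil package:
import psutil

for p in psutil.process_iter(attrs=['pid', 'name', 'username', 'num_fds']):
    try:
        # rlimit() is Linux-only and returns the (soft, hard) pair
        soft, hard = p.rlimit(psutil.RLIMIT_NOFILE)
        cur = p.info['num_fds']
        # num_fds is None when access was denied; also skip unlimited or zero limits
        if cur is None or soft <= 0:
            continue
        usage = int(cur / soft * 100)
        print('{:>2d}% {}/{}/{}'.format(
            usage,
            p.info['pid'],
            p.info['username'],
            p.info['name'],
        ))
    except (psutil.NoSuchProcess, psutil.AccessDenied):
        # The process exited, or we lack permission to read its limits
        pass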