I need to track read system calls for specific files, and I'm currently doing this by parsing the output of strace. Since read operates on file descriptors, I have to keep track of the current mapping between fd and path. Additionally, seek has to be monitored to keep the current position up-to-date in the trace.
Is there a better way to get per-application, per-file-path IO traces in Linux?
You could wait for the files to be opened so you can learn the fd and attach strace after the process launch like this:
Parsing the output of command-line utilities like strace is cumbersome; you could use the ptrace() system call instead. See man ptrace for details.
I think overloading open, seek and read is a good solution. But just FYI, if you want to parse and analyze the strace output programmatically, I did something similar before and put my code on GitHub: https://github.com/johnlcf/Stana/wiki
(I did that because I had to analyze the strace results of programs run by other people, and it is not easy to ask them to use LD_PRELOAD.)
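If you do stay with parsing strace output, the bookkeeping described in the question (fd to path, plus the current offset) can be sketched in a few lines of Python. This is only a rough illustration, not the code from the project linked above: the function name trace_reads and the regular expressions are made up for this example, and it assumes plain single-process strace output (no -f PID prefixes, no timestamps) where only open/openat, read, lseek and close matter. It ignores dup, pread, mmap and paths containing escaped quotes.

```python
import re

# Assumed strace line shapes (single process, default formatting), e.g.:
#   openat(AT_FDCWD, "/etc/hosts", O_RDONLY|O_CLOEXEC) = 3
#   read(3, "127.0.0.1"..., 4096) = 220
#   lseek(3, 0, SEEK_SET)                   = 0
#   close(3)                                = 0
OPEN_RE = re.compile(
    r'^open(?:at)?\((?:[^,"]+, )?"(?P<path>[^"]*)".*\)\s+= (?P<fd>\d+)\s*$')
READ_RE = re.compile(r'^read\((?P<fd>\d+), .*\)\s+= (?P<n>\d+)\s*$')
LSEEK_RE = re.compile(r'^lseek\((?P<fd>\d+), .*\)\s+= (?P<pos>\d+)\s*$')
CLOSE_RE = re.compile(r'^close\((?P<fd>\d+)\)\s+= 0\s*$')


def trace_reads(lines, wanted_paths):
    """Return (path, offset, nbytes) for each successful read of a wanted path."""
    fd_path = {}  # fd -> path it was opened with
    fd_pos = {}   # fd -> current file offset as far as we can tell
    events = []
    for line in lines:
        line = line.strip()
        m = OPEN_RE.match(line)
        if m:
            fd = int(m.group("fd"))
            fd_path[fd] = m.group("path")
            fd_pos[fd] = 0
            continue
        m = LSEEK_RE.match(line)
        if m:
            fd = int(m.group("fd"))
            if fd in fd_path:
                # lseek returns the resulting absolute offset.
                fd_pos[fd] = int(m.group("pos"))
            continue
        m = READ_RE.match(line)
        if m:
            fd, n = int(m.group("fd")), int(m.group("n"))
            path = fd_path.get(fd)
            if path is not None:
                if path in wanted_paths:
                    events.append((path, fd_pos[fd], n))
                fd_pos[fd] += n
            continue
        m = CLOSE_RE.match(line)
        if m:
            fd = int(m.group("fd"))
            fd_path.pop(fd, None)
            fd_pos.pop(fd, None)
    return events
```

Failed calls (those returning -1) do not match the patterns and are skipped, so they neither create mappings nor advance the offset.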
First, you probably don't need to keep track of it yourself, because the mapping between fd and path is available in /proc/PID/fd/.
Second, maybe you should use the LD_PRELOAD trick and overload the open, seek and read system calls in C. There are some articles here and there about how to overload malloc/free. I guess it won't be too different to apply the same kind of trick to those system calls. It needs to be implemented in C, but it should take far less code and be more precise than parsing strace output.
Probably the least ugly way to do this is to use fanotify. Fanotify is a Linux kernel facility that allows cheaply watching filesystem events. I'm not sure if it allows filtering by PID, but it does pass the PID to your program so you can check if it's the one you're interested in.
Here's a nice code sample: http://bazaar.launchpad.net/~pitti/fatrace/trunk/view/head:/fatrace.c
However, it seems to be under-documented at the moment. All the docs I could find are http://www.spinics.net/lists/linux-man/msg02302.html and http://lkml.indiana.edu/hypermail/linux/kernel/0811.1/01668.html
SystemTap, a kind of DTrace reimplementation for Linux, could be of help here.
As with strace, you only have the fd, but with the scripting ability it is easy to maintain the filename for an fd (except for fun stuff like dup). The example script iotime illustrates this.
It only works up to a certain number of files because the hash map is size-limited.
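Whichever tracer you pick, the /proc/PID/fd lookup mentioned in an earlier answer can fill in the path for an fd you did not see being opened. A minimal Python sketch, assuming a Linux /proc filesystem and permission to inspect the target process; the function name fd_paths is made up for this example:

```python
import os


def fd_paths(pid):
    """Return {fd: link target} for the descriptors currently open in `pid`."""
    fd_dir = f"/proc/{pid}/fd"
    mapping = {}
    for name in os.listdir(fd_dir):
        try:
            # Each entry is a symlink to the file, socket or pipe behind the fd.
            mapping[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return mapping
```

This is only a snapshot: descriptors that are opened and closed between two calls are missed, so it complements a trace rather than replacing one.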