A program is reading heavily from the disk, but I don't know which file it is reading or where in the code the reads happen.
Are there any tools on Linux to monitor this?
Related question (Windows): Disk IO profiler for existing applications
If the system is really busy with I/O, just look at `top` and you'll usually see the I/O-bound process stuck in the D state (uninterruptible sleep).

`strace -c myprog` is my best friend for a first attempt at all generic "what is my application doing / where is it spending most of its time" questions. strace can also attach to running processes, so you can observe the program as it's running.

Another good strace trick is to write its output to a log file (with `strace -o myprogrun.log`), then view it with a modern vim, which does a very nice job of syntax highlighting the log. It's a lot easier to find things this way, as the default strace output is not very human-readable.

An important thing to remember is to write the log to a different partition or set of disks than the one with the I/O problem! strace can generate a lot of output, so do not induce extra I/O problems. I like to use a tmpfs or zram RAM disk for such occasions.
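As a rough sketch of the attach-and-log approach described above, the functions below attach strace to an already running process, restrict tracing to a few read-related syscalls, and write the log to a RAM-backed directory. The names `trace_log_path` and `trace_io`, the default log directory `/dev/shm`, and the choice of syscalls are assumptions made for illustration, so adjust them for your system.

```shell
#!/bin/sh
# Hypothetical helpers (names and defaults are illustrative, not from the answer).

# Build the log file name for a given PID and log directory.
trace_log_path() {
    printf '%s/strace.%s.log\n' "${2:-/dev/shm}" "$1"
}

# Usage: trace_io PID [LOGDIR]
# LOGDIR defaults to /dev/shm, assumed here to be a RAM-backed tmpfs.
trace_io() {
    pid=$1
    log=$(trace_log_path "$pid" "$2")

    # -f            follow threads and child processes
    # -y            print the path behind each file descriptor argument
    # -T            show the time spent in each syscall
    # -e trace=...  log only the listed syscalls (an assumed selection)
    # -p            attach to an already running process
    # -o            write the trace to a file instead of stderr
    strace -f -y -T -e trace=openat,read,pread64 -p "$pid" -o "$log"
}
```

For instance, `trace_io 27666` would trace PID 27666 until interrupted with Ctrl-C and leave the log in `/dev/shm/strace.27666.log`. Attaching normally requires running as the same user as the target process, or as root.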
So, you can use `/proc/PID/fd` or `lsof -p PID` to see which files your process is using.
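To illustrate the `/proc` route, here is a minimal sketch that prints each open file descriptor of a process together with the file it points to. The function name `list_open_files` is made up for this example; it only assumes a Linux `/proc` filesystem and the `readlink` utility.

```shell
#!/bin/sh
# Hypothetical helper: list the open file descriptors of a process.
# Usage: list_open_files PID
list_open_files() {
    pid=$1
    for fd in /proc/"$pid"/fd/*; do
        # Each entry is a symlink to the open file, socket, or pipe.
        # Skip entries that cannot be read (no permission, or fd already closed).
        target=$(readlink "$fd") || continue
        printf '%s -> %s\n' "${fd##*/}" "$target"
    done
}
```

Running it a few times in a row gives a crude view of which files come and go while the program is busy. Reading another user's `/proc/PID/fd` usually requires root.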
For example, with `lsof -p 27666` (assuming 27666 is the PID of the a.out program) you can see this: