I have a process that appears to be deadlocked:
# strace -p 5075
Process 5075 attached - interrupt to quit
futex(0x419cf9d0, FUTEX_WAIT, 5095, NULL
It is sitting on the "futex" system call, and seems to be indefinitely waiting on a lock. The process is shown to be consuming a large amount of CPU when "top" is run:
# top -b -n 1
top - 23:13:18 up 113 days, 4:19, 1 user, load average: 1.69, 1.74, 1.72
Tasks: 269 total, 1 running, 268 sleeping, 0 stopped, 0 zombie
Cpu(s): 8.1%us, 0.1%sy, 0.0%ni, 91.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 12165696k total, 3810476k used, 8355220k free, 29440k buffers
Swap: 8388600k total, 43312k used, 8345288k free, 879988k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
5075 omdb 18 0 2373m 1.7g 26m S 199.7 14.9 102804:11 java
The process is also shown to be in the "S" (sleeping) state, which makes sense if it's waiting on some resource. However, I don't understand why CPU utilization would be close to 200% if the process is sleeping. Why does top report such high CPU utilization for a sleeping process? Shouldn't its CPU utilization be zero?
Does your application fork child processes? The strace output may indicate that the main process is just waiting for child processes to finish their work. If so, you could try running
strace -f -p 5075
to trace the child processes as well. When attaching with -p, the -f flag also attaches to every thread of the process.
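If the program is multithreaded, as a JVM is, the busy work may be happening in other threads of the same process rather than in children. A minimal sketch, assuming Linux with /proc mounted, that lists each thread's ID and current state; the function name thread_states is only illustrative:

```shell
#!/bin/sh
# List "<tid> <state>" for every thread of a process (Linux /proc only).
# thread_states is an illustrative helper name, not a standard tool.
thread_states() {
    for task in /proc/"$1"/task/*; do
        [ -r "$task/status" ] || continue
        # The "State:" line looks like "State:  S (sleeping)".
        while read -r key value rest; do
            if [ "$key" = "State:" ]; then
                echo "${task##*/} $value"
                break
            fi
        done < "$task/status"
    done
}

# Defaults to the current shell so the sketch runs anywhere;
# pass a PID such as 5075 as the first argument.
thread_states "${1:-$$}"
```

A main thread in S alongside worker threads in R would be consistent with the strace and top output above. Running top -H -p 5075 gives a similar per-thread view that also includes CPU percentages.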
Let me add my two cents.
top shows the state of the process at one particular moment in time.
That DOES NOT mean the process was in that state for the whole preceding interval.
That assumption is completely wrong.
The process could have switched between the R and S states a million times between the previous top refresh and the current one.
So if a process switches rapidly between R and S, you can easily catch it in the S state.
However, it uses CPU time between switches.
So please note the difference between CPU usage (which describes a period of time) and state (which describes a particular moment in time).
Let me give a clear example.
Some person has stolen 3 apples from your pocket during the last 10 minutes.
However, right now he is not stealing apples from your pocket.
Stolen apples = CPU usage; the fact that the person is not stealing apples right now = state of the process.
So it's completely wrong to take one characteristic and try to predict the other from it.
Hope it helps.
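The same distinction can be seen directly in /proc on Linux: the state letter is an instantaneous reading, while the CPU tick counter accumulates over time. A minimal sketch, assuming /proc/<pid>/stat is available and a sleep that accepts fractional seconds; snapshot and the bursty worker loop are made up for illustration:

```shell
#!/bin/sh
# Print "<state> <cpu_ticks>" for a PID, read from /proc/<pid>/stat.
# <state> is what the process is doing right now; <cpu_ticks> is the
# cumulative user + system CPU time (fields 14 and 15 of the stat line).
snapshot() {
    stat="$(cat "/proc/$1/stat" 2>/dev/null)" || return 1
    # Drop "pid (comm) " so that the state letter becomes the first word.
    rest="${stat##*') '}"
    set -- $rest
    echo "$1 $(( ${12} + ${13} ))"
}

# Hypothetical workload: short CPU bursts with sleeps in between.
( while :; do
      i=0
      while [ "$i" -lt 50000 ]; do i=$((i + 1)); done
      sleep 0.2
  done ) &
worker=$!

snapshot "$worker"
sleep 1
snapshot "$worker"

kill "$worker"
wait "$worker" 2>/dev/null || true
```

The tick count in the second line is normally higher than in the first, whichever state letter each snapshot happens to catch.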
There is no correlation between CPU usage as reported by top and process state. The man page defines %CPU as the task's share of the elapsed CPU time since the last screen update, expressed as a percentage of total CPU time.
So, your process indeed used a huge amount of processor time since the last screen update. It is sleeping, yes, but that's because the currently running process is top itself (which makes sense, since it's currently updating the screen).
The top output is perfectly normal.
On Linux, the load average calculation includes processes in uninterruptible sleep (typically waiting on I/O) as well as processes that are actually using or waiting for the CPU. Test it by running one such process and watching the top output to see what happens. It will increase the load average by about 1.
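One way to try this, sketched under the assumption of a Linux system with /proc/loadavg: start a single CPU-bound busy loop for a while and compare the 1-minute load average before and after. The function name load_test and the busy loop are illustrative choices, not a specific recommended command:

```shell
#!/bin/sh
# Run one CPU-bound busy loop for the given number of seconds (default 60)
# and print the 1-minute load average before and after.
# load_test is an illustrative name; the loop is just a stand-in workload.
load_test() {
    secs="${1:-60}"
    read -r before rest < /proc/loadavg
    ( while :; do :; done ) &
    spinner=$!
    sleep "$secs"
    kill "$spinner"
    wait "$spinner" 2>/dev/null || true
    read -r after rest < /proc/loadavg
    echo "1-min load average: before=$before after=$after"
}
```

For example, load_test 120 leaves the loop running for two minutes. The 1-minute figure is an exponentially damped average, so it approaches the new value gradually and a very short run will show little change.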
If you look at this line:
Cpu(s): 8.1%us, 0.1%sy, 0.0%ni, 91.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
the "id" in "91.8%id" means "idle". So the CPUs as a whole aren't actually doing much at all.