CPU contention (wait time) for a process in Linux

Posted 2019-08-04 16:14

Question:

How can I check how long a process spends waiting for the CPU in a Linux box?

For example, in a loaded system I want to check how long a SQL*Loader (sqlldr) process waits.

It would be useful if there is a command line tool to do this.

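Answer 1:

I've quickly slapped this together. It reads the clock twice in a row in a loop and prints each new smallest and largest gap it sees. The large gaps are the "interferences" from task switching, where the process was switched out between the two reads...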

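// Build as C++ (the code uses `true` and the bare type name `timeval`).
#include <sys/time.h>
#include <stdio.h>

// Current wall-clock time in seconds, with microsecond resolution.
double seconds()
{
    timeval t;
    gettimeofday(&t, NULL);
    return t.tv_sec + t.tv_usec / 1000000.0;
}

int main()
{
    double min = 999999999, max = 0;
    while (true)  // runs until interrupted (Ctrl-C)
    {
        // Two back-to-back clock reads in a fixed order; the gap between
        // them grows when the process is preempted in between.
        double before = seconds();
        double after = seconds();
        double c = after - before;
        if (c < min)
        {
            // New smallest gap: the cost of an undisturbed clock read.
            min = c;
            printf("%f\n", c);
            fflush(stdout);
        }
        if (c > max)
        {
            // New largest gap: the longest interruption seen so far.
            max = c;
            printf("%f\n", c);
            fflush(stdout);
        }
    }

    return 0;
}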


Answer 2:

Here's one way to go about measuring it. Start more processes than the machine has hardware threads (processors * cores * threads per core) and have them all wait (block) on an event that wakes them up at the same time. One such event is a multicast network packet. Use an instrumentation library like PAPI (or one better suited to your needs) to measure the differences in real and virtual "wakeup" time between your processes. Repeating the experiment several times gives an estimate of the CPU contention time for your processes. Obviously, it's not going to be at all accurate for multicore processors, but maybe it'll help you.
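The idea of waking many waiters at once and comparing their wake-up times can be tried on a small scale without a network or PAPI. The following is only a rough sketch under different assumptions from the answer: it uses threads in one process instead of separate processes, a `std::condition_variable` broadcast instead of a multicast packet, and `std::chrono::steady_clock` instead of PAPI. The function name `wakeup_latencies` is made up for illustration. Each measured latency also includes the time a thread spends re-acquiring the shared mutex after waking, so the numbers are at best an indication of contention.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: block `n` threads on one event, wake them together,
// and return each thread's wake-up latency in seconds (time from the
// broadcast until the thread is running again and has released the mutex).
std::vector<double> wakeup_latencies(unsigned n)
{
    assert(n > 0);
    using clock = std::chrono::steady_clock;

    std::mutex m;
    std::condition_variable ready_cv;  // workers -> main: "I am parked"
    std::condition_variable go_cv;     // main -> workers: "wake up now"
    bool go = false;
    unsigned ready = 0;
    clock::time_point fired;
    std::vector<double> latency(n, 0.0);
    std::vector<std::thread> workers;

    for (unsigned i = 0; i < n; ++i) {
        workers.emplace_back([&, i] {
            std::unique_lock<std::mutex> lock(m);
            ++ready;
            ready_cv.notify_one();
            // wait() releases the mutex atomically while the thread blocks.
            go_cv.wait(lock, [&] { return go; });
            lock.unlock();
            // `fired` was written under the mutex before `go` was set.
            latency[i] =
                std::chrono::duration<double>(clock::now() - fired).count();
        });
    }

    {
        // Wait until every worker is blocked, then timestamp the event.
        std::unique_lock<std::mutex> lock(m);
        ready_cv.wait(lock, [&] { return ready == n; });
        fired = clock::now();
        go = true;
    }
    go_cv.notify_all();  // the single event that wakes every waiter

    for (auto &t : workers)
        t.join();
    return latency;
}
```

To oversubscribe the machine as the answer suggests, `n` could be set to something like twice `std::thread::hardware_concurrency()`. The spread between the smallest and largest returned value, collected over several runs, is the quantity of interest. Older toolchains may need `-pthread` when building.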

Cheers.



Answer 3:

I had this problem some time back and ended up using getrusage. Detailed documentation is at: http://www.opengroup.org/onlinepubs/009695399/functions/getrusage.html

getrusage populates the rusage struct.


Measuring Wait Time with getrusage

You can call getrusage at the beginning of your code and again at the end, or at some appropriate point during execution. You then have initial_rusage and final_rusage. The user time spent by your process is in rusage->ru_utime and the system time is in rusage->ru_stime. Both are struct timeval values, so the tv_sec (seconds) and tv_usec (microseconds) fields must be combined; using tv_sec alone discards the fractional part.

Thus the total user time spent by the process will be: user_time = (final_rusage.ru_utime.tv_sec - initial_rusage.ru_utime.tv_sec) + (final_rusage.ru_utime.tv_usec - initial_rusage.ru_utime.tv_usec) / 1000000.0

The total system time spent by the process will be: system_time = (final_rusage.ru_stime.tv_sec - initial_rusage.ru_stime.tv_sec) + (final_rusage.ru_stime.tv_usec - initial_rusage.ru_stime.tv_usec) / 1000000.0

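If total_time is the wall-clock time elapsed between the two calls of getrusage (measured separately, for example with gettimeofday), then the wait time will be wait_time = total_time - (user_time + system_time). This counts all the time the process was not running, including time blocked on I/O or sleeping, so it is an upper bound on the time spent waiting for a CPU.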
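The calculation can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the names `Usage`, `snapshot`, `to_seconds`, `wall_seconds` and `wait_time` are made up for this example, and the use of `clock_gettime(CLOCK_MONOTONIC, ...)` for total_time is an assumption, since the answer does not say which clock to use. It converts both tv_sec and tv_usec so that sub-second CPU time is not lost.

```cpp
#include <cassert>
#include <sys/resource.h>
#include <sys/time.h>
#include <time.h>

// Convert a timeval (seconds + microseconds) to fractional seconds.
double to_seconds(const timeval &tv)
{
    return tv.tv_sec + tv.tv_usec / 1e6;
}

// Wall-clock time from a monotonic clock, in seconds (assumed clock choice).
double wall_seconds()
{
    timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

// Hypothetical snapshot of wall, user and system time, all in seconds.
struct Usage
{
    double wall;
    double user;
    double sys;
};

// Take one measurement for the calling process.
Usage snapshot()
{
    struct rusage ru;
    int rc = getrusage(RUSAGE_SELF, &ru);
    assert(rc == 0);  // getrusage only fails on invalid arguments
    (void)rc;
    return { wall_seconds(), to_seconds(ru.ru_utime), to_seconds(ru.ru_stime) };
}

// wait_time = total_time - (user_time + system_time)
double wait_time(const Usage &start, const Usage &end)
{
    double total = end.wall - start.wall;
    double cpu = (end.user - start.user) + (end.sys - start.sys);
    return total - cpu;
}
```

A caller would take `Usage start = snapshot();` before the work, `Usage end = snapshot();` after it, and print `wait_time(start, end)`. With RUSAGE_SELF the CPU times are summed over all threads of the process, so in a multithreaded program the result can be negative. The sketch is most meaningful for a single-threaded, CPU-bound section.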

Hope this helps