Profiling for wall-time on Linux

2020-07-06 05:15发布

I have an application that I want to profile wrt how much time is spent in various activities. Since this application is I/O intensive, I want to get a report that will summarize how much time is spent in every library/system call (wall time).

I've tried oprofile, but it seems it gives time in terms of Unhalted CPU cycles (thats cputime, not real time)

I've tried strace -T, which gives wall time, but the data generated is huge and getting the summary report is difficult (and awk/py scripts exist for this ?)

Now I'm looking upto SystemTap, but I don't find any script that is close enough and can be modified, and the onsite tutorial didn't help much either. I am not sure if what I am looking for can be done.

I need someone to point me in the right direction. Thanks a lot!

3条回答
Rolldiameter
2楼-- · 2020-07-06 05:53

See the fntimes.stp systemtap sample script. https://sourceware.org/systemtap/examples/index.html#profiling/fntimes.stp

The fntimes.stp script monitors the execution time history of a given function family (assumed non-recursive). Each time (beyond a warmup interval) is then compared to the historical maximum. If it exceeds a certain threshold (250%), a message is printed.

# stap fntimes.stp 'kernel.function("sys_*")'

or

# stap fntimes.stp 'process("/path/to/your/binary").function("*")'

The last line of that .stp script demonstrates the way to track time consumed in a given family of functions

probe $1.return { elapsed = gettimeofday_us()-@entry(gettimeofday_us()) }
查看更多
趁早两清
3楼-- · 2020-07-06 05:58

Judging from this commit, the recently released strace 4.9 supports this with:

strace -w -c

They call it "syscall latency" (and it's hard to see from the manpage alone that's what -w does).

查看更多
放荡不羁爱自由
4楼-- · 2020-07-06 06:02

Are you doing this just out of measurement curiosity, or because you want to find time-drains that you can fix to make it run faster?

If your goal is to make it run as fast as possible, then try random-pausing. It doesn't measure anything, except very roughly. It may be counter-intuitive, but what it does is pinpoint the code that will result in the greatest speed-up.

查看更多
登录 后发表回答