How can I read performance counters from the kernel

Posted 2019-07-17 05:54

Question:

I have been using the Linux perf tool from user space. I want to write code that reads the performance counters for a thread every time it is context-switched.

The steps required are:

1) Get a mechanism to read the performance counter registers.

2) Call the mechanism from step (1) in the scheduler after every context switch.

I am stuck at step (1): I could not figure out which functions to call to read the performance counter registers, or how to describe an event when doing so. I tried going through the docs and also the question "How do I use performance counters inside of the kernel?".

Answer 1:

You can actually do this entirely with perf by using tracepoint events and group leader sampling.

sched:sched_switch is a tracepoint event that triggers on every context switch. Putting that event into a group with other events, with group leader sampling enabled, lets you read the non-leader counters whenever the leader takes a sample. The syntax looks like this:

perf record -e "{sched:sched_switch,cycles,instructions}:S" -a

This records the cycles and instructions counter values on every CPU whenever there is a context switch. You can inspect the output with perf script, which also lets you process it with Python programs.
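As a rough illustration of post-processing that output in Python, the sketch below parses text as it might come from `perf script -F cpu,time,period,event` and collects the counters that belong to each context switch. The line layout matched by the regular expression, the sample lines, and the helper names `parse_samples` and `group_by_switch` are assumptions made for this example, not something taken from the perf documentation; check them against the output of your own perf version. Whether the number printed for a group member is a cumulative count or a delta since the previous sample also depends on the perf version, so the sketch does not interpret it.

```python
import re
from collections import defaultdict

# Assumed line layout: "[cpu] seconds.micros: period event:"
LINE_RE = re.compile(
    r"\[(?P<cpu>\d+)\]\s+(?P<time>\d+\.\d+):\s+(?P<period>\d+)\s+(?P<event>\S+):"
)

# Made-up sample lines, for illustration only (not real measurements).
SAMPLE_OUTPUT = """\
[000]  1000.000100:          1 sched:sched_switch:
[000]  1000.000100:     250000 cycles:
[000]  1000.000100:     180000 instructions:
[001]  1000.000250:          1 sched:sched_switch:
[001]  1000.000250:     900000 cycles:
[001]  1000.000250:     400000 instructions:
"""


def parse_samples(lines):
    """Yield (cpu, time, event, value) for every line that matches LINE_RE."""
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip lines that do not follow the assumed layout
        yield (
            int(match["cpu"]),
            float(match["time"]),
            match["event"],
            int(match["period"]),
        )


def group_by_switch(samples):
    """Collect the values of all group members per (cpu, timestamp) pair.

    Assumes that all members of one group sample are printed with the
    same CPU and timestamp as the leader.
    """
    groups = defaultdict(dict)
    for cpu, time, event, value in samples:
        groups[(cpu, time)][event] = value
    return dict(groups)


if __name__ == "__main__":
    grouped = group_by_switch(parse_samples(SAMPLE_OUTPUT.splitlines()))
    for (cpu, time), counters in sorted(grouped.items()):
        print(cpu, time, counters)
```

To run this on real data, the lines of `SAMPLE_OUTPUT` could be replaced by the lines read from the perf script output, for example via `sys.stdin`.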

If you want to do the monitoring in your own program, you can use perf_event_open with PERF_FORMAT_GROUP set in the attribute's read_format and PERF_SAMPLE_READ set in its sample_type.
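With PERF_FORMAT_GROUP, the counter values come back as one block of 64-bit words: the number of events, optionally the enabled and running times, and then one entry per group member. The same block is embedded in each sample when PERF_SAMPLE_READ is set. The sketch below only decodes such a block; it does not open any counters. The layout follows the `read_format` description in the perf_event_open man page as I understand it, the function name `decode_group_read` is hypothetical, and the sketch assumes that PERF_FORMAT_LOST is not set and that the buffer is in native byte order.

```python
import struct

# read_format bits as defined in <linux/perf_event.h>
PERF_FORMAT_TOTAL_TIME_ENABLED = 1 << 0
PERF_FORMAT_TOTAL_TIME_RUNNING = 1 << 1
PERF_FORMAT_ID = 1 << 2
PERF_FORMAT_GROUP = 1 << 3


def decode_group_read(buf, read_format):
    """Decode a PERF_FORMAT_GROUP read block into a dict.

    Assumed layout (all fields u64):
        nr, [time_enabled], [time_running], then nr times: value, [id]
    The bracketed fields are present only if the matching bit is set.
    """
    if not read_format & PERF_FORMAT_GROUP:
        raise ValueError("this sketch only handles PERF_FORMAT_GROUP blocks")

    offset = 0

    def next_u64():
        nonlocal offset
        (word,) = struct.unpack_from("=Q", buf, offset)
        offset += 8
        return word

    result = {"nr": next_u64(), "time_enabled": None,
              "time_running": None, "values": []}
    if read_format & PERF_FORMAT_TOTAL_TIME_ENABLED:
        result["time_enabled"] = next_u64()
    if read_format & PERF_FORMAT_TOTAL_TIME_RUNNING:
        result["time_running"] = next_u64()
    for _ in range(result["nr"]):
        value = next_u64()
        event_id = next_u64() if read_format & PERF_FORMAT_ID else None
        result["values"].append((value, event_id))
    return result


if __name__ == "__main__":
    # Hand-built block: two events with made-up values and ids.
    fake = struct.pack("=5Q", 2, 100, 7, 200, 8)
    print(decode_group_read(fake, PERF_FORMAT_GROUP | PERF_FORMAT_ID))
```

Setting PERF_FORMAT_ID as well is useful in practice, because the ids let you tell which value belongs to which event without relying on the order in which the group members were opened.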

The perf tools and their underlying perf_event_open interface are very powerful, but the documentation can sometimes be lacking. If you need even more flexibility, you can use BPF and bcc.