I have been using the Linux perf tool from user space. I want to write code that reads the performance counters for a thread on every context switch.
The steps required are:
1) Get a mechanism to read the performance counter registers.
2) Call step(1) from the scheduler after every context switch.
I am stuck at step(1): I could not figure out which functions to call to read the performance registers, or how to describe an event when doing so.
I tried going through the docs and also this question: "How do I use performance counters inside of the kernel?"
You can actually do this entirely with perf by using tracepoint events and group leader sampling.
sched:sched_switch is a tracepoint event that triggers on every context switch. Putting that event into a group with other events, with group leader sampling enabled, lets you read the non-leader counters whenever a leader sample happens. The syntax looks like this:
perf record -e "{sched:sched_switch,cycles,instructions}:S" -a
This will record the cycles and instructions values on every CPU whenever there is a context switch. You can check the output with perf script, which also allows you to process it with Python programs.
If you want to monitor in your own program, you can use perf_event_open with PERF_FORMAT_GROUP and PERF_SAMPLE_READ.
The perf tools and their underlying perf_event_open interface are very powerful, but the documentation can sometimes be lacking. If you need even more flexibility, you can use BPF and bcc.