I would like to measure the number of instructions per cycle (IPC) executed on an ARM Cortex-M4 (or Cortex-M3) processor.
Two values are needed: the number of instructions executed at runtime by the code I want to profile, and the number of cycles that code takes to execute.
1 - Number of Cycles
Using the cycle counter is easy and straightforward.
volatile unsigned int *DWT_CYCCNT ;
volatile unsigned int *DWT_CONTROL ;
volatile unsigned int *SCB_DEMCR ;
void reset_timer(){
DWT_CYCCNT = (volatile unsigned int *)0xE0001004; //address of the register
DWT_CONTROL = (volatile unsigned int *)0xE0001000; //address of the register
SCB_DEMCR = (volatile unsigned int *)0xE000EDFC; //address of the register
*SCB_DEMCR = *SCB_DEMCR | 0x01000000; // set TRCENA (bit 24) to enable the DWT unit
*DWT_CYCCNT = 0; // reset the counter
*DWT_CONTROL = 0;
}
void start_timer(){
*DWT_CONTROL = *DWT_CONTROL | 1 ; // enable the counter
}
void stop_timer(){
*DWT_CONTROL = *DWT_CONTROL & ~1u ; // disable the counter (OR-ing with 0 changes nothing; bit 0 must be cleared)
}
unsigned int getCycles(){
return *DWT_CYCCNT;
}
int main(void){
....
reset_timer(); //reset timer
start_timer(); //start timer
//Code to profile
...
myFunction();
...
stop_timer(); //stop timer
numCycles = getCycles(); //read number of cycles
...
}
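A possible variation on the code above, as a minimal sketch: instead of resetting the counter, take two snapshots of DWT_CYCCNT and subtract them. Unsigned 32-bit subtraction stays correct across a single wraparound of the counter. The helper name `elapsed_cycles` is hypothetical (it is not part of any library), and the snapshots are passed as plain values so the arithmetic can be checked on a host machine.

```c
#include <stdint.h>

/* Hypothetical helper: cycles elapsed between two snapshots of DWT_CYCCNT.
 * Unsigned subtraction is modulo 2^32, so the result is still correct if
 * the counter wrapped around once between the two reads. */
static uint32_t elapsed_cycles(uint32_t start, uint32_t end)
{
    return end - start;
}
```

On the target this would be used as `unsigned int t0 = getCycles(); myFunction(); numCycles = elapsed_cycles(t0, getCycles());`, which avoids clearing DWT_CONTROL between measurements.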
2 - Number of Instructions
Searching online, I found some documentation on how to count the number of instructions executed by the ARM Cortex-M3 and Cortex-M4 (link):
# instructions = CYCCNT - CPICNT - EXCCNT - SLEEPCNT - LSUCNT + FOLDCNT
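To make the formula concrete, here is a minimal sketch of the arithmetic in C. The struct `dwt_totals_t` and both function names are hypothetical. It assumes the six values have already been collected as full-width totals over the profiled region, which the hardware does not provide directly for the event counters.

```c
#include <stdint.h>

/* Hypothetical container for counter totals over the profiled region.
 * Assumption: each field already holds a full-width total, not the raw
 * hardware register value. */
typedef struct {
    uint32_t cyccnt;
    uint32_t cpicnt;
    uint32_t exccnt;
    uint32_t sleepcnt;
    uint32_t lsucnt;
    uint32_t foldcnt;
} dwt_totals_t;

/* instructions = CYCCNT - CPICNT - EXCCNT - SLEEPCNT - LSUCNT + FOLDCNT */
static uint32_t estimate_instructions(const dwt_totals_t *t)
{
    return t->cyccnt - t->cpicnt - t->exccnt
         - t->sleepcnt - t->lsucnt + t->foldcnt;
}

/* Instructions per cycle; returns 0.0 when no cycles were counted. */
static double instructions_per_cycle(const dwt_totals_t *t)
{
    if (t->cyccnt == 0u) {
        return 0.0;
    }
    return (double)estimate_instructions(t) / (double)t->cyccnt;
}
```

For example, totals of 1000 cycles, 200 CPI, 50 exception, 0 sleep, 150 LSU and 10 folded would give 610 instructions, i.e. an IPC of 0.61.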
The registers mentioned there are documented here (from page 11-13), and these are the memory addresses used to access them:
DWT_CYCCNT = 0xE0001004
DWT_CONTROL = 0xE0001000
SCB_DEMCR = 0xE000EDFC
DWT_CPICNT = 0xE0001008
DWT_EXCCNT = 0xE000100C
DWT_SLEEPCNT = 0xE0001010
DWT_LSUCNT = 0xE0001014
DWT_FOLDCNT = 0xE0001018
The DWT_CONTROL register is used to enable the counters, in particular the cycle counter, as documented here.
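As a sketch of what enabling the counters could look like, the block below overlays a struct on the addresses listed above and sets the enable bits in DWT_CONTROL. The struct, macro and function names are hypothetical. The bit positions (CYCCNTENA at bit 0, the five event-counter enables at bits 17 to 21) are my reading of the ARMv7-M DWT_CTRL layout and should be checked against the reference manual for the actual part. The register block is passed as a pointer so the logic can be exercised on a host with a fake struct.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical overlay of the first DWT registers, matching the address
 * list above (base 0xE0001000). */
typedef struct {
    volatile uint32_t CTRL;     /* 0xE0001000: DWT_CONTROL  */
    volatile uint32_t CYCCNT;   /* 0xE0001004: DWT_CYCCNT   */
    volatile uint32_t CPICNT;   /* 0xE0001008: DWT_CPICNT   */
    volatile uint32_t EXCCNT;   /* 0xE000100C: DWT_EXCCNT   */
    volatile uint32_t SLEEPCNT; /* 0xE0001010: DWT_SLEEPCNT */
    volatile uint32_t LSUCNT;   /* 0xE0001014: DWT_LSUCNT   */
    volatile uint32_t FOLDCNT;  /* 0xE0001018: DWT_FOLDCNT  */
} dwt_regs_t;

/* Compile-time check that the layout matches the listed addresses. */
_Static_assert(offsetof(dwt_regs_t, FOLDCNT) == 0x18, "unexpected DWT layout");

#define DWT_HW ((dwt_regs_t *)0xE0001000u) /* use this on the target */

/* Assumed DWT_CTRL bit positions (verify against the reference manual). */
#define DWT_CTRL_CYCCNTENA   (1u << 0)
#define DWT_CTRL_CPIEVTENA   (1u << 17)
#define DWT_CTRL_EXCEVTENA   (1u << 18)
#define DWT_CTRL_SLEEPEVTENA (1u << 19)
#define DWT_CTRL_LSUEVTENA   (1u << 20)
#define DWT_CTRL_FOLDEVTENA  (1u << 21)

/* Reset the cycle counter, then enable it together with the five event
 * counters. TRCENA in SCB_DEMCR must already be set (see reset_timer()). */
static void dwt_enable_all_counters(dwt_regs_t *dwt)
{
    dwt->CYCCNT = 0u;
    dwt->CTRL |= DWT_CTRL_CYCCNTENA | DWT_CTRL_CPIEVTENA
               | DWT_CTRL_EXCEVTENA | DWT_CTRL_SLEEPEVTENA
               | DWT_CTRL_LSUEVTENA | DWT_CTRL_FOLDEVTENA;
}
```

On the target the call would be `dwt_enable_all_counters(DWT_HW);`.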
But when I tried to put it all together to count the number of instructions executed per cycle, I didn't succeed.
Here is a small guide on how to use them from gdb.
The difficulty is that some of the registers are only 8 bits wide (DWT_CPICNT, DWT_EXCCNT, DWT_SLEEPCNT, DWT_LSUCNT, DWT_FOLDCNT), and when they overflow they trigger an event. I didn't find a way to capture that event: I found no code snippets that explain how to do it, and no interrupt routines suitable for it.
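One software-only workaround I can think of, sketched below under a strong assumption: the code polls each 8-bit counter often enough that it never advances by 256 or more between two reads. In that case the wrapped 8-bit difference between consecutive readings can be accumulated into a wider total. The type `counter8_ext_t` and the function `counter8_update` are hypothetical names, and the polling itself costs cycles and so perturbs the measurement.

```c
#include <stdint.h>

/* Hypothetical software extension of one 8-bit event counter. */
typedef struct {
    uint8_t  last;  /* previous raw 8-bit reading              */
    uint32_t total; /* accumulated count across all wraparounds */
} counter8_ext_t;

/* Feed in a new raw reading (the low 8 bits of e.g. DWT_LSUCNT).
 * The 8-bit subtraction wraps modulo 256, so the delta is correct as long
 * as fewer than 256 events occurred since the previous call. */
static void counter8_update(counter8_ext_t *c, uint8_t raw)
{
    c->total += (uint8_t)(raw - c->last);
    c->last = raw;
}
```

For instance, readings of 200 and then 10 are interpreted as 200 events followed by 66 more (a wrap past 255), for a total of 266.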
Moreover, watchpoints set from gdb on the addresses of those registers don't seem to work: gdb does not stop when the registers change value. E.g. on DWT_LSUCNT:
(gdb) watch *0xE0001014
Update: I found this project on GitHub explaining how to use the DWT, ITM and ETM units, but I haven't checked whether it works. I will post updates.
Any ideas on how to use these counters?
Thank you!