I'm using the MinGW GCC compiler on Windows. If I add -pg switch to the compiler I can generate both the EXE and the DLL with profile data.
The gmon.out is generated. But the problem is that when I use
gprof myprogram.exe gmon.out
I get no profile output (other than the table headings and other text).
gprof mydll.dll gmon.out
I get the output for that particular DLL only but not for the main exe.
Maybe both the exe and the dll wanted to generate the same file and the dll won.
The goal is getting stats for functions in both the EXE and DLL in one output.
Can gprof do that? If not, is there any tool that can do this on Windows?
It seems gprof cannot do that. sprof is not available in MinGw. So rolled my own profiler.
I used -finstrument-functions
So two functions __cyg_profile_func_enter
and __cyg_profile_func_exit
will be called before and after each functions respectively.
The actual profiling functions are exported into a DLL and called in these functions, and the DLL is linked to all EXE and DLL in question. So it can profile both of them
The code in the library is like this (removed clutter: asserts, error checking, simplified function calls for clarity).
static void proflib_init()
{
atexit(proflib_finalize);
empty(callstack);
empty(trackingData);
proflibIntialized = 1;
}
static void proflib_finalize()
{
/* Make a log. */
FILE *f = fopen("proflib_log.txt", "wt");
int i;
sortBySelftime(trackingData);
fprintf(f, "%10s%15s%15s%15s\n", "Func name", "Cumulative", "Self time", "Count");
for (i = 0; i < getlength(trackingData); i++)
{
FunctionTimeInfo *fri = trackingData[i];
fprintf(f, "%10p%15"PRIu64"%15"PRIu64"%20d\n", fri->addr, fri->cumulative, fri->selfTime, fri->count);
}
fclose(f);
}
void proflib_func_enter(void *func, void *caller)
{
FunctionTimeInfo elem;
long long pc;
pc = rdtsc(); /* Read timestamp counter from CPU. */
if (!is_prolib_initialized())
{
proflib_init();
}
/* Register self time as control moves to the child function. */
if (!isEmpty(callstack))
{
FunctionTimeInfo *top = gettop(callstack);
top->selfTime += pc - top->selfSample;
}
elem.addr = func; /* Address of function. */
elem.cumulative = pc; /* Time spent in function and functions called by this. (so far store the reference point only.)*/
elem.selfSample = pc; /* Reference point for self time counting. */
elem.count = 1; /* Number of this the function is counted. */
elem.selfTime = 0; /* Time spent in the function not including called functions. */
push(callstack, elem);
}
void proflib_func_exit(void *func, void *caller)
{
FunctionTimeInfo *fti;
FunctionTimeInfo *storedStat;
long long pc;
pc = rdtsc();
fti = gettop(callstack);
fti->cumulative = pc - fti->cumulative; /* Finalize the time. */
fti->selfTime += pc - fti->selfSample;
pop(callstack);
{
FunctionTimeInfo *top = gettop(callstack);
top->selfSample = pc; /* Set new self reference for the parent. */
}
storedStat = find(trackingData, func);
if (storedStat)
{
/* Already have an entry. */
storedStat->cumulative += fti->cumulative;
storedStat->selfTime += fti->selfTime;
storedStat->count++;
}
else
{
/* Add it as new entry. */
add(trackingData, fti);
}
}
And it produces logs like this:
Func name Cumulative Self time Count
691C83B9 1138235861408 1138235861408 1137730
00416396 539018507364 539018507364 16657216
0040A0DC 259288775768 199827541522 1914832
0041067D 876519599063 163253984165 92203200
691C9E0E 785372027387 150744125859 190020
004390F9 3608742795672 149177603708 1
0042E6A4 141485929006 116938396343 37753
00428CB8 456357355541 112610168088 193304
0041C2A4 340078363426 84539535634 114437798
691CB980 402228058455 82958191728 29675
00408A0A 79628811602 77769403982 512220
0040D8CD 93610151071 63396331438 87773597
0040D91A 60276409516 60276409516 175547194
00427C36 72489783460 58130405593 1
691C7C3D 56702394950 56702394950 3455819
691C949F 101350487028 47913486509 2977100
691CBBF3 241451044787 45153581905 29771
0043702E 920148247934 41990658926 25612
...
The function names can be found out from the MAP file. And the function at 0x691C83B9 in the DLL is indeed a suboptimal function with O(n³) complexity and is called lot of times, I must refactor that... I totally forgot that function even exist... The 0x004390F9 is the WinMain.