When I run gprof on my C program it says no time accumulated for my program and shows 0 time for all function calls. However it does count the function calls.
How do I modify my program so that gprof will be able to count how much time something takes to run?
One gotcha with gprof: it doesn't work with code in dynamically-linked libraries. For that, you need to use sprof. See this answer: gprof : How to generate call graph for functions in shared library that is linked to main program

First compile your application with -g, and check which CPU counters you are using. If your application runs very quickly, gprof could simply miss all events, or collect fewer than required (reduce the number of events to read).

Actually, profiling should show you CPU_CLK_UNHALTED or INST_RETIRED events without any special switches. But with such data you'll only be able to say how well your code is performing: INST_RETIRED/CPU_CLK_UNHALTED.

Try the Intel VTune profiler - it's free for 30 days and for education.
I tried running Kinopiko's example, except I increased the number of iterations by a factor of 100.
test.c:
Then I took 10 stackshots (under VC, but you can use pstack). Here are the stackshots:
In case it isn't obvious, this tells you that:
In a nutshell, the program spends ~100% of its time flushing the output buffer to disk (or console) as part of the printf on line 7.
(What I mean by "Cost of a line" is - it is the fraction of total time spent at the request of that line, and that's roughly the fraction of samples that contain it. If that line could be made to take no time, such as by removing it, skipping over it, or passing its work off to an infinitely fast coprocessor, that time fraction is how much the total time would shrink. So if the execution of any of these lines of code could be avoided, time would shrink by somewhere in the range of 95% to 100%. If you were to ask "What about recursion?", the answer is It Makes No Difference.)
Now, maybe you want to know something else, like how much time is spent in the loop, for example. To find that out, remove the printf because it's hogging all the time. Maybe you want to know what % of time is spent purely in CPU time, not in system calls. To get that, just throw away any stackshots that don't end in your code.
The point I'm trying to make is if you're looking for things you can fix to make the code run faster, the data gprof gives you, even if you understand it, is almost useless. By comparison, if there is some of your code that is causing more wall-clock time to be spent than you would like, stackshots will pinpoint it.
Did you specify -pg when compiling?
http://sourceware.org/binutils/docs-2.20/gprof/Compiling.html#Compiling
Once it is compiled, you run the program and then run gprof on the binary.
E.g.:
test.c:
Compile as cc -pg test.c, then run as a.out, then gprof a.out, gives me

What are you getting?