I'm coding a little program that has to sort a large array (up to 4 million text strings). Seems like I'm doing quite well at it, since a combination of radixsort and mergesort already cut the original q(uick)sort execution time in less than half.
Execution time being the main point, since this is what I'm using to benchmark my piece of code.
My question is:
Is there a better (i. e. more reliable) way of benchmarking a program than just time the execution? It kinda works, but the same program (with the same background processes running) usually has slightly different execution times if run twice.
This kinda defeats the purpose of detecting small improvements. And several small improvements could add up to a big one...
Thanks in advance for any input!
Results:
I managed to get gprof to work under Windows (using gcc and MinGW). gcc behaves poorly (considering execution time) compared to my normal compiler (tcc), but it gave me quite some insight.
Try a profiling tool, that will also show you where the program is spending its time. gprof
is the classic C profiling tool, at least on Unix.
Look at the time command. It tracks both the CPU time a process uses and the wall-clock time. You can also use something like gprof for profiling your code to find the parts of your program that are actually taking the most time. You could do a lower-tech version of profiling with timers in your code. Boost has a nice timer class, but it's easy to roll your own.
I don't think it's sufficient to just measure how long a piece of code takes to execute. Your environment is a constantly changing thing, so you have to take a statistical approach to measuring execution time.
Essentially you need to take N
measurements, discard outliers, and calculate your average, median and standard deviation running time, with an uncertainty measurement.
Here's a good blog explaining why and how to do this (with code): http://blogs.perl.org/users/steffen_mueller/2010/09/your-benchmarks-suck.html
What do you use for timing execution time so far? There's C89 clock()
in time.h
for starters. On unixoid systems you might find getitimer()
for ITIMER_VIRTUAL
to measure process CPU time. See the respective manual pages for details.
You can also use a POSIX shell's times
utility to benchmark the processor time used by a process and its children. The resolution is system dependent, like just anything about profiling. Try to wrap your C code in a loop, executing it as many times as necessary to reduce the "jitter" in the time the benchmarking reports.
Call your routine from a test harness, whereby it executes N + 1 times. Ignore the timing for the first iteration and then take the average of iterations 1..N. The reason for ignoring the first time is that is is often slightly inflated due to various effects, e.g. virtual memory, code being paged in, etc. The reason for averaging N iterations is that you get rid of artefacts caused by other processes, the scheduler, etc.
If you're running on Linux or similar You might also want to use taskset
to pin your code to a specific CPU core (assuming it's single-threaded), ideally not core 0, since this tends to handle all interrupts.