I'm trying to compare GPU to CPU performance. For the NVIDIA GPU I've been using the cudaEvent_t
types to get very precise timing.
For the CPU I've been using the following code:
// Timers
clock_t start, stop;
float elapsedTime = 0;
// Capture the start time
start = clock();
// Do something here
.......
// Capture the stop time
stop = clock();
// Retrieve time elapsed in milliseconds
elapsedTime = (float)(stop - start) / (float)CLOCKS_PER_SEC * 1000.0f;
Apparently, that piece of code is only good if you're counting in seconds. Also, the results sometimes come out quite strange.
Does anyone know of some way to create a high resolution timer in Linux?
Are you interested in wall time (how much time actually elapses) or cycle count (how many cycles)? In the first case, you should use something like gettimeofday.
The highest resolution timer uses the RDTSC x86 assembly instruction. However, this measures clock ticks, so you should be sure that power saving mode is disabled.
The wiki page for TSC gives a few examples: http://en.wikipedia.org/wiki/Time_Stamp_Counter
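For reference, a minimal sketch of both approaches; the gettimeofday call and the timeval fields are standard POSIX, __rdtsc() is the GCC/Clang intrinsic from x86intrin.h, and the "code being measured" comments stand in for whatever you want to time:

#include <sys/time.h>     // gettimeofday
#include <x86intrin.h>    // __rdtsc (GCC/Clang, x86 only)
#include <stdio.h>

int main(void) {
    // Wall time via gettimeofday (microsecond resolution).
    struct timeval start, stop;
    gettimeofday(&start, NULL);
    // ... code being measured ...
    gettimeofday(&stop, NULL);
    double elapsedMs = (stop.tv_sec - start.tv_sec) * 1000.0
                     + (stop.tv_usec - start.tv_usec) / 1000.0;
    printf("Wall time: %f ms\n", elapsedMs);

    // Cycle count via the TSC (ticks, not seconds; affected by frequency scaling).
    unsigned long long c0 = __rdtsc();
    // ... code being measured ...
    unsigned long long c1 = __rdtsc();
    printf("Cycles (approx): %llu\n", c1 - c0);
    return 0;
}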
There is also CLOCK_REALTIME_HR, but I'm not sure whether it makes any difference.
To summarise information presented so far, these are the two functions required for typical applications.
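A minimal sketch of such a pair, assuming clock_gettime with CLOCK_PROCESS_CPUTIME_ID (nanosecond-resolution, per-process CPU time); the names timer_start and timer_end are illustrative:

#include <time.h>

// Call this to start a nanosecond-resolution timer.
struct timespec timer_start(void) {
    struct timespec start_time;
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &start_time);
    return start_time;
}

// Call this to end a timer, returning nanoseconds elapsed as a long.
long timer_end(struct timespec start_time) {
    struct timespec end_time;
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &end_time);
    return (end_time.tv_sec - start_time.tv_sec) * 1000000000L
         + (end_time.tv_nsec - start_time.tv_nsec);
}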
Here is an example of how to use them in timing how long it takes to calculate the variance of a list of inputs.
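A sketch of that usage with the functions above; var(), input, and MAXLEN are hypothetical placeholders for the computation being timed, and printf needs <stdio.h>:

struct timespec vartime = timer_start();   // begin the timer
double variance = var(input, MAXLEN);      // placeholder for the task being timed
long time_elapsed_nanos = timer_end(vartime);
printf("Variance = %f, Time taken (nanoseconds): %ld\n",
       variance, time_elapsed_nanos);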
After reading this thread I started testing the code for clock_gettime against C++11's chrono, and they don't seem to match.
There is a huge gap between them!
The std::chrono::seconds(1) seems to be equivalent to ~30,000 of the clock_gettime output:
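(The poster's actual output is not reproduced here.) A minimal sketch of one way to put the two clocks side by side, assuming CLOCK_MONOTONIC and std::chrono::steady_clock time the same one-second sleep and both results are converted to milliseconds:

#include <time.h>
#include <unistd.h>
#include <chrono>
#include <cstdio>

int main() {
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    auto c0 = std::chrono::steady_clock::now();

    sleep(1);   // something that should take roughly one second

    clock_gettime(CLOCK_MONOTONIC, &t1);
    auto c1 = std::chrono::steady_clock::now();

    double posix_ms  = (t1.tv_sec - t0.tv_sec) * 1000.0
                     + (t1.tv_nsec - t0.tv_nsec) / 1.0e6;
    double chrono_ms = std::chrono::duration<double, std::milli>(c1 - c0).count();

    // Converted to the same unit, the two measurements should agree closely.
    std::printf("clock_gettime: %f ms, std::chrono: %f ms\n", posix_ms, chrono_ms);
    return 0;
}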
Check out clock_gettime(2), which is a POSIX interface to high-resolution timers.
If, having read the manpage, you're left wondering about the difference between CLOCK_REALTIME and CLOCK_MONOTONIC, see Difference between CLOCK_REALTIME and CLOCK_MONOTONIC?
See the following page for a complete example: http://www.guyrutenberg.com/2007/09/22/profiling-code-using-clock_gettime/
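A minimal sketch along those lines, assuming CLOCK_MONOTONIC (which is not affected by wall-clock adjustments) and a manual timespec subtraction; older glibc versions may need linking with -lrt:

#include <time.h>
#include <stdio.h>

int main(void) {
    struct timespec start, stop;

    clock_gettime(CLOCK_MONOTONIC, &start);   // monotonic clock: immune to NTP/date changes
    // ... code being measured ...
    clock_gettime(CLOCK_MONOTONIC, &stop);

    // Elapsed time in milliseconds, built from the seconds and nanoseconds fields.
    double elapsedMs = (stop.tv_sec - start.tv_sec) * 1000.0
                     + (stop.tv_nsec - start.tv_nsec) / 1.0e6;
    printf("Elapsed: %f ms\n", elapsedMs);
    return 0;
}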