Timer to find elapsed time in a function call in C

Posted 2019-01-18 08:13

I want to calculate time elapsed during a function call in C, to the precision of 1 nanosecond.

Is there a timer function available in C to do it?

If yes please provide a sample code-snippet.

Pseudo code

Timer.Start()
foo();
Timer.Stop()
Display time elapsed in execution of foo()

Environment details: gcc 3.4 on a RHEL machine.

11 answers
戒情不戒烟
Reply 2 · 2019-01-18 08:26

Use clock_gettime(3). For more info, type man 3 clock_gettime. That being said, nanosecond precision is rarely necessary.
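As a minimal sketch of what that might look like, assuming CLOCK_MONOTONIC is available on the target system: the helper names elapsed_ns and time_call_ns are made up for illustration and are not part of any library. On older glibc versions, such as those that typically accompany gcc 3.4, linking probably needs -lrt.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L  /* exposes clock_gettime and CLOCK_MONOTONIC */
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: nanoseconds from start to end. */
static int64_t elapsed_ns(struct timespec start, struct timespec end)
{
    return (int64_t)(end.tv_sec - start.tv_sec) * 1000000000LL
         + (int64_t)(end.tv_nsec - start.tv_nsec);
}

/* Hypothetical helper: runs fn once and returns its wall-clock
   duration in nanoseconds, or -1 if the clock could not be read. */
static int64_t time_call_ns(void (*fn)(void))
{
    struct timespec start, end;

    assert(fn != NULL);
    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1;
    fn();
    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1;
    return elapsed_ns(start, end);
}
```

Calling time_call_ns(foo) would then give the elapsed time of foo() in nanoseconds. The value includes the cost of the two clock reads, so very short functions are better timed over many iterations and averaged.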

冷血范
Reply 3 · 2019-01-18 08:29

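You can use standard system calls like gettimeofday, if you are certain that your process gets 100% of the CPU time. Note that gettimeofday measures wall-clock time with microsecond resolution, so the result includes any time during which other threads and processes were running. I can think of many situations in which, while you are executing foo(), other threads and processes might steal CPU time.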
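A rough sketch of the gettimeofday approach could look like the following. The helper names elapsed_us and time_call_us are invented for this example, and the result is only as fine as the microsecond field of struct timeval allows.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical helper: microseconds from start to end. */
static long long elapsed_us(struct timeval start, struct timeval end)
{
    return (long long)(end.tv_sec - start.tv_sec) * 1000000LL
         + (long long)(end.tv_usec - start.tv_usec);
}

/* Hypothetical helper: runs fn once and returns the wall-clock
   time it took, in microseconds. */
static long long time_call_us(void (*fn)(void))
{
    struct timeval start, end;

    assert(fn != NULL);
    gettimeofday(&start, NULL);
    fn();
    gettimeofday(&end, NULL);
    return elapsed_us(start, end);
}
```

Because gettimeofday follows the system's wall clock, an adjustment to that clock during the measurement (for example by NTP) can distort the result or even make it negative.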

太酷不给撩
Reply 4 · 2019-01-18 08:30

On Intel and compatible processors you can use the rdtsc instruction, which can easily be wrapped in an asm() block in C code. It returns the value of a built-in processor cycle counter that increments on each cycle. You gain high resolution, and reading the counter is extremely fast.

To find out how fast the counter increments you'll need to calibrate: read it at the start and at the end of a fixed time period, such as five seconds, and divide the difference by the elapsed time. If you do this on a processor that shifts frequency to lower power consumption, you may have problems calibrating.
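The calibration step might be sketched as below, assuming an x86 target, a GCC-style inline assembler, and CLOCK_MONOTONIC as the reference clock. The names read_tsc, ticks_per_ns and calibrate_tsc are hypothetical, chosen only for this example.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L  /* exposes clock_gettime and nanosleep */
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Counter ticks per nanosecond, given two deltas over the same interval. */
static double ticks_per_ns(uint64_t tick_delta, uint64_t ns_delta)
{
    assert(ns_delta > 0);
    return (double)tick_delta / (double)ns_delta;
}

#if defined(__i386__) || defined(__x86_64__)
/* Reads the time stamp counter (x86 only). */
static uint64_t read_tsc(void)
{
    uint32_t lo, hi;
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
}

/* Sleeps for `seconds`, then returns the observed ticks per nanosecond. */
static double calibrate_tsc(unsigned seconds)
{
    struct timespec t0, t1, nap;
    uint64_t c0, c1;
    int64_t ns;

    nap.tv_sec = (time_t)seconds;
    nap.tv_nsec = 0;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    c0 = read_tsc();
    nanosleep(&nap, NULL);
    c1 = read_tsc();
    clock_gettime(CLOCK_MONOTONIC, &t1);
    ns = (int64_t)(t1.tv_sec - t0.tv_sec) * 1000000000LL
       + (int64_t)(t1.tv_nsec - t0.tv_nsec);
    return ticks_per_ns(c1 - c0, (uint64_t)ns);
}
#endif
```

A tick delta measured around foo() can then be divided by the value from calibrate_tsc(5) to estimate nanoseconds. The sketch uses the actually elapsed clock time rather than the requested sleep length, so an early wake-up from nanosleep does not skew the ratio.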

【Aperson】
Reply 5 · 2019-01-18 08:31

May I ask what kind of processor you're using? If you're using an x86 processor, you can look at the time stamp counter (tsc). This code snippet:

#define rdtsc(low,high) \
     __asm__ __volatile__("rdtsc" : "=a" (low), "=d" (high))

will put the number of cycles the CPU has run in low and high respectively (it expects 2 longs; you can store the result in a long long int) as follows:

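static inline void getcycles (long long int * cycles)
{
  /* Both halves are unsigned so the shift below never operates on a negative value. */
  unsigned long low;
  unsigned long high;
  rdtsc(low,high);
  *cycles = high;   /* upper 32 bits of the counter */
  *cycles <<= 32;
  *cycles |= low;   /* lower 32 bits of the counter */
}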

Note that this returns the number of cycles your CPU has performed. You'll need to get your CPU speed and then figure out how many cycles per ns in order to get the number of ns elapsed.

To do the above, I've parsed the "cpu MHz" string out of /proc/cpuinfo, and converted it to a decimal. After that, it's just a bit of math, and remember that 1MHz = 1,000,000 cycles per second, and that there are 1 billion ns / sec.
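The parsing and the arithmetic could be sketched roughly as follows. The function names read_cpu_mhz and cycles_to_ns are hypothetical, and the sketch assumes the Linux /proc/cpuinfo layout with a "cpu MHz" line. Since 1 MHz is 0.001 cycles per nanosecond, the elapsed nanoseconds are cycles * 1000 / MHz.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Reads the first "cpu MHz" value from /proc/cpuinfo.
   Returns 0.0 if the file or the field is not available. */
static double read_cpu_mhz(void)
{
    char line[256];
    double mhz = 0.0;
    FILE *f = fopen("/proc/cpuinfo", "r");

    if (f == NULL)
        return 0.0;
    while (fgets(line, sizeof line, f) != NULL) {
        /* Whitespace in the format matches the tabs before the colon. */
        if (sscanf(line, "cpu MHz : %lf", &mhz) == 1)
            break;
    }
    fclose(f);
    return mhz;
}

/* Converts a cycle count to nanoseconds for a clock running at mhz. */
static double cycles_to_ns(uint64_t cycles, double mhz)
{
    assert(mhz > 0.0);
    return (double)cycles * 1000.0 / mhz;
}
```

For example, 2400 cycles at 2400 MHz correspond to 1000 ns. One caveat: on systems with frequency scaling the "cpu MHz" value reflects the current core frequency, which may differ from the rate at which the time stamp counter ticks, so the conversion is only an estimate.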

Animai°情兽
Reply 6 · 2019-01-18 08:34

Benchmarking at this scale is not a good idea. At the very least, reading the clock has its own overhead, which can make the results unreliable when you are working in nanoseconds. You can use either your platform's system calls or, on a larger scale, boost::Date_Time [preferred].

走好不送
Reply 7 · 2019-01-18 08:34

We all waste our time recreating this test sample. Why not post something compile-ready? Anyway, here is mine, with results.

CLOCK_PROCESS_CPUTIME_ID resolution: 0 sec 1 nano
clock_gettime 4194304 iterations : 459.427311 msec 0.110 microsec / call
CLOCK_MONOTONIC resolution: 0 sec 1 nano
clock_gettime 4194304 iterations : 64.498347 msec 0.015 microsec / call
CLOCK_REALTIME resolution: 0 sec 1 nano
clock_gettime 4194304 iterations : 65.494828 msec 0.016 microsec / call
CLOCK_THREAD_CPUTIME_ID resolution: 0 sec 1 nano
clock_gettime 4194304 iterations : 427.133157 msec 0.102 microsec / call
rdtsc 4194304 iterations : 115.427895 msec 0.028 microsec / call
Dummy 16110479703957395943
rdtsc in milliseconds 4194304 iterations : 197.259866 msec 0.047 microsec / call
Dummy 4.84682e+08 UltraHRTimerMs 197 HRTimerMs 197.26

#include <time.h>
#include <cstdint>
#include <cstdio>
#include <string>
#include <iostream>
#include <chrono>
#include <thread>

enum { TESTRUNS = 1024*1024*4 };

class HRCounter
{
private:
    timespec start, tmp;
public:
    HRCounter(bool init = true)
    {
        if(init)
            SetStart();
    }

    void SetStart()
    {
        clock_gettime(CLOCK_MONOTONIC, &start);
    }

    double GetElapsedMs()
    {
        clock_gettime(CLOCK_MONOTONIC, &tmp);
        return (double)(tmp.tv_nsec - start.tv_nsec) / 1000000 + (tmp.tv_sec - start.tv_sec) * 1000;
    }
};

__inline__ uint64_t rdtsc(void) {
    uint32_t lo, hi;
    __asm__ __volatile__ (      // serialize
    "xorl %%eax,%%eax \n        cpuid"
    ::: "%rax", "%rbx", "%rcx", "%rdx");
    /* We cannot use "=A", since this would use %rax on x86_64 and return only the lower 32bits of the TSC */
    __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
    return (uint64_t)hi << 32 | lo;
}

inline uint64_t GetCyclesPerMillisecondImpl()
{
    uint64_t start_cyles = rdtsc();
    HRCounter counter;
    std::this_thread::sleep_for (std::chrono::seconds(3));
    uint64_t end_cyles = rdtsc();
    double elapsed_ms = counter.GetElapsedMs();
    return (end_cyles - start_cyles) / elapsed_ms;
}

inline uint64_t GetCyclesPerMillisecond()
{
    static uint64_t cycles_in_millisecond = GetCyclesPerMillisecondImpl();
    return cycles_in_millisecond;
}

class UltraHRCounter
{
private:
    uint64_t start_cyles;
public:
    UltraHRCounter(bool init = true)
    {
        GetCyclesPerMillisecond();
        if(init)
            SetStart();
    }

    void SetStart() { start_cyles = rdtsc(); }

    double GetElapsedMs()
    {
        uint64_t end_cyles = rdtsc();
        return (end_cyles - start_cyles) / GetCyclesPerMillisecond();
    }
};

int main()
{
    auto Run = [](std::string const& clock_name, clockid_t clock_id)
    {
        HRCounter counter(false);
        timespec spec;
        clock_getres( clock_id, &spec );
        printf("%s resolution: %ld sec %ld nano\n", clock_name.c_str(), spec.tv_sec, spec.tv_nsec );
        counter.SetStart();
        for ( int i = 0 ; i < TESTRUNS ; ++ i )
        {
            clock_gettime( clock_id, &spec );
        }
        double fb = counter.GetElapsedMs();
        printf( "clock_gettime %d iterations : %.6f msec %.3f microsec / call\n", TESTRUNS, ( fb ), (( fb ) * 1000) / TESTRUNS );
    };

    Run("CLOCK_PROCESS_CPUTIME_ID",CLOCK_PROCESS_CPUTIME_ID);
    Run("CLOCK_MONOTONIC",CLOCK_MONOTONIC);
    Run("CLOCK_REALTIME",CLOCK_REALTIME);
    Run("CLOCK_THREAD_CPUTIME_ID",CLOCK_THREAD_CPUTIME_ID);

    {
        HRCounter counter(false);
        uint64_t dummy = 0;
        counter.SetStart();
        for ( int i = 0 ; i < TESTRUNS ; ++ i )
        {
            dummy += rdtsc();
        }
        double fb = counter.GetElapsedMs();
        printf( "rdtsc %d iterations : %.6f msec %.3f microsec / call\n", TESTRUNS, ( fb ), (( fb ) * 1000) / TESTRUNS );
        std::cout << "Dummy " << dummy << std::endl;
    }

    {
        double dummy = 0;
        UltraHRCounter ultra_hr_counter;
        HRCounter counter;
        for ( int i = 0 ; i < TESTRUNS ; ++ i )
        {
            dummy += ultra_hr_counter.GetElapsedMs();
        }
        double fb = counter.GetElapsedMs();
        double final = ultra_hr_counter.GetElapsedMs();
        printf( "rdtsc in milliseconds %d iterations : %.6f msec %.3f microsec / call\n", TESTRUNS, ( fb ), (( fb ) * 1000) / TESTRUNS );
        std::cout << "Dummy " << dummy << " UltraHRTimerMs " << final << " HRTimerMs " << fb << std::endl;
    }



    return 0;
}