Suppose all the cores in my CPU have the same frequency. Technically I could synchronize a (system time, time stamp counter) pair for each core every millisecond or so. Then, based on the core I am currently running on, I can take the current rdtsc value, divide the tick delta by the core frequency, and estimate the time that has passed since I last synchronized the (system time, time stamp counter) pair, deducing the current system time without the overhead of a system call from my current thread (assuming no locks are needed to retrieve the above data).
This works great in theory, but in practice I found that sometimes I get more ticks than I would expect. That is, if my core frequency is 1 GHz and I took the (system time, time stamp counter) pair 1 millisecond ago, I would expect a tick delta of around 10^6 ticks, but in fact it can be anywhere between 10^6 and 10^7.
I'm not sure what is wrong. Can anyone share their thoughts on how to calculate system time using rdtsc? My main objective is to avoid performing a system call every time I want to know the system time, and to be able to do a calculation in user space that gives me a good estimation of it (currently I define a good estimation as a result within a 10 microsecond interval of the real system time).
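Roughly, the scheme I have in mind looks like this (a simplified sketch; the calibration array, its refresh thread, and the 1 GHz tick rate are illustrative assumptions, and real code would need a seqlock or atomics to read each pair consistently):

```cpp
#include <x86intrin.h>   // __rdtscp
#include <cstdint>

struct CoreCalibration {
    uint64_t tsc;     // TSC value sampled at the last sync point
    uint64_t sys_ns;  // system time (ns) sampled at the same moment
};

// One entry per CPU, refreshed every millisecond or so by a calibration
// thread (not shown here).
static CoreCalibration g_calib[1024];

// Illustrative assumption: a 1 GHz invariant TSC, i.e. 1 tick per nanosecond.
static constexpr double kTicksPerNs = 1.0;

// Estimate the current system time without a system call.
uint64_t estimated_system_time_ns() {
    unsigned aux = 0;
    uint64_t tsc = __rdtscp(&aux);   // aux receives IA32_TSC_AUX; on Linux the
    unsigned cpu = aux & 0xFFF;      // CPU number is in its low 12 bits
    const CoreCalibration& c = g_calib[cpu];
    uint64_t delta_ticks = tsc - c.tsc;
    return c.sys_ns + static_cast<uint64_t>(delta_ticks / kTicksPerNs);
}
```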
Don't do that, i.e. don't use the RDTSC machine instruction directly yourself (because your OS scheduler could reschedule other threads or processes at arbitrary moments, or slow down the clock). Use a function provided by your library or OS.

On Linux, read time(7), then use clock_gettime(2), which is really quick (and does not involve any slow system call) thanks to vdso(7).
On a C++11-compliant implementation, simply use the standard <chrono> header. And standard C has clock(3) (giving microsecond precision). On Linux, both use good enough time measurement functions (so, indirectly, the vDSO).
Last time I measured clock_gettime, it often took less than 4 nanoseconds per call.
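For instance, a minimal sketch comparing the two (on Linux, both std::chrono::system_clock and clock_gettime(CLOCK_REALTIME) resolve through the vDSO, so neither should trap into the kernel):

```cpp
#include <chrono>
#include <time.h>
#include <cstdio>

int main() {
    // C++11: portable, implemented on top of clock_gettime on Linux
    auto t1 = std::chrono::system_clock::now();

    // POSIX: goes through the vDSO on Linux, no kernel transition needed
    timespec ts;
    clock_gettime(CLOCK_REALTIME, &ts);

    std::printf("chrono:        %lld ns since epoch\n",
                (long long)std::chrono::duration_cast<std::chrono::nanoseconds>(
                    t1.time_since_epoch()).count());
    std::printf("clock_gettime: %lld.%09ld s\n", (long long)ts.tv_sec, ts.tv_nsec);
    return 0;
}
```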
The idea is not unsound, but it is not suited for user-mode applications, for which, as @Basile suggested, there are better alternatives.

Intel itself suggests using the TSC as a wall clock: on processors with an invariant TSC, the OS may use it for wall-clock timer services. However, care must be taken.
The TSC is not always invariant
In older processors the TSC is incremented on every internal clock cycle, so it was not a wall clock. If you only have a variant TSC, the measurements are unreliable for tracking time. There is hope with the invariant TSC, though, which Intel documents as running at a constant rate.
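Whether you have an invariant TSC can be checked with CPUID leaf 80000007H, EDX bit 8; a minimal sketch using the GCC/Clang <cpuid.h> helper:

```cpp
#include <cpuid.h>
#include <cstdio>

// Returns true if the processor reports an invariant TSC
// (CPUID.80000007H:EDX[8], per the Intel SDM).
bool has_invariant_tsc() {
    unsigned eax, ebx, ecx, edx;
    if (!__get_cpuid(0x80000007, &eax, &ebx, &ecx, &edx))
        return false;                 // extended leaf not supported
    return (edx >> 8) & 1;
}

int main() {
    std::printf("invariant TSC: %s\n", has_invariant_tsc() ? "yes" : "no");
    return 0;
}
```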
The TSC is not incremented at the frequency advertised in the brand string

Again, per Intel, you can't simply take the frequency written on the box of the processor. See below.
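If a rough tick rate is enough (rather than the exact ART-derived value described below), a simple calibration against clock_gettime works in practice; a minimal sketch, assuming x86-64 Linux and GCC/Clang intrinsics:

```cpp
#include <x86intrin.h>
#include <time.h>
#include <cstdint>
#include <cstdio>

static uint64_t now_ns() {
    timespec ts;
    clock_gettime(CLOCK_MONOTONIC_RAW, &ts);   // not slewed by NTP
    return uint64_t(ts.tv_sec) * 1000000000ull + ts.tv_nsec;
}

// Estimate TSC ticks per nanosecond by sampling both clocks ~100 ms apart.
double calibrate_tsc_ticks_per_ns() {
    uint64_t t0 = now_ns(), c0 = __rdtsc();
    timespec req{0, 100 * 1000 * 1000};        // sleep for 100 ms
    nanosleep(&req, nullptr);
    uint64_t t1 = now_ns(), c1 = __rdtsc();
    return double(c1 - c0) / double(t1 - t0);
}

int main() {
    // ticks per nanosecond is numerically the frequency in GHz
    std::printf("approx. TSC frequency: %.3f GHz\n", calibrate_tsc_ticks_per_ns());
    return 0;
}
```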
rdtsc is not serialising

You need to serialise it from above and below.
See this.
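A common pattern is to fence the read on both sides; a minimal sketch, assuming an x86-64 compiler with <x86intrin.h> and a CPU on which lfence orders rdtsc (modern Intel, and AMD with the kernel-enabled dispatch-serialising lfence):

```cpp
#include <x86intrin.h>
#include <cstdint>

// Read the TSC without letting earlier or later instructions
// be reordered across the read.
inline uint64_t rdtsc_fenced() {
    _mm_lfence();                 // wait for earlier instructions to retire locally
    uint64_t t = __rdtsc();
    _mm_lfence();                 // keep later instructions from starting early
    return t;
}
```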
The TSC is based on the ART (Always Running Timer) when invariant
The correct formula is

TSC_Value = (ART_Value * CPUID.15H:EBX[31:0]) / CPUID.15H:EAX[31:0] + K

See section 17.15.4 of the Intel manual 3.

Of course, you have to solve for ART_Value, since you start from a TSC_Value. You can ignore K as you are interested in deltas only. From the ART_Value delta you can get the time elapsed once you know the frequency of the ART. This is given as k * B, where k is a constant in the MSR MSR_PLATFORM_INFO and B is 100 MHz or 133+1/3 MHz depending on the processor. As @BeeOnRope pointed out, from Skylake onward the ART crystal frequency is no longer the bus frequency.
The actual values, maintained by Intel, can be found in the turbostat.c file.
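For completeness, the TSC/ART ratio (and on newer CPUs the crystal frequency itself) is enumerated by CPUID leaf 15H; a minimal sketch using the GCC/Clang <cpuid.h> helper (EAX is the denominator, EBX the numerator, ECX the nominal crystal frequency in Hz, or 0 if not enumerated):

```cpp
#include <cpuid.h>
#include <cstdio>

int main() {
    unsigned denom, numer, crystal_hz, edx;
    if (!__get_cpuid(0x15, &denom, &numer, &crystal_hz, &edx) || denom == 0) {
        std::printf("CPUID.15H not supported\n");
        return 1;
    }
    // TSC_Value = ART_Value * numer / denom + K
    std::printf("TSC/ART ratio: %u/%u\n", numer, denom);
    if (crystal_hz)
        std::printf("ART crystal: %u Hz\n", crystal_hz);
    else
        std::printf("crystal frequency not enumerated; see turbostat.c tables\n");
    return 0;
}
```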
The TSC is not incremented when the processor enters a deep sleep

This should not be a problem on single-socket machines, but the Linux kernel has some comments about the TSC being reset even in non-deep sleep states.
The context switches will poison the measurements
There is nothing you can do about it. This is what actually prevents you from keeping time with the TSC.