Clock drift on Windows

2019-02-02 15:51发布

问题:

I've developed a Windows service which tracks business events. It uses the Windows clock to timestamp events. However, the underlying clock can drift quite dramatically (e.g. losing a few seconds per minute), particularly when the CPUs are working hard. Our servers use the Windows Time Service to stay in sync with domain controllers, which uses NTP under the hood, but the sync frequency is controlled by domain policy, and in any case even syncing every minute would still allow significant drift. Are there any techniques we can use to keep the clock more stable, other than using hardware clocks?

回答1:

Clock ticks should be predictable, but on most PC hardware - because they're not designed for real-time systems - other I/O device interrupts have priority over the clock tick interrupt, and some drivers do extensive processing in the interrupt service routine rather than defer it to a deferred procedure call (DPC), which means the system may not be able to serve the clock tick interrupt until (sometimes) long after it was signalled.

Other factors include bus-mastering I/O controllers which steal many memory bus cycles from the CPU, causing it to be starved of memory bus bandwidth for significant periods.

As others have said, the clock-generation hardware may also vary its frequency as component values change with temperature.

Windows does allow the amount of ticks added to the real-time clock on every interrupt to be adjusted: see SetSystemTimeAdjustment. This would only work if you had a predictable clock skew, however. If the clock is only slightly off, the SNTP client ("Windows Time" service) will adjust this skew to make the clock tick slightly faster or slower to trend towards the correct time.



回答2:

I don't know if this applies, but ...

There's an issue with Windows that if you change the timer resolution with timeBeginPeriod() a lot, the clock will drift.

Actually, there is a bug in Java's Thread wait() (and the os::sleep()) function's Windows implementation that causes this behaviour. It always sets the timer resolution to 1 ms before wait in order to be accurate (regardless of sleep length), and restores it immediately upon completion, unless any other threads are still sleeping. This set/reset will then confuse the Windows clock, which expects the windows time quantum to be fairly constant.

Sun has actually known about this since 2006, and hasn't fixed it, AFAICT!

We actually had the clock going twice as fast because of this! A simple Java program that sleeps 1 millisec in a loop shows this behaviour.

The solution is to set the time resolution yourself, to something low, and keep it there as long as possible. Use timeBeginPeriod() to control that. (We set it to 1 ms without any adverse effects.)

For those coding in Java, the easier way to fix this is by creating a thread that sleeps as long as the app lives.

Note that this will fix this issue on the machine globally, regardless of which application is the actual culprit.



回答3:

You could run "w32tm /resync" in a scheduled task .bat file. This works on Windows Server 2003.



回答4:

Other than resynching the clock more frequently, I don't think there is much you can do, other than to get a new motherboard, as your clock signal doesn't seem to be at the right frequency.



回答5:

http://www.codinghorror.com/blog/2007/01/keeping-time-on-the-pc.html

PC clocks should typically be accurate to within a few seconds per day. If you're experiencing massive clock drift-- on the order of minutes per day-- the first thing to check is your source of AC power. I've personally observed systems with a UPS plugged into another UPS (this is a no-no, by the way) that gained minutes per day. Removing the unnecessary UPS from the chain fixed the time problem. I am no hardware engineer, but I'm guessing that some timing signal in the power is used by the real-time clock chip on the motherboard.



回答6:

As already mentioned, Java programs can cause this issue.

Another solution that does not require code modification is adding the VM argument -XX:+ForceTimeHighResolution (found on the NTP support page).

9.2.3. Windows and Sun's Java Virtual Machine

Sun's Java Virtual Machine needs to be started with the >-XX:+ForceTimeHighResolution parameter to avoid losing interrupts.

See http://www.macromedia.com/support/coldfusion/ts/documents/createuuid_clock_speed.htm for more information.

From the referenced link (via the Wayback machine - original link is gone):

ColdFusion MX: CreateUUID Increases the Windows System Clock Speed

Calling the createUUID function multiple times under load in Macromedia ColdFusion MX and higher can cause the Windows system clock to accelerate. This is an issue with the Java Virtual Machine (JVM) in which Thread.sleep calls less than 10 milliseconds (ms) causes the Windows system clock to run faster. This behavior was originally filed as Sun Java Bug 4500388 (developer.java.sun.com/developer/bugParade/bugs/4500388.html) and has been confirmed for the 1.3.x and 1.4.x JVMs.

In ColdFusion MX, the createUUID function has an internal Thread.sleep call of 1 millisecond. When createUUID is heavily utilized, the Windows system clock will gain several seconds per minute. The rate of acceleration is proportional to the number of createUUID calls and the load on the ColdFusion MX server. Macromedia has observed this behavior in ColdFusion MX and higher on Windows XP, 2000, and 2003 systems.



回答7:

Increase the frequency of the re-sync. If the syncs are with your own main server on your own network there's no reason not to sync every minute.



回答8:

Sync more often. Look at the Registry entries for the W32Time service, especially "Period". "SpecialSkew" sounds like it would help you.



回答9:

Clock drift may be a consequence of the temperature; maybe you could try to get temperature more constant - using better cooling perhaps? You're never going to loose drift totally, though.

Using an external clock (GPS receiver etc...), and a statistical method to relate CPU time to Absolute Time is what we use here to synch events in distributed systems.



回答10:

Since it sounds like you have a big business:

Take an old laptop or something which isn't good for much, but seems to have a more or less reliable clock, and call it the Timekeeper. The Timekeeper's only job is to, once every (say) 2 minutes, send a message to the servers telling the time. Instead of using the Windows clock for their timestamps, the servers will put down the time from the Timekeeper's last signal, plus the elapsed time since the signal. Check the Timekeeper's clock by your wristwatch once or twice a week. This should suffice.



回答11:

What servers are you running? In desktops the times I've come across this are with Spread Spectrum FSB enabled, causes some issues with the interrupt timing which is what makes that clock tick. May want to see if this is an option in BIOS on one of those servers and turn it off if enabled.

Another option you have is to edit the time polling interval and make it much shorter using the following registry key, most likely you'll have to add it (note this is a DWORD value and the value is in seconds, e.g. 600 for 10min):

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\TimeProviders\NtpClient\SpecialPollInterval

Here's a full workup on it: KB816042



回答12:

I once wrote a Delphi class to handle time resynchs. It is pasted below. Now that I see the "w32tm" command mentioned by Larry Silverman, I suspect I wasted my time.

unit TimeHandler;

interface

type
  TTimeHandler = class
  private
    FServerName : widestring;
  public
    constructor Create(servername : widestring);
    function RemoteSystemTime : TDateTime;
    procedure SetLocalSystemTime(settotime : TDateTime);
  end;

implementation

uses
  Windows, SysUtils, Messages;

function NetRemoteTOD(ServerName :PWideChar; var buffer :pointer) : integer; stdcall; external 'netapi32.dll';
function NetApiBufferFree(buffer : Pointer) : integer; stdcall; external 'netapi32.dll';

type
  //See MSDN documentation on the TIME_OF_DAY_INFO structure.
  PTime_Of_Day_Info = ^TTime_Of_Day_Info;
  TTime_Of_Day_Info = record
    ElapsedDate : integer;
    Milliseconds : integer;
    Hours : integer;
    Minutes : integer;
    Seconds : integer;
    HundredthsOfSeconds : integer;
    TimeZone : LongInt;
    TimeInterval : integer;
    Day : integer;
    Month : integer;
    Year : integer;
    DayOfWeek : integer;
  end;

constructor TTimeHandler.Create(servername: widestring);
begin
  inherited Create;
  FServerName := servername;
end;

function TTimeHandler.RemoteSystemTime: TDateTime;
var
  Buffer : pointer;
  Rek : PTime_Of_Day_Info;
  DateOnly, TimeOnly : TDateTime;
  timezone : integer;
begin
  //if the call is successful...
  if 0 = NetRemoteTOD(PWideChar(FServerName),Buffer) then begin
    //store the time of day info in our special buffer structure
    Rek := PTime_Of_Day_Info(Buffer);

    //windows time is in GMT, so we adjust for our current time zone
    if Rek.TimeZone <> -1 then
      timezone := Rek.TimeZone div 60
    else
      timezone := 0;

    //decode the date from integers into TDateTimes
    //assume zero milliseconds
    try
      DateOnly := EncodeDate(Rek.Year,Rek.Month,Rek.Day);
      TimeOnly := EncodeTime(Rek.Hours,Rek.Minutes,Rek.Seconds,0);
    except on e : exception do
      raise Exception.Create(
                             'Date retrieved from server, but it was invalid!' +
                             #13#10 +
                             e.Message
                            );
    end;

    //translate the time into a TDateTime
    //apply any time zone adjustment and return the result
    Result := DateOnly + TimeOnly - (timezone / 24);
  end  //if call was successful
  else begin
    raise Exception.Create('Time retrieval failed from "'+FServerName+'"');
  end;

  //free the data structure we created
  NetApiBufferFree(Buffer);
end;

procedure TTimeHandler.SetLocalSystemTime(settotime: TDateTime);
var
  SystemTime : TSystemTime;
begin
  DateTimeToSystemTime(settotime,SystemTime);
  SetLocalTime(SystemTime);
  //tell windows that the time changed
  PostMessage(HWND_BROADCAST,WM_TIMECHANGE,0,0);
end;

end.


回答13:

I believe Windows Time Service only implements SNTP, which is a simplified version of NTP. A full NTP implementation takes into account the stability of your clock in deciding how often to sync.

You can get the full NTP server for Windows here.



标签: windows clock