Tuning with adjtimex as an alternative to using ntpd

Author: L.S.Lowe. File: adjtimex. This update: 20120710. Part of Guide to the Local System.

Summary

Like many installations, we have PC desktops, laptops, file servers and web servers which, left to themselves, would gain or lose up to around 5 seconds per day. This would be bad news, particularly for file servers, where some commands like make rely on reasonable consistency of file dates. The usual reaction to this situation on a Linux server is to configure and enable the ntpd service, so that the ntpd daemon sits in the background, communicating from time to time with its time servers (maybe 3 times per hour, and more frequently at first).

That's fine, but running the ntpd daemon is sometimes inconvenient, and arguably unnecessary. It's sometimes inconvenient, perhaps because of firewalls or lack of network, or you may simply prefer to minimise the number of daemons and active ports on your system. It's arguably unnecessary, because most PCs, laptops and servers come with a perfectly serviceable system clock, which at worst is going to have a small systematic drift.

For a non-ntpd scenario, the systematic drift can be corrected by running a suitable adjtimex command at boot time. Then, just like on Windows, you can synchronize with an external time server at boot time and thereafter perhaps once a week, or at most once a day. That occasional synchronisation can be done by an ntpdate command run as a cron job (for example, create the file /etc/cron.daily/ntpdate containing the line ntpdate -s -b 0.fedora.pool.ntp.org and make it executable).
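
The cron job described above could be put in place along these lines. This is only a sketch: the function name install_ntpdate_cron and its optional directory argument are invented for illustration (the argument exists so the function can be tried out on a scratch directory), and the shebang and comment lines in the generated file are additions to the one-line file described in the text.

```shell
# install_ntpdate_cron [DIR]: write the daily ntpdate job into DIR
# (default /etc/cron.daily) and make it executable.
# The function name and the DIR argument are illustrative only.
install_ntpdate_cron() {
    dir=${1:-/etc/cron.daily}
    cat > "$dir/ntpdate" <<'EOF'
#!/bin/sh
# -s: log via syslog rather than standard output
# -b: step the clock rather than slew it
ntpdate -s -b 0.fedora.pool.ntp.org
EOF
    chmod +x "$dir/ntpdate"
}

# Typical use, as root:
#   install_ntpdate_cron
```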

The parameters for the boot-time adjtimex do need to be established per system. If ntpd has previously been run for a while, then /var/lib/ntp/drift contains the required systematic drift correction, in parts per million. Alternatively, once the daily ntpdate synchronisation above has been in place for a few days, the offsets it logs in /var/log/messages give the required drift correction in seconds per day. From either figure you can calculate the adjtimex parameters required. The calculator below can be used to suggest an example adjtimex command line to add to, say, your /etc/rc.d/rc.local file.

adjtimex calculator
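
The arithmetic behind such a calculation can be sketched in shell. This is not the calculator from this page, only an illustration of the same idea. The function names are invented, and it relies on the scaling given in the adjtimex man page for USER_HZ=100: a tick value of 10000 is nominal, one tick unit is about 100 ppm, and a frequency value of 65536 is about 1 ppm. It also assumes a positive figure means the clock needs speeding up: that is the sense of the ntpd drift file, and of an ntpdate offset on the assumption that a positive offset means the local clock is behind.

```shell
# secs_per_day_to_ppm S: convert a drift of S seconds per day to parts per million.
secs_per_day_to_ppm() {
    awk -v s="$1" 'BEGIN { printf "%.3f\n", s * 1000000 / 86400 }'
}

# adjtimex_args PPM: print adjtimex options for a correction of PPM parts
# per million. Assumes USER_HZ=100, so the nominal tick is 10000.
adjtimex_args() {
    awk -v ppm="$1" 'BEGIN {
        p = ppm + 0
        # One tick unit is about 100 ppm: use the nearest whole number of units.
        dt = int(p / 100 + (p >= 0 ? 0.5 : -0.5))
        # The remainder goes into the frequency offset, where 65536 is 1 ppm.
        f = (p - 100 * dt) * 65536
        printf "--tick %d --frequency %d\n", 10000 + dt, f + (f >= 0 ? 0.5 : -0.5)
    }'
}

# Possible use in /etc/rc.d/rc.local, taking the figure from the ntpd drift file:
#   adjtimex $(adjtimex_args "$(cat /var/lib/ntp/drift)")
```

For example, adjtimex_args 8.5 prints --tick 10000 --frequency 557056. Folding whole multiples of 100 ppm into the tick value keeps the frequency part small, within about 50 ppm either way.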

ntpd start-up behaviour and occasional instability

For computers that do run ntpd, I've noticed that if the hardware clock (hwclock) is a bit out of adjustment, ntpd can make a significant step change to the time an appreciable time after boot, whereas it would be much better for users if the adjustment were made at start-up, before they log on. For example:

Oct 28 11:32:42 myservr ntpd[4906]: ntpd 4.2.2p1@1.1570-o Tue Dec  8 20:20:38 UTC 2009 (1)
...
Oct 28 11:54:13 myservr ntpd[4907]: time reset -1.059769 s

More seriously, on another occasion a reboot took place some time after the time zone had changed for daylight saving; the hwclock was using local time rather than UTC, and had not been re-synchronised. ntpd then stepped the system clock by the full hour, some 25 minutes after start-up (the second timestamp below was logged after the step, so it reads an hour later):

May  5 09:28:36 myservr ntpd[4959]: ntpd 4.2.2p1@1.1570-o Tue Dec  8 20:20:38 UTC 2009 (1)
...
May  5 10:53:27 myservr ntpd[4960]: time reset +3599.997926 s

A solution to this is to do an ntpdate synchronisation before the ntpd service starts; indeed, at least one distro (Fedora) has an ntpdate service available as well as ntpd.

I've also observed that ntpd can get into difficulties soon after start-up, causing it to step the time in quite large, apparently non-convergent steps. For example:

Jul  9 10:25:17 myserv ntpd: ntpd startup succeeded
Jul  9 10:45:49 myserv ntpd[3422]: time reset -0.494032 s
Jul  9 11:18:10 myserv ntpd[3422]: time reset +0.616802 s
Jul  9 11:35:28 myserv ntpd[3422]: time reset +0.213453 s
Jul  9 12:03:33 myserv ntpd[3422]: time reset -0.341312 s

It may be that the origin of the startup problem is not the fault of ntpd, even if the subsequent non-convergent steps are: see Issues section below.
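
To judge whether such steps are converging, it can help to pull them out of the system log and total them. The following is a rough sketch that assumes the "time reset" message format shown in the excerpts above; the function name ntpd_resets is invented for illustration.

```shell
# ntpd_resets: read syslog lines on standard input, print each ntpd step
# with its timestamp, then a summary line with the number of steps, their
# net sum and the largest absolute step.
ntpd_resets() {
    awk '/ntpd\[[0-9]+\]: time reset / {
        off = $(NF - 1) + 0            # the offset is the field before the final "s"
        printf "%s %s %s %+.6f\n", $1, $2, $3, off
        n++
        net += off
        a = (off < 0 ? -off : off)
        if (a > max) max = a
    }
    END { printf "steps=%d net=%+.6f max=%.6f\n", n, net, max }'
}

# Typical use:
#   ntpd_resets < /var/log/messages
```

For the four resets quoted above, the net change is only about -0.005 s, although the individual steps are up to 0.6 s: the clock is being pushed back and forth rather than settling.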

Another apparent problem was seen, fortuitously, while checking ntpd accuracy, when an intermittent network problem happened to occur at our site boundary. The word accuracy is probably a bit strong, and consistency might be better: I was running ntpd against a set of three on-site ntpd servers, and also using ntpdate with the -q option in a cron task to log the time offset against the same three on-site servers! So it's not clear which of the client, the servers, or indeed their off-site servers, all running ntpd, is accurate. It's interesting, though, that a network problem could cause the oscillation in the time difference between the client clock and the server clocks seen in the graph.

ntpd oscillations after a network problem
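
Offset logging of the kind described above might be set up along the following lines. This is a sketch only, not the actual cron task used for the graph: the function name, the log file and the server names in the usage comment are all hypothetical. The -q option makes ntpdate query the servers without setting the clock, so it can be run alongside ntpd.

```shell
# log_ntp_offsets LOGFILE SERVER...: append a timestamp and the output of
# "ntpdate -q" (query only, the clock is not set) to LOGFILE.
# The function name and arguments are illustrative only.
log_ntp_offsets() {
    logfile=$1
    shift
    {
        date '+%Y-%m-%d %H:%M:%S'
        ntpdate -q "$@"
    } >> "$logfile" 2>&1
}

# Possible use from a cron task (hypothetical file and server names):
#   log_ntp_offsets /var/log/ntp-offset.log ntp1.example ntp2.example ntp3.example
```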

Issues with tsc clocksource

The above method has worked well on some local systems for several years, but ran into problems on a Fedora 12 system: the calibration of the systematic drift was incorrect after a reboot.

I did some raw time-correction tests on a Fedora 12 system: ntpd and adjtimex were not used at all, just a raw time step adjustment using ntpdate every 10 minutes. The PC was a Dell workstation T3400 with one Q9550 quad-core processor, with kernel 2.6.32.16-150.fc12.x86_64 (the earlier kernel 2.6.32.14-127.fc12.x86_64 was no different). A cron task was set up to run ntpdate -s -b ntpserver every 10 minutes, so that the step offset needed to correct the system clock was recorded in the system log. A further cron task rebooted the machine every hour, at 5 minutes to the hour.
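
The two cron tasks just described could be written as entries in the /etc/cron.d format, for instance as below. This is a reconstruction for illustration, not a copy of the original setup: the function name, the PATH line, the choice of shutdown -r for the reboot and the target file name in the usage comment are all assumptions.

```shell
# print_clock_test_cron: print /etc/cron.d-style entries for the raw test.
# The function name is illustrative; ntpserver stands for the local time server.
print_clock_test_cron() {
    cat <<'EOF'
PATH=/sbin:/bin:/usr/sbin:/usr/bin
# Step the clock every 10 minutes; -s sends the step offset to syslog.
*/10 * * * * root ntpdate -s -b ntpserver
# Reboot every hour, at 5 minutes to the hour.
55 * * * * root shutdown -r now
EOF
}

# Possible use, as root (hypothetical file name):
#   print_clock_test_cron > /etc/cron.d/clocktest
```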

With the default clocksource of tsc (time stamp counter), this gave the following anomalous results, for a 24 hour period: the 10 minute corrections were different in each hour, whereas only the initial correction after a reboot should show variability:

Reliability of TSC clock on PC dt127

With the kernel parameter clocksource=hpet (high precision event timer), the same raw test gave the following results, which are much more sensible. The corrections are around 10x smaller and are consistent from one reboot to the next, so the HPET timer works much better, whether in the ntpd or the adjtimex scenario. Note that the time-correction scale is 10x finer on this graph:

Reliability of HPET clock on PC dt127

These results for the TSC clock are probably consistent with the comments on an NTP Known Issues page.

Other system clocks

In /sys/devices/system/clocksource/clocksource0/available_clocksource on this type of PC, it says the following are available: tsc hpet acpi_pm. However, when clocksource=acpi_pm was used as a kernel parameter, it was rejected and an error message was logged:

Override clocksource acpi_pm is not HRT compatible

Setting it at run time instead, with echo acpi_pm > /sys/devices/system/clocksource/clocksource0/current_clocksource, was accepted without an error (and that value did get set). In view of the error message when the kernel parameter is used, however, I didn't pursue this one.
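
Inspecting and switching the clock source through these sysfs files can be wrapped up as below. This is a minimal sketch: the function names and the clocksource_dir variable are invented for illustration, and the check against the available list is an added precaution rather than something the kernel interface requires.

```shell
# Directory holding the clocksource files; a variable so that the functions
# can be tried against a scratch directory.
clocksource_dir=/sys/devices/system/clocksource/clocksource0

# show_clocksource: print the current and the available clock sources.
show_clocksource() {
    printf 'current:   %s\n' "$(cat "$clocksource_dir/current_clocksource")"
    printf 'available: %s\n' "$(cat "$clocksource_dir/available_clocksource")"
}

# set_clocksource NAME: switch at run time (as root), but only to a source
# that is listed as available.
set_clocksource() {
    for cs in $(cat "$clocksource_dir/available_clocksource"); do
        if [ "$cs" = "$1" ]; then
            echo "$1" > "$clocksource_dir/current_clocksource"
            return 0
        fi
    done
    echo "clocksource $1 is not listed as available" >&2
    return 1
}

# Typical use:
#   show_clocksource
#   set_clocksource hpet
```

A change made this way lasts only until the next reboot, whereas the clocksource= kernel parameter applies at every boot.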

The tsc clock is not listed as available on all systems (a Dell D420 laptop running Fedora 12, for example). In that case the hpet clock may be used instead, so the TSC problem, and the consequent instability seen when using NTP after a reboot, probably does not arise.
