I think you should mostly not run NTP daemons on your machines

November 16, 2017

In my entry on switching from ntpd to chrony, I mentioned that we don't have many machines that run full time NTP daemons. In reaction, Sotiris Tsimbonis asked in his comment:

You mean you don't have many machines that run full time NTP daemons and service others as a time source, right?

How do you keep time synchronized in your systems if not by running ntpd? an ntpdate cronjob?

This brings up a heretical position of mine.

I'm a professed time maven. Not only do I run NTP daemons on my workstations, but I tinker with their configuration and server lists and enjoy checking in on their NTP synchronization status (it's fun in various ways, honest). Despite all of my enthusiasm for NTP and good time, I think that you should not run NTP daemons on your servers, especially in anything resembling a common default configuration, unless you have special needs and know what you're doing. Instead you should have almost all of your machines set their time from a trusted upstream source on boot and every so often afterward (once an hour is often convenient). This is what we do, and not just because it's easier.

In most situations, the most important thing for server time is that all of your servers are pretty close to each other. It is better that they all be wrong together than some of them be right and others be wrong, and if a server is out of sync you want it to be corrected right away rather than be slowly guided back to correct time. And you want this to happen reliably, without needing monitoring and remediation.

(If you think you're going to monitor and remediate time issues across your server fleet, ask yourself what you'll do if you detect an out of sync server. If the answer is 'reset its time', then you might as well automate that.)

An NTP daemon is usually not the best way to achieve this. NTP daemons are normally biased toward being cautious about trusting upstream time sources and prefer to change the system clock slowly, without abrupt jumps; this famously leads to various problems if your system winds up with its clock significantly out (some NTP daemons have historically given up entirely in that case). Even once you've configured your NTP daemon to not have these problems, you still need to worry about what happens if the daemon dies or stops doing anything.

(The normal biases of NTP daemons make sense in an environment where you're talking to a random collection of time sources outside of your control, some of which may be broken or even vaguely malicious.)
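To put a rough number on 'slowly', here is a back-of-the-envelope sketch. It assumes a daemon that only slews the clock and is limited to 500 parts per million, which is the traditional ntpd maximum slew rate; other daemons and configurations differ, and the function name is just something I made up for illustration.

```python
def slew_seconds(offset_seconds, max_slew_ppm=500.0):
    """Wall-clock seconds needed to slew away an offset at a fixed maximum rate.

    max_slew_ppm is an assumption (500 ppm is ntpd's traditional limit);
    substitute whatever your daemon is actually configured to use.
    """
    return abs(offset_seconds) / (max_slew_ppm * 1e-6)


# A few illustrative offsets, from a small glitch to a clock that is a minute out.
for offset in (0.05, 1.0, 60.0):
    hours = slew_seconds(offset) / 3600
    print(f"{offset:6.2f} s off -> about {hours:.2f} hours to slew back")
```

At that rate every second of error costs about 2000 seconds of correction time, so a server that is a full minute out would stay visibly wrong for more than a day if it were never stepped.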

Modern servers in good operating condition in a decent environment don't have their time drift very much over the course of an hour (our typical adjustment is under a millisecond). Cron is reliable (and if it dies you have bigger problems than time synchronization) and it's straightforward to write a little script that force-sets the server's time from a local NTP server (your OS may already come with one). If you're worried about the NTP server being a single point of failure, run two or three. You're still going to want to monitor the health and time synchronization of your NTP server (or servers), but at least you only have a few of them.
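As an illustration of how little such a script needs, here is a minimal sketch in Python. It is not the script we actually use. It sends one SNTP-style query to each server named on its command line, computes the clock offset from the first usable reply, and steps the clock with time.clock_settime(). The function names are made up for this example, it only handles IPv4, it takes a single sample instead of several, and it must run as root on a Unix system.

```python
#!/usr/bin/env python3
"""Sketch: step the system clock from a local NTP server (meant to run from cron)."""
import socket
import struct
import sys
import time

NTP_EPOCH_OFFSET = 2208988800  # seconds from 1900-01-01 (NTP) to 1970-01-01 (Unix)


def ntp_to_unix(seconds, fraction):
    """Convert a 32.32 fixed-point NTP timestamp to Unix seconds."""
    return seconds - NTP_EPOCH_OFFSET + fraction / 2**32


def offset_from_reply(reply, t1, t4):
    """Return (server clock - local clock) in seconds.

    reply is the raw NTP packet; t1 and t4 are the local send and
    receive times in Unix seconds.
    """
    if len(reply) < 48:
        raise ValueError("short NTP reply")
    words = struct.unpack("!12I", reply[:48])
    stratum = (words[0] >> 16) & 0xFF
    if stratum == 0 or stratum > 15:
        raise ValueError("server is not synchronized")
    t2 = ntp_to_unix(words[8], words[9])    # server receive timestamp
    t3 = ntp_to_unix(words[10], words[11])  # server transmit timestamp
    return ((t2 - t1) + (t3 - t4)) / 2


def query_offset(server, timeout=2.0):
    """Send one client-mode query to server and return the measured offset."""
    request = b"\x23" + 47 * b"\0"  # LI=0, version 4, mode 3 (client)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.connect((server, 123))  # resolve the name before timing starts
        t1 = time.time()
        sock.send(request)
        reply = sock.recv(512)
        t4 = time.time()
    return offset_from_reply(reply, t1, t4)


def main(servers):
    """Step the clock from the first server that answers sensibly."""
    for server in servers:
        try:
            offset = query_offset(server)
        except (OSError, ValueError) as exc:
            print(f"{server}: {exc}", file=sys.stderr)
            continue
        # Force-set the clock; no slewing, no sanity limits.
        time.clock_settime(time.CLOCK_REALTIME, time.time() + offset)
        return 0
    return 1


# Do nothing unless at least one server is named on the command line.
if __name__ == "__main__" and sys.argv[1:]:
    sys.exit(main(sys.argv[1:]))
```

You would run something like this at boot and from an hourly cron entry, giving it your two or three local NTP servers as arguments so that it falls through to the next one if the first doesn't answer. Stepping unconditionally is the whole point here, but it does mean the script trusts whatever the server says, which is why the servers should be ones you run and monitor.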

There are situations where you need better time than this and you understand why (and how it has to be better). That's when you turn to running an NTP daemon on every server involved (among other things, like carefully considering where you're ultimately getting your NTP time from). Not before then.

Written on 16 November 2017.


Last modified: Thu Nov 16 01:03:41 2017