Some views on having your system timezone set to UTC

May 17, 2020

Some people advocate for setting the system timezone on servers to UTC, for various reasons that you can read about with some Internet searches. I don't have the kind of experience that would give me strong opinions on this in general, but what I do know for sure is that for us, it would be a bad mistake to set the system timezone to UTC instead of our local time of America/Toronto. In fact I think we are a great example of a worst case for using UTC as the system timezone.

We, our servers, and most of our users are located in Toronto. Most of the usage of our systems is driven by Toronto's local time (and the Toronto work week), which means that so is when we want to schedule activities like backups, ZFS snapshots, or daily log rotations. When users report things to us they almost always use local Toronto time (eg, 'I had a connection problem at 10am'), and when they aren't in Toronto they generally don't use UTC for their reports. If we used UTC for our system timezone, almost everything we do would require us to translate between UTC and local time: looking at logs, scheduling activities, investigating problem reports, and so on. Using Toronto's local time means we almost never have to do that.

(And when something happens to our servers because of a power outage, a power surge, an Internet connectivity problem, or whatever, almost all of the reporting and time information on it will be in Toronto local time, not UTC. Almost no one reports things relevant to us in UTC.)

Given all of this, dealing with the twice-yearly DST shift is a small price to pay, and in a sense it is honest to have to deal with it. Our users experience the DST shift, after all, and their usage shifts one hour forwards or backwards relative to UTC.
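To make the shift relative to UTC concrete, here is a minimal Python sketch (assuming Python 3.9+ with the standard zoneinfo module and system timezone data available). The helper name toronto_to_utc and the sample dates are mine, purely for illustration; it shows that a '10am' problem report lands on a different UTC hour depending on the season.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

TORONTO = ZoneInfo("America/Toronto")

def toronto_to_utc(year, month, day, hour, minute=0):
    """Convert a Toronto wall-clock time to UTC (hypothetical helper)."""
    local = datetime(year, month, day, hour, minute, tzinfo=TORONTO)
    return local.astimezone(timezone.utc)

# The same "10am" report, once in winter (EST) and once in summer (EDT).
winter = toronto_to_utc(2020, 1, 15, 10)
summer = toronto_to_utc(2020, 7, 15, 10)

print(winter.hour)  # 15
print(summer.hour)  # 14
```

With UTC as the system timezone, someone reading logs would have to do this translation by hand, and get the season right, for every report.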

If we had servers located elsewhere (such as virtual machines in a cloud), we would probably still operate them in Toronto local time. Almost all of the reasons for doing so would still apply, although there might be some problems that were now correlated with the local time of the datacenter where they were located.

The more you diverge from this, the more I suspect that it makes sense to set your system timezone to UTC. The more you have people working around the world, servers scattered across the globe, usage that is continuous instead of driven by one location or continent, and so on, the less our issues apply to you and the more UTC is a neutral timezone. Running software in UTC also avoids it having to deal with time zone shifts for DST, which means that you don't necessarily have to test how it behaves in the face of DST shifts and then fix the bugs.

(Software sometimes can't avoid dealing with DST shifts at some level in the stack, because it handles or displays times as people perceive them and people definitely perceive DST shifts. But handling time for people is a hard issue in general with no simple solutions.)


Comments on this page:

By Andrew at 2020-06-12 01:12:17:

The worst bug I ever encountered that was caused by local timezones is as follows:

This company operated a web-based marketplace, with buyers and sellers, and the company taking a cut of every sale. They also had a "referral program", where sellers could recruit other sellers, and the seller that did the recruiting would get a small cut of every sale their recruits made for the first two years.

This was implemented with a check in the purchase flow that looked something like

   if datetime.now().subtract(2 years) < seller.signup_date && seller.referrer {
       // give the referrer their cut
   }

and then, one day in March, in the middle of the night, every single purchase failed for one hour.

Why? Because it was between 2am and 3am local time, on whichever day in March happened to be the date of the DST switch two years earlier. And when datetime.now().subtract(2 years) tried to produce a time of, say, "2:03am, March 14th 2010, America/New_York", it threw an "invalid datetime" exception, because there was no 2:03am on that date.
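For anyone who wants to poke at this, here is a rough Python sketch of detecting such a nonexistent time. It is not the library from the story: Python's standard datetime does not raise for a time in the gap, so the sketch uses a round trip through UTC instead. It assumes Python 3.9+ with zoneinfo and system timezone data, and wall_time_exists is a hypothetical helper name. In America/New_York the hour skipped on March 14th 2010 ran from 2:00am to 3:00am, so that is what the example probes.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

NEW_YORK = ZoneInfo("America/New_York")

def wall_time_exists(naive, tz):
    """Return True if the naive wall-clock time occurs in tz.

    A time inside a "spring forward" gap does not survive a round
    trip through UTC: it comes back shifted by the size of the gap.
    """
    aware = naive.replace(tzinfo=tz)
    round_trip = aware.astimezone(timezone.utc).astimezone(tz)
    return round_trip.replace(tzinfo=None) == naive

# Inside the skipped hour on the 2010 DST switch date.
print(wall_time_exists(datetime(2010, 3, 14, 2, 30), NEW_YORK))  # False
# One hour earlier, before the switch.
print(wall_time_exists(datetime(2010, 3, 14, 1, 30), NEW_YORK))  # True
```

Whether a library should raise, shift the time forward, or silently pick an offset for such a time is a design choice, and different libraries choose differently.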

Some people might consider this a bug, but the author of the datetime library sensibly argued that it was correct behavior. One way to guarantee that the result of time interval arithmetic exists is to use UTC; another way is to subtract an interval that consists of a number of seconds, instead of slippery units like months and years (some duration types, like Go's time.Duration, can only represent these concrete intervals). But if you ask for "two years ago, on the same day, at the same time", you might just be asking for a time that doesn't exist.
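As a sketch of combining both suggestions in Python (UTC timestamps plus a fixed-length interval), something like the following would do. The 730-day window is my assumption standing in for "two years", and in_referral_window is a hypothetical name, not code from the actual system.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "two years" approximated as a fixed 730 days.
REFERRAL_WINDOW = timedelta(days=2 * 365)

def in_referral_window(now_utc, signup_utc):
    """Compare elapsed time against a fixed interval.

    Both arguments are expected to be timezone-aware UTC datetimes,
    so the subtraction can never land on a nonexistent local time.
    """
    return now_utc - signup_utc < REFERRAL_WINDOW

signup = datetime(2010, 3, 14, 7, 30, tzinfo=timezone.utc)

print(in_referral_window(datetime(2012, 3, 11, 7, 30, tzinfo=timezone.utc), signup))  # True
print(in_referral_window(datetime(2012, 3, 14, 7, 30, tzinfo=timezone.utc), signup))  # False
```

The trade-off is that the window no longer ends on the exact calendar anniversary (a leap year moves it by a day), which may or may not matter for a referral program.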

Eagle-eyed readers will notice that all purchases failed because we didn't order the operations to take advantage of short-circuiting: the datetime math was done even for accounts that were never part of the referral program. This was considered to have no impact on performance, but it turns out that performance isn't the only thing.
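A small Python sketch of the reordered check, to show what short-circuiting buys here. The Seller class, failing_cutoff, and referrer_gets_cut are all hypothetical stand-ins; failing_cutoff simply simulates date math that throws during the bad hour.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Optional

@dataclass
class Seller:
    signup_date: datetime
    referrer: Optional[str] = None

def failing_cutoff(now: datetime) -> datetime:
    """Stand-in for date math that blows up during the skipped hour."""
    raise ValueError("invalid datetime")

def referrer_gets_cut(seller: Seller, now: datetime,
                      cutoff: Callable[[datetime], datetime]) -> bool:
    # Cheap check first: `and` stops here for sellers without a
    # referrer, so cutoff() is never called for them.
    return seller.referrer is not None and cutoff(now) < seller.signup_date

now = datetime(2012, 3, 14, 6, 30, tzinfo=timezone.utc)
unreferred = Seller(signup_date=datetime(2011, 1, 1, tzinfo=timezone.utc))

print(referrer_gets_cut(unreferred, now, failing_cutoff))  # False, no exception
```

This only limits the damage to referred sellers; the underlying date math still fails for them.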

If we had done something like

   if datetime.now() < seller.signup_date.add(2 years) && seller.referrer

which looks like it should be mathematically equivalent, the result would have been even stranger... purchases would have failed every single time, regardless of the time the purchase was made, but only for those sellers unlucky enough to have signed up during the hour that the "spring forward" would skip two years later.

Written on 17 May 2020.

Last modified: Sun May 17 00:30:59 2020