2020-06-07
A Go time package gotcha with parsing time strings that use named time zones
Go has a generally well regarded time package. One of the things it
can do is parse a string representation of a time based on a
specification of the time format, using time.Parse(); for example,
to parse times like "Sat Mar 7 11:06:39 PST 2015" or "Sat, 07 Mar
2015 11:06:39 -0800" (which are in Unix date format and 'RFC 1123 Z'
format respectively). As usual, these parsed time.Time values have
a location, ie a time zone. However, if you're dealing with time
strings with named time zones, like 'PST', this parsing has a very
large catch. This catch is sort of spelled out in the official
documentation, but not quite completely clearly:
When parsing a time with a zone abbreviation like MST, if the zone abbreviation has a defined offset in the current location, then that offset is used. The zone abbreviation "UTC" is recognized as UTC regardless of location. If the zone abbreviation is unknown, Parse records the time as being in a fabricated location with the given zone abbreviation and a zero offset.
MST is a widely known zone abbreviation, so you might think that it will always have 'a defined offset in the current location'. This is not so. If your current location doesn't ever use 'MST' as a zone abbreviation, then it's not considered 'a defined offset' and you get a time that claims it is in 'MST' but that has a 0 offset from UTC. This is not a correctly parsed time as any human being would understand it. Go is making up an offset in order to not report an error.
What Go means by 'a defined offset in the current location' is that
you can use 'EST' and 'EDT' if you're in Eastern time. This means that
Go will parse a time string containing a named time zone differently
depending on your local time zone. If you parse a string that uses
'MST' as its time zone and you are in Mountain time, you will get one
time.Time value; if you are in Eastern time (or your server is in
UTC), you will get a completely different time.Time value.
(This implies that if you write out a time string using a named time zone, change your time zone (either personally or server wide), and then parse the time string again, you will get a different time. One way to change your personal time zone is to move a file containing time strings from one server to another.)
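To make the location dependence concrete, here is a minimal sketch. It uses time.ParseInLocation() to stand in for running time.Parse() on machines with different local time zones, which is my assumption about the easiest way to reproduce this on one machine. The offsetIn() helper and the choice of zones are mine, and the time/tzdata import assumes Go 1.15 or later.

```go
package main

import (
	"fmt"
	"time"
	_ "time/tzdata" // embed the zone database (Go 1.15+) so LoadLocation works anywhere
)

// sample is a Unix date format string that uses MST as its named zone.
const sample = "Sat Mar 7 11:06:39 MST 2015"

// offsetIn is a hypothetical helper. It parses value in Unix date format as
// if the program's local time zone were locName, and returns the offset from
// UTC (in seconds) that the resulting time.Time claims to have.
func offsetIn(locName, value string) (int, error) {
	loc, err := time.LoadLocation(locName)
	if err != nil {
		return 0, err
	}
	t, err := time.ParseInLocation(time.UnixDate, value, loc)
	if err != nil {
		return 0, err
	}
	_, offset := t.Zone()
	return offset, nil
}

func main() {
	for _, name := range []string{"America/Denver", "America/New_York", "UTC"} {
		offset, err := offsetIn(name, sample)
		if err != nil {
			fmt.Println(name, "error:", err)
			continue
		}
		fmt.Printf("%-17s parsed MST with offset %+d seconds\n", name, offset)
	}
}
```

Only the Mountain time location should report a real offset of seven hours west of UTC; the other two should report zero, with no error from the parse.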
This also means that it very much matters whether the source of the time string is using named time zones or numeric time zone offsets. The choice between 'RFC 1123' time format (using named time zones) and 'RFC 1123 Z' format (using numeric values) gives you two strings for what is theoretically the same time, but Go will often parse them as very different times. Only time formats using numeric time zone offsets are safe to use with Go (and even then there is a catch when later formatting them).
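As a rough illustration of the difference between the two formats, the following sketch parses the same moment written both ways, as a program whose local time zone is UTC would see it. The parseBoth() helper is hypothetical, and I'm using time.ParseInLocation() with time.UTC to simulate a server in UTC.

```go
package main

import (
	"fmt"
	"time"
)

// parseBoth is a hypothetical helper. It parses one moment written with a
// named zone (RFC 1123) and with a numeric offset (RFC 1123 Z), using UTC as
// the location that zone abbreviations are looked up in.
func parseBoth() (named, numeric time.Time, err error) {
	named, err = time.ParseInLocation(time.RFC1123,
		"Sat, 07 Mar 2015 11:06:39 PST", time.UTC)
	if err != nil {
		return named, numeric, err
	}
	numeric, err = time.ParseInLocation(time.RFC1123Z,
		"Sat, 07 Mar 2015 11:06:39 -0800", time.UTC)
	return named, numeric, err
}

func main() {
	named, numeric, err := parseBoth()
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("named zone:    ", named.UTC())
	fmt.Println("numeric offset:", numeric.UTC())
	// PST is unknown in UTC, so the named version keeps its wall clock
	// time but gets a zero offset; the gap is the real PST offset.
	fmt.Println("difference:    ", numeric.Sub(named))
}
```

The numeric version carries its own offset, so it does not depend on where the program runs; the named version is eight hours off here.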
My personal opinion is that this is a serious bug in Go's time
parsing. If a named time zone offset is given and Go cannot safely
determine its actual zone offset, the parse should fail with an
error. Turning "Sat Mar 7 11:06:39 PST 2015" into March 7 11:06:39
UTC 2015 is not correct behavior; instead it is actively dangerous.
If this means that too many time strings fail to parse, then Go
time parsing needs to get smarter about looking up popular named
time zone offsets, or it should provide a 'parse liberally' function
with the current behavior of time.Parse().
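In the meantime you can approximate the strict behavior yourself. What follows is only a heuristic sketch, not anything the time package provides: parseStrict(), ErrUnknownZone and tryUnixDateInUTC() are all names I made up. It assumes that a fabricated location can be recognized by having a zone name other than UTC, a zero offset, and a location that is not the one you asked for. This will also reject 'GMT' in locations that never use it, and it can misfire on layouts that contain both a zone name and a numeric offset.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ErrUnknownZone is a hypothetical error for zone abbreviations that could
// not be resolved to an offset in the given location.
var ErrUnknownZone = errors.New("zone abbreviation has no known offset")

// parseStrict is a hypothetical wrapper around time.ParseInLocation. It
// rejects results that look like a fabricated location: a named zone other
// than UTC, a zero offset, and a location that is not the one we asked for.
func parseStrict(layout, value string, loc *time.Location) (time.Time, error) {
	t, err := time.ParseInLocation(layout, value, loc)
	if err != nil {
		return time.Time{}, err
	}
	name, offset := t.Zone()
	if name != "" && name != "UTC" && offset == 0 && t.Location() != loc {
		return time.Time{}, fmt.Errorf("%w: %q", ErrUnknownZone, name)
	}
	return t, nil
}

// tryUnixDateInUTC applies parseStrict to a Unix date format string, as a
// program whose local time zone is UTC would.
func tryUnixDateInUTC(value string) error {
	_, err := parseStrict(time.UnixDate, value, time.UTC)
	return err
}

func main() {
	for _, v := range []string{
		"Sat Mar 7 11:06:39 PST 2015",
		"Sat Mar 7 11:06:39 UTC 2015",
	} {
		if err := tryUnixDateInUTC(v); err != nil {
			fmt.Println("rejected:", err)
		} else {
			fmt.Println("accepted:", v)
		}
	}
}
```

The location comparison is what keeps legitimate zero-offset zones working: when the abbreviation is known in the location you passed in, the parsed time uses that location itself instead of a made-up one.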
Another consequence of this behavior is that a time.Time time
zone that is printed as 'EST' is not always 'EST' (and the same for
any named time offset). Sometimes it is 'EST (-0500)' and sometimes
it is 'EST (+0000)', ie 'UTC but we are claiming that it is called
EST'. In my opinion, Go should also stop doing this. If it is going
to accept 'EST' but treat it as UTC, it should actually set the
location to UTC so that people are not fooled by two times that
appear equal because they format with the same output but are in
fact not equal.
(To Go's credit, the default string format for time.Time values, as
shown in fmt's %v format, does show both the time zone name and the
numerical offset. This gives you odd but
honest output like '2015-03-07 11:06:39 +0000 MST'. But if you format
with just the named time zone, you can have two times that format the
same but don't compare equal.)
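Here is a small sketch of that last point. It parses the same MST string once with a Mountain time location and once with UTC, then compares the results. The compare() helper is hypothetical, the use of time.ParseInLocation() to simulate two different local time zones is my assumption, and the time/tzdata import assumes Go 1.15 or later.

```go
package main

import (
	"fmt"
	"time"
	_ "time/tzdata" // embed the zone database (Go 1.15+) so LoadLocation works anywhere
)

// mstStamp is a Unix date format string that uses MST as its named zone.
const mstStamp = "Sat Mar 7 11:06:39 MST 2015"

// compare is a hypothetical helper. It parses mstStamp as a Mountain time
// program and as a UTC program would, then reports whether the two results
// format to the same text and whether they are the same instant.
func compare() (sameText, equal bool, err error) {
	denver, err := time.LoadLocation("America/Denver")
	if err != nil {
		return false, false, err
	}
	known, err := time.ParseInLocation(time.UnixDate, mstStamp, denver)
	if err != nil {
		return false, false, err
	}
	made, err := time.ParseInLocation(time.UnixDate, mstStamp, time.UTC)
	if err != nil {
		return false, false, err
	}

	// The default format shows the offset, so the difference is visible here.
	fmt.Printf("Mountain: %v\nUTC:      %v\n", known, made)

	sameText = known.Format(time.UnixDate) == made.Format(time.UnixDate)
	equal = known.Equal(made)
	return sameText, equal, nil
}

func main() {
	sameText, equal, err := compare()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("same Unix date text:", sameText)
	fmt.Println("equal instants:     ", equal)
}
```

Formatting with only the zone name hides the fabricated offset, so the two values print identically in Unix date format while Equal() reports that they differ.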
(This entire issue was brought to my attention by James Antill's
comments on my entry about how time.Time values have locations.)