OpenBSD versus Prometheus (and Go)

February 29, 2020

We have a decent number of OpenBSD machines that do important things (and that have sometimes experienced problems like running out of disk space), and we have a Prometheus-based metrics and monitoring system. The Prometheus host agent has enough support for OpenBSD to be able to report on critical metrics, including things like local disk space. Despite all of this, after some investigation I've determined that it's not really sensible to even try to deploy the host agent on our OpenBSD machines. This is due to a combination of factors that have at their root OpenBSD's lack of ABI stability.

Prometheus and its host agent are written in Go. On OpenBSD, the host agent uses Go's cgo feature to call native C libraries, which means that it can't readily be cross-compiled and must be built natively on an OpenBSD machine. Because of OpenBSD's lack of ABI stability, any given version of Go officially supports only a relatively narrow range of OpenBSD versions (generally the versions that were supported at the time it was released), and in my experiments it often doesn't build or work on OpenBSD versions outside of that range. The binaries that a particular version of Go builds are similarly limited; they often only work on the OpenBSD versions that this version of Go supports. Trying to use them on either a too old or a too new OpenBSD runs into either new things that aren't supported on the old versions or old things that aren't supported on the new ones.

(These things can be fundamental issues, like types of runtime dynamic loading relocation that aren't supported on older OpenBSD versions.)

Current versions of the Prometheus host agent generally won't build with versions of Go that are too old (nor will past versions, although what the minimum Go version is drops as you go back to older versions of the host agent). On OpenBSD, this means that a given version of the host agent effectively has a narrow range of OpenBSD versions it will ever run on. It will probably not run on newer OpenBSDs after a while, and it definitely won't run on older ones. And finally, historically, metrics have been renamed and shuffled around between versions of the host agent.
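One way to picture the narrow-range problem is as a lookup from a machine's OpenBSD release to whichever frozen agent binary is known to run on it. What follows is only a minimal sketch in Go. The build labels and the release ranges in main are made up for illustration, and the release, agentBuild, and pickBuild names are hypothetical, not anything from the host agent itself.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// release is an OpenBSD release number such as "6.6".
type release struct {
	major, minor int
}

// parseRelease parses a "major.minor" string into a release.
func parseRelease(s string) (release, error) {
	parts := strings.Split(s, ".")
	if len(parts) != 2 {
		return release{}, fmt.Errorf("bad release %q", s)
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return release{}, fmt.Errorf("bad release %q: %v", s, err)
	}
	minor, err := strconv.Atoi(parts[1])
	if err != nil {
		return release{}, fmt.Errorf("bad release %q: %v", s, err)
	}
	return release{major, minor}, nil
}

// atMost reports whether r is the same as or older than o.
func (r release) atMost(o release) bool {
	if r.major != o.major {
		return r.major < o.major
	}
	return r.minor <= o.minor
}

// agentBuild records the range of releases one agent binary is known to run on.
type agentBuild struct {
	label    string  // hypothetical name for a frozen binary
	min, max release // inclusive range, as found by testing
}

// pickBuild returns the first build whose range includes host.
func pickBuild(builds []agentBuild, host release) (agentBuild, bool) {
	for _, b := range builds {
		if b.min.atMost(host) && host.atMost(b.max) {
			return b, true
		}
	}
	return agentBuild{}, false
}

func main() {
	// Made-up ranges, for illustration only.
	builds := []agentBuild{
		{"agent-old", release{6, 2}, release{6, 4}},
		{"agent-new", release{6, 5}, release{6, 6}},
	}
	for _, s := range []string{"6.1", "6.3", "6.6"} {
		host, err := parseRelease(s)
		if err != nil {
			fmt.Println(err)
			continue
		}
		if b, ok := pickBuild(builds, host); ok {
			fmt.Printf("OpenBSD %s: use %s\n", s, b.label)
		} else {
			fmt.Printf("OpenBSD %s: no usable build\n", s)
		}
	}
}
```

The uncomfortable part is the case where pickBuild finds nothing, because there is no fallback for a machine that is outside every range.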

This means that if you have a bunch of OpenBSD machines of various different versions (as we do), at a minimum you must run different versions of the host agent on different versions of OpenBSD and they will expose some metrics with different names, which makes it hard to create unified dashboards, alerts, and so on. When a new version of the host agent comes out, you'll only be able to upgrade some of your OpenBSD machines to use it, not all of them. And in practice the range of OpenBSD versions where a particular version of the host agent works well seems to be even narrower than the range that it will run on.
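To make the metric naming problem concrete, here is a minimal Go sketch of the sort of translation table you would need somewhere in order to treat old and new metric names as the same thing. The metric names in the table and the canonicalName function are illustrative assumptions, not an authoritative list of what has been renamed.

```go
package main

import "fmt"

// renames maps older metric names to the names used by newer agent versions.
// These entries are illustrative assumptions, not a complete or
// authoritative list.
var renames = map[string]string{
	"node_cpu":              "node_cpu_seconds_total",
	"node_filesystem_avail": "node_filesystem_avail_bytes",
	"node_filesystem_size":  "node_filesystem_size_bytes",
}

// canonicalName returns the newer name for a metric if one is known,
// and the name unchanged otherwise.
func canonicalName(name string) string {
	if newer, ok := renames[name]; ok {
		return newer
	}
	return name
}

func main() {
	names := []string{
		"node_filesystem_avail",       // old-style name
		"node_filesystem_avail_bytes", // already the new name
		"node_load1",                  // not in the table
	}
	for _, n := range names {
		fmt.Printf("%s -> %s\n", n, canonicalName(n))
	}
}
```

Every dashboard and alert rule would have to go through a table like this (or carry both names itself), and the table has to be kept up to date for every agent version you still run.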

(All of this assumes that you can even manage to build old versions of Go and old versions of the host agent on older OpenBSD machines. I was not entirely successful at this, even for Go versions that were nominally supported on my OpenBSD versions.)

Running a whole collection of different versions of the Prometheus host agent on different machines, and having to freeze magical golden binaries that we may or may not be able to rebuild later, is not a very attractive proposition. Even if we ignored our old OpenBSD machines and only deployed the latest host agent on the latest, currently supported OpenBSD version, we would sooner or later not be able to upgrade the host agent any more and then we would have metric drift (or we would have to stop running the host agent on those machines). For a relatively minor increase in observability on machines where we almost never have problems in the first place, the whole thing isn't worth it.

(It's possible that the upcoming 1.0 release of the host agent will promise to stop changing the current metric names, which would change the calculations here a bit. I suppose I should ask about that on the Prometheus mailing list.)
