2022-07-26
To be fully useful, Prometheus histograms want their cumulative sums
True Prometheus histograms have
a specific set of metrics and time series that they're made up of.
As covered in the documentation, there are a bunch of
'<basename>_bucket' time series, a '<basename>_count' time series,
and a '<basename>_sum' time series that's the cumulative sum of all
of the observations that are in the histogram. How all
of this works is covered in, eg, How does a Prometheus Histogram
work?.
However, not all external sources of histogram data provide a cumulative sum. For example, ZFS IO statistic histograms just give you histogram bucket counts. When generating Prometheus histogram metrics from such histogram sources, it seems common to generate a _sum metric (well, time series) that's just 0. This gives you something that will work in many situations, but after having wrestled with histograms built this way I've come to feel that you want to avoid it if possible. Prometheus histograms are more useful with their cumulative sums, and you can't rebuild an approximation of this information in PromQL as far as I know.
(I believe that both Grafana heatmaps of histograms and
'histogram_quantile()' will work on such sum-less histograms, though.
You won't be able to get the average with '<thing>_sum / <thing>_count',
but perhaps you can approximate that with histogram_quantile. On the
other hand, the median is not the mean, and the difference may matter
to you.)
If the underlying source of histogram data doesn't give you a cumulative sum, all you can do is make one up from the available histogram information. The ZFS iostats reporting code uses the midpoint of histogram buckets, and most likely you can't do better than that. Since you can readily compute this sort of cumulative sum in the code that converts from the native histogram format to Prometheus histogram metrics format, I think that you should. If you have the choice between two converters or exporters, one of which gives you a histogram sum and one of which gives you a 'sum' that's zero, I think you should take the one with the sum.
(And if you can reasonably modify a histogram exporter or converter to add a calculated sum, I think it's probably worth maintaining a custom version. Ideally you'll be able to get your change accepted upstream.)
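As a rough illustration of the midpoint approach, here is a minimal Python sketch of how a converter might make up a sum from bucket counts. The function name 'estimate_sum' and its parameters are hypothetical, not taken from any real exporter or from the ZFS code. It assumes that the source gives independent (non-cumulative) bucket counts, that every bucket has a finite upper bound, and that the first bucket starts at a lower bound you supply. An open-ended top bucket has no midpoint, so you would have to pick some stand-in value for it.

```python
def estimate_sum(upper_bounds, counts, lowest_bound=0.0):
    """Estimate a histogram's sum from independent bucket counts.

    upper_bounds: ascending, finite upper edge of each bucket.
    counts: observations that fell in each bucket (NOT cumulative).
    lowest_bound: assumed lower edge of the first bucket.
    """
    if len(upper_bounds) != len(counts):
        raise ValueError("need exactly one count per bucket")

    total = 0.0
    lower = lowest_bound
    for upper, count in zip(upper_bounds, counts):
        # Pretend every observation in the bucket sat at its midpoint.
        midpoint = (lower + upper) / 2.0
        total += midpoint * count
        # The next bucket starts where this one ends.
        lower = upper
    return total
```

For buckets ending at 1, 2, and 4 with counts of 2, 3, and 1, 'estimate_sum([1, 2, 4], [2, 3, 1])' gives 8.5 (0.5*2 + 1.5*3 + 3*1). If the source's bucket counts only ever go up, recomputing this on every scrape should give you a sum that only goes up too, which is what you want from a '<basename>_sum' time series.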
PS: Perhaps there are clever PromQL ways to get around this lack of a '<basename>_sum' metric. There are a lot of tricks to PromQL, and documentation on them is widely scattered and hard to find.
Sidebar: Conventional histogram buckets versus Prometheus ones
Histograms designed to be presented to people usually have independent
bucket counts, where each bucket only counts the things that fall
into its range. Prometheus histograms use cumulative bucket counts,
where every bucket's count covers all things less than or equal to
the top of its range (hence the 'le="..."' label). Converting
independent bucket counts to cumulative bucket counts is straightforward,
but someone has to remember to do it. Not doing this when you convert
a non-Prometheus histogram to a Prometheus one can produce odd and
unhelpful results when you try to process the Prometheus histogram,
and is a mistake I'm pretty sure I made in my early days of working
with Prometheus.
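To make the conversion concrete, here is a small Python sketch of turning independent bucket counts into Prometheus-style cumulative ones. The function name 'to_cumulative' and the (label, count) output shape are hypothetical choices for illustration. It assumes the buckets are given in ascending order with finite upper bounds, and it adds the 'le="+Inf"' bucket that Prometheus histograms are expected to have, which covers every observation.

```python
from itertools import accumulate


def to_cumulative(upper_bounds, counts):
    """Turn independent bucket counts into cumulative 'le' buckets.

    upper_bounds: ascending, finite upper edge of each bucket.
    counts: observations that fell in each bucket (NOT cumulative).
    Returns a list of (le_label, cumulative_count) pairs.
    """
    if len(upper_bounds) != len(counts):
        raise ValueError("need exactly one count per bucket")

    # Running total: each bucket also counts everything below it.
    running = list(accumulate(counts))
    buckets = [(str(bound), n) for bound, n in zip(upper_bounds, running)]

    # The +Inf bucket holds all observations, so it equals the total count.
    total = running[-1] if running else 0
    buckets.append(("+Inf", total))
    return buckets
```

With buckets ending at 1, 2, and 4 and independent counts of 2, 3, and 1, this produces cumulative counts of 2, 5, and 6, plus a '+Inf' bucket of 6. That last number is also what you would report as '<basename>_count'.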