A gotcha with combining single-label and multi-label Prometheus metrics

October 31, 2020

Suppose that you have two metrics for roughly the same thing, but one metric is unlabeled and the other has meaningful labels that distinguish sub-categories. For example, suppose that you have a count of the spam messages rejected by one anti-spam system, which is not broken down by spam level, and a count of the spam messages rejected by another system, which is broken down by spam level. Now you want a dashboard panel that displays the combined number of spam messages rejected over a time range. So you write the obvious-looking PromQL query:

increase( pmx_rejects[$__range] ) +
   sum( increase( rspamd_rejects[$__range] ) )

(Here pmx_rejects is a single time series and rspamd_rejects is labeled with the spam confidence level, hence the sum().)

When you put this into your dashboard panel, the panel is surprisingly blank and you are sad, and perhaps puzzled (as I was).

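What is going on here is that sum() normally returns a label-less result because it aggregates multiple time series together, while increase() keeps the labels of the time series it is applied to (apart from the metric name) since it doesn't aggregate. Even a metric with no labels of its own usually picks up labels such as job and instance when it is scraped. This means that the labels don't match between the two sides of the +, and so you get no results: PromQL arithmetic operators between vectors only produce a result for pairs of time series whose labels match, so they act as filters (just like boolean operators).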
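To make the matching rule concrete, here is a toy Python model of it. This is a sketch of the rule, not how Prometheus is implemented, and the label names, label values, and counts are all made up for illustration.

```python
# Toy model of PromQL one-to-one vector matching; NOT Prometheus's real code.
# An instant vector is a dict: frozenset of (label, value) pairs -> sample value.

def labelset(**labels):
    """Build a hashable label set from keyword arguments."""
    return frozenset(labels.items())

def vector_add(lhs, rhs):
    """Add samples whose label sets are identical; silently drop the rest."""
    return {ls: v + rhs[ls] for ls, v in lhs.items() if ls in rhs}

# Assumed: increase(pmx_rejects[...]) still carries scrape labels.
# The job/instance values and the counts are hypothetical.
pmx = {labelset(job="pmx", instance="mail1:9100"): 40.0}

# sum() with no by() clause yields one series with an empty label set.
rspamd_total = {labelset(): 250.0}

combined = vector_add(pmx, rspamd_total)
print(combined)  # {} : no label sets match, so there is nothing to graph
```

The important part is that a failed match is not an error. The unmatched series simply vanish, which is why the panel is blank instead of complaining.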

The straightforward way to deal with this is to get rid of the labels from the plain increase() side, and the simple way to do this is to use sum() on it even though there is only one time series:

sum( increase( pmx_rejects[$__range] ) ) +
sum( increase( rspamd_rejects[$__range] ) )
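As a rough illustration of why this works, here is a small Python sketch that models sum() as collapsing a vector to one series with no labels. It is a simplification under assumed data: the job, instance, and level labels and all of the counts are hypothetical.

```python
# Toy model; NOT Prometheus's real code. Vectors are dicts mapping a
# frozenset of (label, value) pairs to a sample value.

def labelset(**labels):
    """Build a hashable label set from keyword arguments."""
    return frozenset(labels.items())

def sum_all(vector):
    """Model sum() without by(): one output series with an empty label set."""
    return {frozenset(): sum(vector.values())} if vector else {}

def vector_add(lhs, rhs):
    """Add samples whose label sets are identical; silently drop the rest."""
    return {ls: v + rhs[ls] for ls, v in lhs.items() if ls in rhs}

# Hypothetical per-series increases over the dashboard time range.
pmx = {labelset(job="pmx", instance="mail1:9100"): 40.0}
rspamd = {
    labelset(job="rspamd", instance="mail2:9100", level="high"): 200.0,
    labelset(job="rspamd", instance="mail2:9100", level="low"): 50.0,
}

# Both sides now have the same (empty) label set, so they match.
total = vector_add(sum_all(pmx), sum_all(rspamd))
print(total)  # {frozenset(): 290.0}
```

Applying sum() to a single time series does not change its value. It is only there to strip the labels so that both sides end up with the same label set.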

As it happens, there is another way to write this, but I strongly recommend against it, because it's a hack and I think it is probably not officially supported by the PromQL specification (such as it is). The alternate way is to replace the '+' with '+ on(nosuchlabel)'. What I think the on() does is throw out all labels except the nonexistent label, and then Prometheus concludes that because neither side has any remaining labels, they can be added together.

(Prometheus will probably never change this behavior just because it would be a language incompatibility and would probably bite people, even if it was never explicitly officially documented as working. Before I tried it, my expectation for '+ on(nosuchlabel)' was that it would discard time series without the label, which in this case would discard all time series. The current behavior does make sense if you think about it, but it is somewhat of a trap.)
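My guess about what on() does can be written down as a toy Python model. This is only a sketch of that guess, not Prometheus's actual implementation; the helper names are invented, and the labels and counts are hypothetical.

```python
# Toy model of '+ on(...)' as described above; NOT Prometheus's real code.
# Vectors are dicts mapping a frozenset of (label, value) pairs to a value.

def labelset(**labels):
    """Build a hashable label set from keyword arguments."""
    return frozenset(labels.items())

def project(ls, on):
    """Keep only labels named in `on`; labels a series lacks are just absent."""
    return frozenset((k, v) for k, v in ls if k in on)

def vector_add_on(lhs, rhs, on):
    """Match series by their `on` labels only, then add the matched pairs."""
    rhs_by_key = {}
    for ls, v in rhs.items():
        key = project(ls, on)
        if key in rhs_by_key:
            # Assumed stand-in for PromQL refusing ambiguous matches.
            raise ValueError("more than one right-hand series per match group")
        rhs_by_key[key] = v
    result = {}
    for ls, v in lhs.items():
        key = project(ls, on)
        if key in rhs_by_key:
            if key in result:
                raise ValueError("more than one left-hand series per match group")
            result[key] = v + rhs_by_key[key]
    return result

pmx = {labelset(job="pmx", instance="mail1:9100"): 40.0}  # hypothetical
rspamd_total = {labelset(): 250.0}                        # sum() output

# No series has 'nosuchlabel', so every series projects to the empty label set.
total = vector_add_on(pmx, rspamd_total, on=("nosuchlabel",))
print(total)  # {frozenset(): 290.0}
```

In this model a missing label does not disqualify a time series. It just contributes nothing to the match key, which is the opposite of what I expected before I tried it.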
