A gotcha with combining single-label and multi-label Prometheus metrics
Suppose that you have two metrics for roughly the same thing, but one metric is unlabeled and the other has meaningful labels that distinguish sub-categories. For example, suppose that you have a count of the spam messages rejected by one anti-spam system, which is not broken down by spam level, and a count of the spam messages rejected by another system, which is broken down by spam level. Now you want a dashboard panel that displays the combined number of spam messages rejected over a time range. So you write the obvious-looking PromQL query:
increase( pmx_rejects[$__range] ) + sum( increase( rspamd_rejects[$__range] ) )
pmx_rejects is a single metric and rspamd_rejects is
labeled with the spam confidence level, hence the sum().
When you put this into your dashboard panel, your dashboard panel is surprisingly blank and you are sad, and perhaps puzzled (as I was).
What is going on here is that sum() normally returns a label-less
result because it aggregates multiple time series together, while
increase() preserves labels since it doesn't aggregate. Even a metric
with no labels of its own normally still carries target labels such
as job and instance, so the plain increase() side is not label-less.
This means that the labels don't match between the two sides of the
'+', which means you get no results, since PromQL math operators on
vectors are filters (just like boolean operators).
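To make the filtering concrete, here is a minimal Python sketch of how I understand default one-to-one matching for '+' to behave. This is a toy model, not Prometheus code: the add_vectors() helper, the label names, and the sample values are all made up for illustration.

```python
def add_vectors(left, right):
    """Toy model of PromQL '+' with default vector matching.

    Each vector is a dict that maps a frozenset of (label, value)
    pairs to a sample value. Only series whose label sets are
    identical on both sides produce a result; the rest are dropped.
    """
    result = {}
    for labels, left_value in left.items():
        if labels in right:
            result[labels] = left_value + right[labels]
    return result


# Hypothetical result of increase(pmx_rejects[...]): one series that
# still carries its target labels.
pmx = {frozenset({("job", "pmx"), ("instance", "mx1:9100")}): 40.0}

# Hypothetical result of sum(increase(rspamd_rejects[...])): one
# series with an empty label set.
rspamd_sum = {frozenset(): 25.0}

mismatched = add_vectors(pmx, rspamd_sum)
print(mismatched)  # {}
```

Nothing on the left has the same label set as anything on the right, so the model returns an empty result, which is the blank panel.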
The straightforward way to deal with this is to get rid of the labels
from the plain increase() side, and the simple way to do this is to
use sum() on it even though there is only one time series:
sum( increase( pmx_rejects[$__range] ) ) + sum( increase( rspamd_rejects[$__range] ) )
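As a rough illustration of why wrapping both sides in sum() works, the following Python sketch models sum() without a by or without clause as collapsing everything to one series with an empty label set. The sum_vector() and add_vectors() helpers, the label names, and the numbers are assumptions made up for this example, not anything taken from Prometheus itself.

```python
def sum_vector(vector):
    """Toy model of sum() with no by/without clause.

    All series collapse into a single series with an empty label
    set; an empty input stays empty.
    """
    if not vector:
        return {}
    return {frozenset(): sum(vector.values())}


def add_vectors(left, right):
    """Toy model of '+': only identical label sets pair up."""
    return {ls: v + right[ls] for ls, v in left.items() if ls in right}


# Hypothetical label names and values, for illustration only.
pmx = {frozenset({("job", "pmx"), ("instance", "mx1:9100")}): 40.0}
rspamd = {
    frozenset({("job", "rspamd"), ("level", "high")}): 15.0,
    frozenset({("job", "rspamd"), ("level", "low")}): 10.0,
}

total = add_vectors(sum_vector(pmx), sum_vector(rspamd))
print(total)  # {frozenset(): 65.0}
```

Once both sides have been through sum_vector(), they share the same (empty) label set, so the addition finds a match.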
As it happens, there is another way to write this, but I strongly
recommend against it, because it's a hack and I think it is probably
not officially supported by the PromQL specification (such as it is).
The alternate way is to replace the '+' with '+ on(nosuchlabel)'.
What I think on() does is throw out all labels except the nonexistent
label, and then Prometheus concludes that because both sides have no
remaining labels they can be added together.
(Prometheus will probably never change this behavior, because doing
so would be a language incompatibility and would probably bite
people, even if it was never explicitly officially documented as
working. Before I tried it, my expectation for 'on(nosuchlabel)'
was that it would discard time series without the label, which in
this case would discard all time series. The current behavior does
make sense if you think about it, but it is somewhat of a trap.)
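Here is a small Python sketch of the on() behavior as I understand it from the description above: each side's labels are reduced to the ones listed in on() before matching, so a label that neither side has leaves both with an empty label set. The add_on() helper, the label names, and the values are hypothetical, and the model ignores the errors real Prometheus raises when several series on one side end up with the same reduced label set.

```python
def add_on(left, right, on_labels):
    """Toy model of 'left + on(on_labels) right', one-to-one case.

    Each vector maps a frozenset of (label, value) pairs to a sample
    value. Matching only looks at the labels named in on_labels.
    """
    def key(labelset):
        # Keep only the labels listed in on(); everything else is ignored.
        return frozenset((k, v) for k, v in labelset if k in on_labels)

    right_by_key = {key(ls): value for ls, value in right.items()}
    result = {}
    for ls, value in left.items():
        reduced = key(ls)
        if reduced in right_by_key:
            result[reduced] = value + right_by_key[reduced]
    return result


# Hypothetical label names and values, for illustration only.
pmx = {frozenset({("job", "pmx"), ("instance", "mx1:9100")}): 40.0}
rspamd_sum = {frozenset(): 25.0}

combined = add_on(pmx, rspamd_sum, {"nosuchlabel"})
print(combined)  # {frozenset(): 65.0}
```

In this model neither side has a nosuchlabel label, so both reduce to the empty label set and match each other, rather than being discarded as I had originally expected.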