Remembering that Prometheus expressions act as filters

April 13, 2019

In conventional languages, comparisons like '>' and other boolean operations like 'and' give you implicit or explicit boolean results. Sometimes this is a pseudo-boolean result; in Python if you say 'A and B', you famously get either the value of A (if A is falsy) or the value of B as the end result (instead of a strict True or False). However, PromQL doesn't work this way. As I keep having to remember over and over, in Prometheus, comparisons and other boolean operators are filters.

In PromQL, when you write 'some_metric > 10', what happens is that first Prometheus generates a full instant vector for some_metric, with all of the metric points and their labels and their values, and then it filters out any metric point in the instant vector where the value isn't larger than 10. What you have left is a smaller instant vector, but all of the values of the metric points in it are their original ones.
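As a small illustration of the filtering, here is a sketch with made-up data; the 'host' label and all of the values are hypothetical, chosen only to show which points survive:

```promql
# Suppose some_metric currently has these three points (hypothetical):
#   some_metric{host="a"}   4
#   some_metric{host="b"}  12
#   some_metric{host="c"}  37

some_metric > 10

# What is left after filtering. The values are the original ones,
# not 1 or 'true':
#   some_metric{host="b"}  12
#   some_metric{host="c"}  37
```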

The same thing happens with 'and'. When you write 'some_metric and other_metric', the other_metric is used only as a filter; metric points from some_metric are only included in the result set if there is the same set of labels in the other_metric instant vector. This means that the values of other_metric are irrelevant and do not propagate into the result.
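A sketch of the same thing for 'and', again with hypothetical labels and values. The value of 0 on the right side is deliberate, to show that it doesn't matter:

```promql
# Hypothetical current points:
#   some_metric{host="a"}   4      other_metric{host="a"}  100
#   some_metric{host="b"}  12      other_metric{host="c"}    0
#   some_metric{host="c"}  37

some_metric and other_metric

# Only label sets that exist on both sides survive, and they carry
# the values from the left side:
#   some_metric{host="a"}   4
#   some_metric{host="c"}  37
```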

The large scale effect of this is that the values that tend to propagate through your rule expression are whatever started out as the first metric you looked at (or whatever arithmetic you perform on it). Sometimes, especially in alert rules, this can bias you toward putting one condition in front of the other. For instance, suppose that you want to trigger an alert when the one-minute load average is above 20 and the five-minute load average is above 5, and you write the alert rule as:

expr: (node_load5 > 5) and (node_load1 > 20)

The value available in the alert rule and your alert messages is the value of node_load5, not node_load1, because node_load5 is what you started out the rule with. If you find the value of node_load1 more useful in your alert messages, you'll want to flip the order of these two clauses around.
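For reference, the flipped version of the expression would look like this. It alerts under exactly the same conditions; the only difference is which metric's value the result carries:

```promql
# Same two conditions, but node_load1 comes first, so the result
# (and thus the alert's value) is the value of node_load1.
(node_load1 > 20) and (node_load5 > 5)
```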

As the PromQL documentation covers, you can turn comparison operations from filters into pseudo-booleans by using 'bool', as in 'some_metric > bool 10'. As far as I know, there is no way to do this with 'and', which always functions as a filter, although you can at least select what labels have to match (or what labels to ignore).
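To make the difference concrete, here is a sketch of what 'bool' does to a comparison, using hypothetical points:

```promql
# Hypothetical current points:
#   some_metric{host="a"}   4
#   some_metric{host="b"}  12

some_metric > bool 10

# Nothing is filtered out. Every point stays in the result, its value
# becomes 0 or 1, and the metric name is dropped:
#   {host="a"}  0
#   {host="b"}  1
```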

PS: For some reason I keep forgetting that 'and', 'or', and 'unless' can use 'on' and 'ignoring' to select what labels you care about. What you can't do with them, though, is propagate some labels from the right side into the result; if you need that, you have to use 'group_left' or 'group_right' and figure out how to re-frame your operation so that it involves a comparison, since 'and' and company don't work with grouping.
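A rough sketch of both halves of this. The 'host', 'mode', and 'team' labels and the load_threshold metric are all hypothetical, invented purely for illustration:

```promql
# Keep some_metric points whose 'host' label also appears on some
# other_metric point, regardless of any other labels.
some_metric and on (host) other_metric

# The same idea, but matching on every label except 'mode'.
some_metric and ignoring (mode) other_metric

# Re-framed as a comparison so that grouping is allowed. This assumes a
# hypothetical per-host load_threshold metric that has a 'team' label.
# The result has the node_load1 values, with 'team' copied over from
# the right side.
node_load1 > on (host) group_left (team) load_threshold
```

Whether a re-framing like the last one is possible depends on having something sensible on the right side to compare against.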

(I was going to confidently write an entry echoing something that I said on the Prometheus users mailing list recently, but when I checked the documentation and performed some tests, it turned out I was wrong about an important aspect of it. So this entry is rather smaller in scope, and is written mostly to get this straight in my head since I keep forgetting the details of it.)

