What the default query step is for Prometheus subqueries
Subqueries are a new feature in Prometheus 2.7+; they let you nest time range queries (as in this example) and are also useful for things like applying time-based aggregation to expressions. As covered in the official documentation, their syntax is:
<instant_query> '[' <range> ':' [<resolution>] ']' [ offset <duration> ]
You always need the ':', but the resolution (the query step) is optional. This raises the obvious question of what the subquery's resolution is if you don't specify one yourself. The documentation says about this:
<resolution> is optional. Default is the global evaluation interval.
That's all well and good, but what does the Prometheus documentation actually mean by 'the global evaluation interval'?
(This is probably a case where the documentation authors think it's obvious, except that it's not obvious to a cautious sysadmin like me.)
The answer is that the default query step of subqueries is your rule evaluation interval, which is to say the evaluation_interval setting in the global section of your Prometheus configuration file. By default this is one minute, but many example configurations set it lower, which probably carries through into real live setups; ours is 15 seconds, for example.
(The rule evaluation interval is one of the many factors that influence how fast alerts trigger, which may bias you toward a short setting for it even if you don't really use recording rules.)
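To make the effect concrete, here is a small sketch of the same subquery written with and without an explicit resolution. The metric name http_requests_total and the 15 second evaluation_interval are assumptions for illustration only; substitute your own metric and whatever your configuration actually sets.

```promql
# Assumes the global section of prometheus.yml sets evaluation_interval: 15s.
# http_requests_total is only an illustrative metric name.

# No explicit resolution: the inner rate() is evaluated at the global
# evaluation interval (every 15s under the assumption above) over the last hour.
max_over_time(rate(http_requests_total[5m])[1h:])

# Explicit resolution: the inner rate() is evaluated once a minute,
# regardless of what evaluation_interval is set to.
max_over_time(rate(http_requests_total[5m])[1h:1m])
```

On a server that leaves evaluation_interval at its one minute default, the two queries should use the same step. On a server set to 15 seconds, the first one evaluates the inner expression roughly four times as often.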
Given the origin of subqueries as on the fly versions of recording rules (more or less, I'm handwaving a bit here), this default makes a certain amount of sense. After all, if you made your subquery into a recording rule, it would be evaluated at every rule evaluation interval. If you don't express any opinion on the subquery resolution (by providing one explicitly), Prometheus might as well behave as if it was a recording rule and evaluate it at the same frequency.
I don't know if this makes me more or less likely to rely on the default subquery resolution when I'm using subqueries for on the fly checks of things like how long our NTP servers go without updating their system clocks. I'm going to have to think about it, but for some sorts of subquery usage it's clearly the right choice.
(But those subqueries are another entry.)