Wandering Thoughts archives

2019-03-10

What the default query step is for Prometheus subqueries

Subqueries are a new feature in Prometheus 2.7+; they let you nest time range queries (as in this example) and can also be useful for things like applying time-based aggregation to expressions. As covered in the official documentation, their syntax is:

<instant_query> '[' <range> ':' [<resolution>] ']' [ offset <duration> ]

You always need the :, but the resolution (the query step) is optional. This raises the obvious question of what the subquery's resolution is if you don't specify a resolution yourself. The documentation says about this:

  • <resolution> is optional. Default is the global evaluation interval.

That's all well and good, but what does the Prometheus documentation actually mean by 'the global evaluation interval'?

(This is probably a case where the documentation authors think it's obvious, except that it's not obvious to a cautious sysadmin like me.)

The answer is that the default query step of subqueries is your rule evaluation interval, which is to say the evaluation_interval setting in the global section of your Prometheus configuration file. By default this is one minute but many example configurations set it shorter, which probably carries through into real live setups; ours is 15 seconds, for example.

(The rule evaluation interval is one of the many factors that influence how fast alerts trigger, which may bias you toward a short setting for it even if you don't really use recording rules.)
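
If you want to check what the default subquery step works out to on a given server, one rough way is to pull evaluation_interval out of the global section of its configuration file. What follows is a minimal sketch and not an official tool; the eval_interval function name and the sample configuration are my own inventions for illustration, and the awk assumes a conventionally indented, unquoted setting. If the setting is absent it prints 1m, which is the default mentioned above.

```shell
#!/bin/sh
# Sketch only: a hypothetical sample config, written to a temporary file.
# Only the global section matters here.
sample=$(mktemp)
cat > "$sample" <<'EOF'
global:
  scrape_interval: 15s
  evaluation_interval: 15s
EOF

# eval_interval FILE
# Print evaluation_interval from the top-level "global:" section of FILE,
# or "1m" (the Prometheus default) if it is not set there.
eval_interval() {
    awk '
        /^global:/ { in_global = 1; next }   # entering the global section
        /^[^ #]/   { in_global = 0 }         # any other top-level key ends it
        in_global && $1 == "evaluation_interval:" { print $2; found = 1 }
        END { if (!found) print "1m" }
    ' "$1"
}

eval_interval "$sample"    # prints 15s for the sample above
rm -f "$sample"
```

With a 15 second setting like this, a subquery range written as [1h:] is evaluated as if you had written [1h:15s].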

Given the origin of subqueries as on-the-fly versions of recording rules (more or less, I'm handwaving a bit here), this default makes a certain amount of sense. After all, if you made your subquery into a recording rule, it would be evaluated at every rule evaluation interval. If you don't express any opinion on the subquery resolution (by providing one explicitly), Prometheus might as well behave as if it were a recording rule and evaluate it at the same frequency.

I don't know if this makes me more or less likely to rely on the default subquery resolution when I'm using subqueries for on-the-fly checks of things like how long our NTP servers go without updating their system clocks. I'm going to have to think about it, but for some sorts of subquery usage it's clearly the right choice.

(But those subqueries are another entry.)

sysadmin/PrometheusSubqueriesDefaultStep written at 23:35:35

Turning something into a script encourages improving it

I've written before (a long time before) that scripting something captures knowledge about how to do it, but recently I had a nice example of another advantage of turning things into scripts, which is that scripting things can encourage you to improve what you're doing.

We've recently started using Dovecot statistics to hunt for causes of IMAP slowness, including people who are still part of our backwards compatibility issue and would benefit from being migrated to our new way of dealing with this, both because their clients would often work much faster and because it helps the overall IMAP server. I wrote an initial script to post-process the raw Dovecot stats into some fairly general output. I then found myself repeatedly running it by hand with a somewhat intricate command line that post-processed its output further, cutting things down to what I really cared about, such as IMAP LIST commands with wildcards. Eventually I figured that I was running this command line often enough that I should turn it into a little script of its own.

The very first version of this script was just what I'd been running on the command line, but I'd hardly saved my editor buffer before I was adding small improvements; I switched to dynamically generating some data that had been stashed in a file before, and I filtered out a couple more things that weren't interesting to me. I could have done all of these in the command line version, but it would have pushed the command line well over the edge of complexity that I want to write (and re-write) by hand, at least these days (I once was more energetic here).

I think that there are at least two reasons why turning my command line into a script encouraged me to improve it. The first is that it's much easier to reuse my improvements, because they're automatically permanent. An improved command line is an ephemeral thing (especially for me, as I don't keep command history across sessions); embodied in a script, it's forever. The second is that it's simply easier to do things in a script. A command line is edited on the fly, and it is either crammed into one giant line or gives you little or no ability to go back and improve a previous line. A shell script is edited in the editor of your choice, and you have the full power of the editor to move around, reconsider things, split things up into much more readable multi-line constructs, and so on. As part of this, it's simply far easier and more feasible to use more complex structures in a script than on the command line, things like case in the Bourne shell.

(I could in theory write a Bourne shell case as part of a command line, but in practice the idea is laughable. I'll use a case in even a small shell script without thinking twice about it.)
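
To illustrate the sort of case that is trivial in a script and painful on a command line, here is a minimal sketch. It is not my actual script; the classify_cmd function, its labels, and the one-command-per-line input are all hypothetical stand-ins, not real Dovecot stats output. The only real detail is that * and % are the IMAP LIST wildcard characters.

```shell
#!/bin/sh
# Sketch only: input is a hypothetical "one IMAP command per line" format,
# not actual Dovecot stats output.

# classify_cmd LINE
# Print a rough label for a single IMAP command line.
classify_cmd() {
    case "$1" in
        'LIST '*[*%]*)            echo "wildcard-list" ;;  # LIST using * or %
        'LIST '*)                 echo "plain-list" ;;     # LIST with no wildcard
        'SELECT '*|'EXAMINE '*)   echo "mailbox-open" ;;
        *)                        echo "other" ;;
    esac
}

classify_cmd 'LIST "" "*"'        # prints wildcard-list
classify_cmd 'LIST "" "INBOX"'    # prints plain-list
classify_cmd 'FETCH 1:* FLAGS'    # prints other
```

Spread over several lines in an editor this is easy to read and extend with another pattern; squeezed into one interactive command line it would not be.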

What this experience suggests to me is that I should consider being more aggressive about turning frequent command line things into scripts. Perhaps I don't want to have them on my $PATH, because it could get very cluttered that way, but I can at least make a habit of stashing them in a directory somewhere.

sysadmin/ScriptsPromptImprovements written at 01:08:10



This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.