A YAML syntax surprise and trick in Prometheus Alertmanager configuration
In a comment on my entry on doing reboot notifications with Prometheus, Simon noted:
Just a note to say that since Alertmanager v0.16.0, it is possible to group alerts by all labels using "group_by: [...]".
When I saw this syntax in the comment, my eyebrows went up, because
I'd never seen any sort of
... syntax in YAML before; I had no
idea it was even a thing you could do in YAML, and I didn't know
what it really meant. Was it some special syntax that flagged what
would normally be a YAML array for special processing, for example?
So I scurried off to the Wikipedia YAML entry, then the official YAML site
and the specification, and finally the Alertmanager
source code (because sometimes I'm a systems programmer).
As it turns out, this is explained (more or less) in the current Alertmanager documentation, if you read all of the words. Let me quote them:
To aggregate by all possible labels use the special value '...' as the sole label name, for example: group_by: ['...']
However, the other part of this documentation is less clear, since it lists things as:
[ group_by: '[' <labelname>, ... ']' ]
What is actually going on here is that although the ... in group_by: [...] looks
like YAML syntax, it's actually just a YAML string. The group_by
setting is an array of (YAML) strings, which are normally the
Prometheus labels to group by, but if you use the string value '...'
all by itself, Alertmanager behaves specially. This can be written
in a way that looks like syntax instead of a string because YAML
allows a lot of unquoted things to be taken as strings (what YAML
calls plain scalars).
(I'm honestly not sure when you have to quote a YAML string.)
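To make the "array of strings with one special value" idea concrete, here is a minimal sketch of the kind of check involved. It is written in Python for illustration and is only loosely modeled on what Alertmanager does; the function name parse_group_by, the label-name pattern, and the error messages are my own assumptions, not Alertmanager's actual code. The input is the list of strings you get once the YAML has been decoded, so both group_by: [...] and group_by: ['...'] arrive here as the same one-element list.

```python
import re

# Assumption: an approximation of the usual Prometheus label-name rule.
LABEL_NAME_RE = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")

# The special string value; it is ordinary data, not YAML syntax.
WILDCARD = "..."


def parse_group_by(items):
    """Return (group_by_all, labels) for an already-decoded group_by list.

    Hypothetical helper for illustration only.
    """
    group_by_all = False
    labels = []
    for item in items:
        if item == WILDCARD:
            # The string '...' switches on "group by every label".
            group_by_all = True
        elif LABEL_NAME_RE.fullmatch(item):
            labels.append(item)
        else:
            raise ValueError(f"invalid label name {item!r} in group_by list")
    # '...' is only meaningful as the sole entry in the list.
    if group_by_all and labels:
        raise ValueError("cannot combine the '...' wildcard with other labels")
    return group_by_all, labels


print(parse_group_by(["..."]))                   # (True, [])
print(parse_group_by(["alertname", "cluster"]))  # (False, ['alertname', 'cluster'])
```

The point of the sketch is that nothing special happens at the YAML level. All of the special treatment comes from comparing an ordinary string against '...' after parsing.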
The way that Alertmanager documents this makes it reasonably clear that the '...' is an unusual label, not a bit of YAML syntax, since the documentation both explicitly says so and shows it in quoted form (except in a place where the quotes sort of have a different meaning). However, writing it without the explicit quotes makes things much more confusing unless you're already in tune enough with YAML to get what's going on.
My suspicion is that a lot of people aren't going to be that in tune with YAML, partly because YAML is complex, which makes it easy to believe that there's some aspect of YAML syntax you don't know or don't remember. Certainly this experience has reinforced my view that I should be as explicit as possible in our Prometheus YAML usage, even if it's not necessary under the rules. I should also use a consistent style about whether some things are always quoted or not, instead of varying it around for individual rules, configuration bits, and so on.
(Also I should generally avoid any clever YAML things unless I absolutely have to use them.)