Wandering Thoughts archives

2021-02-24

How convenience in Prometheus labels for alerts led me into a quiet mistake

In our Prometheus setup, we have a system of alerts that are in testing, not in production. As I described recently, this is implemented by attaching a special label with a special value to each alert, in our case a 'send' label with the value of 'testing'; this is set up in our Prometheus alert rules. This is perfectly sensible.
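
As a concrete illustration, such a rule carries the 'send' label right in its definition. This is only a sketch; the alert name, expression, and threshold here are made up, and the 'labels:' block is the real point:

groups:
  - name: testing
    rules:
      - alert: HostHighLoadTesting
        expr: node_load1 > 10.0
        for: 10m
        labels:
          send: testing
        annotations:
          summary: "Load is high on {{ $labels.instance }}"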

In addition to alerts that are in testing, we also have some machines that aren't in production or that I'm only monitoring on a test basis. Because these aren't production machines, I want any alerts about these machines to be 'testing' alerts, even though the alerts themselves are production alerts. When I started thinking about it, I realized that there was a convenient way to do this because alert labels are inherited from metric labels and I can attach additional labels to specific scrape targets. This means that all I need to do to make all alerts for a machine that are based on the host agent's metrics into testing alerts is the following:

- targets:
    - production:9100
  [...]

- labels:
    send: testing
  targets:
    - someday:9100

I can do the same for any other checks, such as Blackbox checks. This is quite convenient, which encourages me to actually set up testing monitoring for these machines instead of letting them go unmonitored. But there's a hidden downside to it.

When we promote a machine to production, obviously we have to turn alerts about it into regular alerts instead of testing alerts. Mechanically this is easy to do; I move the 'someday:9100' target up to the main section of the scrape configuration, which means it no longer gets the 'send="testing"' label on its metrics. Which is exactly the problem, because in Prometheus a time series is identified by its labels (and their values). If you drop a label or change the value of one, you get a different time series. This means that the moment we promote a machine to production, it's as if we dropped the old pre-production version of it and added a completely different machine (that coincidentally has the same name, OS version, and so on).
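
To make this concrete, here is what the change looks like for one metric, with a made-up set of labels (the 'job' value and so on are illustrative, not our real ones):

# before promotion, while someday:9100 is scraped with the extra label:
node_load1{instance="someday:9100", job="node", send="testing"}

# after promotion; to Prometheus this is an entirely new time series:
node_load1{instance="someday:9100", job="node"}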

Some PromQL expressions will allow us to awkwardly overcome this if we remember to use 'ignoring(send)' or 'without(send)' in the appropriate place. Other expressions can't be fixed up this way; anything using 'rate()' or 'delta()', for example. A 'rate()' across the transition boundary sees two partial time series, not one complete one.
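
As a sketch of what the awkward fixes look like (the expressions are made up, not from our real rules), both the binary operator and the aggregation versions have to be told about the 'send' label before the old and new series will line up:

# A comparison against week-old data crosses the boundary; the current
# series has no 'send' label but the old one does, so ignore it.
node_load1 > ignoring(send) (2 * avg_over_time(node_load1[1h] offset 1w))

# An *_over_time() that spans the boundary returns two partial series,
# one with send="testing" and one without; merge them back together.
max without(send) (max_over_time(node_load1[2w]))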

What this has made me realize is that I want to think carefully before putting temporary things in Prometheus metric labels. If possible, all labels (and label values) on metrics should be durable. Whether or not a machine is an external one is a durable property, and so is fine to embed in a metric label; whether or not it's in testing is not.

Of course this is not a simple binary decision. Sometimes it may be right to effectively start metrics for a machine from scratch when it goes into production (or otherwise changes state in some significant way). Sometimes its configuration may be changed around in production, and beyond that what it's experiencing may be different enough that you want a clear break in metrics.

(And if you want to compare the metrics in testing to the metrics in production, you can always do that by hand. The data isn't gone; it's merely in a different time series, just as if you'd renamed the machine when you put it into production.)

sysadmin/PrometheusHostLabelMistake written at 23:01:31

How (and where) Prometheus alerts get their labels

In Prometheus, you can and usually do have alerting rules that evaluate expressions to create alerts. These alerts are usually passed to Alertmanager and they are visible in Prometheus itself as a couple of metrics, ALERTS and ALERTS_FOR_STATE. These metrics can be used to do things like find out the start time of alerts or just display a count of currently active alerts on your dashboard. Alerts almost always have labels (and values for those labels), which tend to be used in Alertmanager templates to provide additional information alongside annotations, which are subtly but crucially different.
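
For example, the alert count and the start time are simple queries; these are generic, with a made-up alert name:

# number of alerts that are actually firing (ALERTS also includes
# alerts that are still 'pending', so filter on alertstate):
count(ALERTS{alertstate="firing"})

# the value of ALERTS_FOR_STATE is the Unix timestamp of when the
# alert started being active, ie its start time:
ALERTS_FOR_STATE{alertname="SomeAlert"}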

All of this is standard Prometheus knowledge and is well documented, but what doesn't seem to be well documented is where alert labels come from (or at least I couldn't find it said explicitly in any of the obvious spots in the documentation). Within Prometheus, the labels on an alert come from two places. First, you can explicitly add labels to the alert in the alert rule, which can be used for things like setting up testing alerts. Second, the basic labels for an alert are whatever labels come out of the alert expression. This can have some important consequences.

If your alert expression is a simple one that just involves basic metric operations, for example 'node_load1 > 10.0', then the basic labels on the alert are the same labels that the metric itself has; all of them will be passed through. However, if your alert expression narrows down or throws away some labels, then those labels will be missing from the end result. One of the ways to lose labels in alert expressions is to use 'by (...)', because this discards all labels other than the 'by (whatever)' label or labels. You can also deliberately pull in labels from additional metrics, perhaps as a form of database lookup (and then you can use these additional labels in your Alertmanager setup).
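
To illustrate both of these, here are two made-up alert expressions (neither is from our real rules, and 'our_host_info' is a hypothetical info-style metric):

# only 'job' survives onto the alert; instance, device, and every
# other label from the underlying metric is thrown away:
sum by (job) (rate(node_network_receive_errs_total[5m])) > 10

# a 'database lookup': copy an 'owner' label onto the alert from
# our_host_info via a group_left join on instance:
node_load1 * on (instance) group_left(owner) our_host_info > 10.0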

Prometheus itself also adds an alertname label, with the name of the alert as its value. The ALERTS metric in Prometheus also has an alertstate label, but this is not passed on to the version of the alert that Alertmanager sees. Additionally, as part of sending alerts to Alertmanager, Prometheus can relabel alerts in general to do things like canonicalize some labels. This can be done either for all Alertmanager destinations or only for a particular one, if you have more than one of them set up. This only affects alerts as seen by Alertmanager; the version in the ALERTS metric is unaffected.
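
The 'all destinations' version of this lives under alerting.alert_relabel_configs in the Prometheus configuration. As a minimal sketch (the label name, regex, and Alertmanager address are hypothetical), canonicalizing a 'host' label might look like:

alerting:
  alert_relabel_configs:
    # strip a domain suffix so alerts always carry short host names,
    # however the alert rule happened to spell them:
    - source_labels: [host]
      regex: '(.*)\.example\.org'
      target_label: host
      replacement: '${1}'
      action: replace
  alertmanagers:
    - static_configs:
        - targets:
            - alertmanager.example.org:9093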

(This can be slightly annoying if you're building Grafana dashboards that display alert information using labels that your alert relabeling changes.)

PS: In practice, people who use Prometheus work out where alert labels come from almost immediately. It's both intuitive (alert rules use expressions, expression results have labels, and so on) and obvious once you have some actual alerts to look at. But if you're trying to decode Prometheus on your first attempt, neither this nor its consequences are obvious.

sysadmin/PrometheusAlertsWhereLabels written at 00:19:23

