Another way to do easy configuration for lots of Prometheus Blackbox checks

September 2, 2019

Early on in our use of Prometheus, I wrote up a scheme for easy configuration of lots of Blackbox checks where I encoded the name of the Blackbox module to use in the names of the targets you configured, and then extracted them with relabeling. The result gave you target names that looked like:

 - ssh_banner,somehost:22
 - http_2xx,http://somewhere/url

This encodes the Blackbox module before the comma and the actual Blackbox target after it (you can use any suitable separator; I picked comma for how it looks).
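For illustration, here is a minimal sketch of relabeling that could unpack target names of this form. It is a reconstruction and not the configuration from the original entry; it assumes the separator is a comma and that a Blackbox exporter is listening on 127.0.0.1:9115.

```yaml
relabel_configs:
  # Everything before the first comma
  # becomes the Blackbox module.
  - source_labels: [__address__]
    regex: '([^,]+),(.*)'
    replacement: '$1'
    target_label: __param_module

  # Everything after the first comma
  # becomes the Blackbox target.
  - source_labels: [__address__]
    regex: '([^,]+),(.*)'
    replacement: '$2'
    target_label: __param_target

  # Point the scrape at a Blackbox
  # exporter (assumed to be local).
  - target_label: __address__
    replacement: 127.0.0.1:9115
```

Because the regular expression splits at the first comma, a URL target that itself contains commas would still end up intact in the target parameter.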

This works, but I've learned that there is another approach that is more natural and perhaps clearer, namely adding explicit additional labels to your targets and then using those labels in relabeling to determine things like the Blackbox module or even the target to check.

Let's start with the basics (since I didn't know this for a while), which is that a Prometheus 'targets' section of statically configured targets can have additional labels specified. The ostensible purpose of this (covered in the documentation) is to attach additional labels to all metrics scraped from the targets:

- targets:
  - 1.1.1.1:53
  - 8.8.8.8:53
  labels:
    type: external

(My initial use of this was to explicitly label some of the hosts we check as off-network hosts, because check failure for them is different from failure for our local machines.)

However, as covered in this prometheus-users message from Ben Kochie, these additional labels are already present when target relabeling runs, before the scrape happens, and so you can use relabeling to turn them into things like what Blackbox module to use. For example, suppose you add a 'module: ssh_banner' label to a set of targets that you want checked with that Blackbox module, and then have a relabeling configuration like the following:

# Set the target from the address,
# as usual
- source_labels: [__address__]
  target_label: __param_target

# Set the Blackbox module from
# the 'module:' label
- source_labels: [module]
  target_label: __param_module

# And now point the address to a
# local Blackbox as usual.
- target_label: __address__
  replacement: 127.0.0.1:9115

(As a disclaimer, I haven't actually tested this snippet.)
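To show where the pieces would sit, here is a sketch of a complete scrape job built around that relabeling. It is equally untested. The job name and the hosts are hypothetical, and the extra rule that sets 'instance' is the conventional Blackbox one, added here as an assumption and not part of the snippet above.

```yaml
scrape_configs:
  # 'blackbox_ssh' is a hypothetical name.
  - job_name: blackbox_ssh
    metrics_path: /probe
    static_configs:
      - targets:
          - somehost:22
          - otherhost:22
        labels:
          module: ssh_banner
    relabel_configs:
      # Target to check, from the address.
      - source_labels: [__address__]
        target_label: __param_target
      # Blackbox module, from the 'module' label.
      - source_labels: [module]
        target_label: __param_module
      # Assumed extra: make 'instance' the
      # checked target, not the exporter.
      - source_labels: [__param_target]
        target_label: instance
      # Scrape the local Blackbox exporter.
      - target_label: __address__
        replacement: 127.0.0.1:9115
```

Since 'module' is an ordinary target label, it should also stay attached to the resulting metrics, which may or may not be what you want.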

I see advantages and disadvantages to this approach. One advantage is that it's likely to be clearer and more normal. People are (or should be) used to attaching extra labels to static targets, and it's clearly documented, so the only magic and mystery is how your additional module label takes effect. While I like my original syntax, it's clearly more magical and unusual; you're going to have to read the relabeling configuration to understand what's going on and how to write additional things.

One drawback is that it pretty much forces you to group checks by module instead of by target. With my scheme, you can list several checks for a host together:

- targets:
  [...]
  - ssh_banner,host:22
  - smtp_banner,host:25
  - http_2xx,http://host/url

With an explicit label-based approach to selecting the module, each of these has to be in a separate static configuration section because they each need a different module label. On the other hand, this pushes you toward listing all of your checks for a given Blackbox module in one spot.
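As a sketch, the same three checks written in the label-based style might look like this, with one static configuration section per module (the surrounding job and relabeling are assumed to be as described earlier):

```yaml
static_configs:
  - targets:
      - host:22
    labels:
      module: ssh_banner
  - targets:
      - host:25
    labels:
      module: smtp_banner
  - targets:
      - http://host/url
    labels:
      module: http_2xx
```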

A place where this can be an active drawback is if you need to vary additional labels for groups of targets, especially across modules. For instance, if you want to attach a 'dc' label to all Blackbox metrics from a group of hosts, you now need to split up those per-module sections (each with a 'module' label) into multiple sections, one for each combination of module and dc. This could easily get pretty verbose (although it might not matter if you're automatically generating this from external configuration information).
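To make the growth concrete, here is a sketch with two modules and two values of a 'dc' label. The host names and the 'east' and 'west' values are made up for illustration.

```yaml
static_configs:
  # ssh_banner, dc east
  - targets: ['hosta:22']
    labels:
      module: ssh_banner
      dc: east
  # ssh_banner, dc west
  - targets: ['hostb:22']
    labels:
      module: ssh_banner
      dc: west
  # http_2xx, dc east
  - targets: ['http://hosta/url']
    labels:
      module: http_2xx
      dc: east
  # http_2xx, dc west
  - targets: ['http://hostb/url']
    labels:
      module: http_2xx
      dc: west
```

Two modules and two dc values already need four sections; each additional module or label value multiplies the count.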

I probably won't be changing our configuration from my current trick to this more straightforward approach, but I'm going to bear it in mind for future use. Partly this is because our setup already exists and works, and partly it's because we use some additional labels now and I want to preserve our freedom to easily use more in the future.

Written on 02 September 2019.

Last modified: Mon Sep 2 22:41:03 2019