Easy configuration for lots of Prometheus Blackbox checks
Suppose, not entirely hypothetically, that you want to do a lot of Prometheus Blackbox checks, and worse, these are all sorts of different checks (not just the same check against a lot of different hosts). Since the only way to specify a lot of Blackbox check parameters is with different Blackbox modules, this means that you need a bunch of different Blackbox modules. The examples of configuring Prometheus Blackbox probes that you'll find online all set the Blackbox module as part of the scrape configuration; for example, straight from the Blackbox README, we have this in their example:
    - job_name: 'blackbox'
      metrics_path: /probe
      params:
        module: [http_2xx]
      [...]
You can do this for each of the separate modules you need to use, but that means many separate scrape configurations and for each separate scrape configuration you're going to need those standard seven lines of relabeling configuration. This is annoying and verbose, and it doesn't take too many of these before your Prometheus configuration file is so overgrown with many Blackbox scrapes that it's hard to see anything else.
(It would be great if Prometheus could somehow macro-ize these or
include them from a separate file or otherwise avoid repeating
everything for each scrape configuration, but so far, no such luck.
You can't even move some of your scrape configurations into a
separate included file; they all have to go in the main
prometheus.yml.)
Fortunately, with some cleverness in our relabeling configuration
we can actually embed the name of the module we want to use into
our Blackbox target specification, letting us use one Blackbox
scrape configuration for a whole bunch of different modules. The
trick is that what's necessary for Blackbox checks is that by the
end of setting up a particular scrape, the module parameter is in the
__param_module label. Normally it winds up there because
we set it in the params section of the scrape configuration, but
we can also explicitly put it there through relabeling (just as we
set __address__ by hand through relabeling).
So, let's start with nominal declared targets that look like this:
    - ssh_banner,somehost:25
    - http_2xx,http://somewhere/url
This encodes the Blackbox module before the comma and the actual Blackbox target after it (you can use any suitable separator; I picked comma for how it looks).
Our first job with relabeling is to split this apart into the
module and target URL parameters, which are the magic
__param_module and __param_target labels:
    relabel_configs:
      - source_labels: [__address__]
        regex: ([^,]*),(.*)
        replacement: $1
        target_label: __param_module
      - source_labels: [__address__]
        regex: ([^,]*),(.*)
        replacement: $2
        target_label: __param_target
(It's a pity that there's no way to do multiple targets and replacements in one rule, or we could make this much more compact. But I'm probably far from the first person to observe that Prometheus relabeling configurations are very verbose. Presumably Prometheus people don't expect you to be doing very much of it.)
Since we're doing all of our Blackbox checks through a single scrape
configuration, we won't normally be able to easily tell which module
(and thus which check) failed. To make life easier, we explicitly
save the Blackbox module as a new label, which I've called probe:
      - source_labels: [__param_module]
        target_label: probe
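As a sketch of why this label is handy, an alerting rule can use it to say which check failed. This is only an illustration: the group name, the alert name, and the five-minute duration are made up for the example. The probe_success metric is the standard result that Blackbox returns for every probe, and the instance label is the one we set in the rest of the relabeling.

```yaml
# Hypothetical rules file; the names and the duration are assumptions.
groups:
  - name: blackbox
    rules:
      - alert: BlackboxProbeFailed
        # probe_success is 0 when a Blackbox probe fails.
        expr: probe_success == 0
        for: 5m
        annotations:
          # 'probe' is the label saved from __param_module above.
          summary: '{{ $labels.probe }} check failed for {{ $labels.instance }}'
```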
Now the rest of our relabeling is essentially standard; we save the
Blackbox target as the
instance label and set the actual address
of our Blackbox exporter:
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 127.0.0.1:9115
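Put together, the whole scrape configuration might look something like the following. This is a sketch, not a tested configuration: it assumes the targets are declared inline with static_configs (they could just as well come from file-based service discovery), and it reuses the example job name and targets from above.

```yaml
scrape_configs:
  - job_name: 'blackbox'
    metrics_path: /probe
    # Assumption: targets listed inline; each one is 'module,real-target'.
    static_configs:
      - targets:
          - ssh_banner,somehost:25
          - http_2xx,http://somewhere/url
    relabel_configs:
      # Everything before the first comma becomes the Blackbox module.
      - source_labels: [__address__]
        regex: '([^,]*),(.*)'
        replacement: $1
        target_label: __param_module
      # Everything after the first comma becomes the Blackbox target.
      - source_labels: [__address__]
        regex: '([^,]*),(.*)'
        replacement: $2
        target_label: __param_target
      # Save the module name so we can tell which check failed.
      - source_labels: [__param_module]
        target_label: probe
      # Standard Blackbox relabeling from here on.
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 127.0.0.1:9115
```

Note that there is no params section at all here; the module comes entirely from the target string.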
All of this works fine, but there turns out to be one drawback of putting all or a lot of your Blackbox checks in a single scrape configuration, which is that you can't set the Blackbox check interval on a per-target or per-module basis. If you need or want to vary the check interval for different checks (ie, different Blackbox modules) or even different targets, you'll need to use separate scrape configurations, even with all of the extra verbosity that that requires.
(As you might suspect, I've decided that I'm mostly fine with a lot of our Blackbox checks having the same frequency. I did pull ICMP ping checks out into a separate scrape configuration so that we can do them a lot more frequently.)
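A separate scrape configuration for more frequent ping checks could look roughly like this. The job name, the 15-second interval, and the target are all assumptions for illustration, and the module name icmp assumes your Blackbox configuration defines a module by that name (the Blackbox example configuration does). The relabeling is the usual form from the Blackbox README, since here the module is fixed in params.

```yaml
scrape_configs:
  - job_name: 'blackbox_icmp'    # hypothetical job name
    scrape_interval: 15s         # assumed; overrides the global interval for this job only
    metrics_path: /probe
    params:
      module: [icmp]             # assumes an 'icmp' module exists in blackbox.yml
    static_configs:
      - targets:
          - somehost
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 127.0.0.1:9115
```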
PS: If you wanted to, you could go further than this in relabeling;
for instance, you could automatically add the :25 port specification
on the end of hostnames for SSH banner checks. But it's my view
that there's a relatively low limit on how much of this sort of
rewriting one should do. Rewriting to avoid having a massive
prometheus.yml is within my comfort limit here; rewriting just to
avoid putting a ':25' on hostnames is not. There is real merit to
being straightforward and sticking as close to normal Prometheus
practice as possible, without extra magic.
(I think that the 'module,real-target' format of target names I've adopted here is relatively easy to see and understand even if you don't know how it works, but I'm biased and may be wrong.)