I wish Prometheus had a table-driven label remapping feature
We operate a variety of websites and services that are known to users under general names (such as the website 'support.cs') but which are implemented by specific, known machines, such as our primary web server. When we do Blackbox external checks on these services, we have to do it under their general name, and by default this generic name will flow through to our "host" label. In turn, this means that if something happens to a machine (such as its Apache stopping responding), by default we'll get a number of alerts about different nominal hosts.
The general Prometheus solution to this is relabeling. You can do this either as the metrics are ingested from Blackbox probes or as the alerts are sent to Alertmanager. However, right now doing this in bulk is awkward. If you have a bunch of services that are implemented by a bunch of different machines, what you wind up with is a bunch of relabeling rules that look like:
- source_labels: ["host"]
  regex: "(name1|othername|virtual2)"
  replacement: realhost1
  target_label: host
You have to have one of these for each real host with its list of services.
What I wish for from Prometheus is some way to do a table based lookup for label remapping, where you could list a bunch of source matches and target values. Then we could handle all of these service name to real host remappings in one place, with one rewrite rule and ideally one table in a file.
(In our case, canonicalization by reverse lookup of the target IP isn't sufficient in all cases, because some services are deliberately offered on IP aliases of the relevant hosts.)
Sadly, I suspect that this is a sufficiently obscure or unpopular usage that Prometheus isn't likely to support it. There's also no obvious syntax or small feature addition that could do it, especially if you want to use a file for the mapping table. YAML does have a syntax for maps (aka dictionaries), so you could at least write an inline regex_map YAML map that had a bunch of regexes as the keys and then replacements as the values, but that doesn't fit in nicely with how the replacement attribute is defined.
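To make the idea concrete, here is a purely hypothetical sketch of what such an inline map might look like. Prometheus does not support a regex_map field, so this is not valid configuration today; the second entry's names (name3, virtual4, realhost2) are placeholders made up for illustration.

```yaml
# HYPOTHETICAL: regex_map is NOT a real Prometheus relabeling field.
# This only illustrates the wished-for table-style remapping.
- source_labels: ["host"]
  target_label: host
  regex_map:
    # key: regex matched against the host label; value: replacement
    "(name1|othername|virtual2)": realhost1
    "(name3|virtual4)": realhost2
```

Even in this form there is no replacement attribute at all, which is part of why it sits awkwardly next to how ordinary relabeling rules are written.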
PS: If you have to do this manually, with a bunch of specific relabeling rules, I think it's more maintainable to do this in alert relabeling. Otherwise you may have to replicate all of your relabeling across different scrape jobs, for example if you use both Blackbox and a script exporter, as we do.
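As a minimal sketch, assuming the same placeholder names as the earlier rule, the alert relabeling version might look something like this in the main Prometheus configuration file. The alert_relabel_configs section is applied to alerts before they are sent to Alertmanager, so a single rule covers alerts regardless of which scrape job produced the underlying metrics.

```yaml
alerting:
  alert_relabel_configs:
    # Rewrite the generic service names to the real machine's name.
    # The regex is fully anchored, so only exact matches are rewritten.
    - source_labels: ["host"]
      regex: "(name1|othername|virtual2)"
      replacement: realhost1
      target_label: host
```

You still need one such rule per real host, but at least they all live in one place.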
Sidebar: The per-target label approach is worse
The alternate approach is to define labels with targets. Unfortunately the resulting YAML is relatively terrible, at least in my view. You can't just define some labels on a per-target basis; instead, you have to break things up into awkward blocks of
- labels:
    host: realhost1
  targets:
    - http://something1/
    - something2
- labels:
    host: realhost2
  targets:
    - https://service/
    - aservice:2048
Do you want all of your HTTP and HTTPS checks in one place so you can keep track of all of them? If you apply labels to targets in the configuration, you're out of luck.
Life gets worse if you're using both Blackbox and a script exporter to probe hosts, because you get to repeat these mappings in two files. Or more, if you have other parameters you want to vary.