Why I prefer the script exporter for exposing script metrics to Prometheus

January 5, 2020

Suppose that you have some scripts that you use to extract and generate Prometheus metrics for targets, and these scripts run on your Prometheus server. These metrics might be detailed SNTP metrics of (remote) NTP servers, IMAP and POP3 login performance metrics, and so on. You have at least three methods to expose these script metrics to Prometheus: you can run the scripts from cron and publish their metrics through either node_exporter's textfile collector or Pushgateway, or you can use the third-party script_exporter to run your scripts in response to Prometheus scrape requests (and return their metrics). Having used all three methods to generate metrics, I've come to usually prefer the script exporter except in one special case.

Conceptually, in all three methods you're getting metrics from some targets. In the cron-based methods, which targets you're getting which metrics from (and how frequently) is embedded in and controlled by scripts, cron.d files, and so on, not in your Prometheus configuration the way your other targets are. In the script exporter method, all of that knowledge of targets and timing is in your Prometheus configuration, just like your other targets. And just like other targets, you can configure additional labels on some of your script exporter scrapes, have different timings, and so on, all controlled in one place. If some targets need different checking options, you can set that in your Prometheus configuration as well.
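As an illustration (this is my sketch, not something from the entry itself), here is roughly what the Prometheus side might look like, assuming a script_exporter that follows the blackbox_exporter style 'probe' pattern. The job name, script name, target hosts, and exporter address are all hypothetical:

```yaml
scrape_configs:
  - job_name: 'sntp-checks'          # hypothetical job name
    metrics_path: /probe
    params:
      script: [sntp-check]           # hypothetical script name
    static_configs:
      - targets:
          - ntp1.example.org         # the real targets you care about
          - ntp2.example.org
    relabel_configs:
      # Pass the real target to the exporter as a URL parameter.
      - source_labels: [__address__]
        target_label: __param_target
      # Keep the real target as the 'instance' label.
      - source_labels: [__param_target]
        target_label: instance
      # Actually scrape the script exporter itself.
      - target_label: __address__
        replacement: 127.0.0.1:9469  # wherever your script_exporter listens
```

Because the targets are ordinary Prometheus targets, extra labels, different scrape intervals, and per-target options all live in this one configuration.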

You can do all of this with cron-based scripts, but you start littering your scripts, cron.d files, and so on with special cases. If you push it far enough, you're basically building your own additional set of target configurations, per-target options, and so on. Prometheus already has all of that ready for you to use (and it's not that difficult to make it general with the usual tricks, or the label-based approach).

There are two additional benefits from directly scraping metrics. First, the metrics are always current instead of delayed somewhat by however long Prometheus takes to scrape Pushgateway or the host agent. Related to this, you get automatic handling of staleness if something goes wrong and scrapes start failing. Second, you have a directly exposed metric for whether the scrape worked or whether it failed for some reason, in the form of the relevant up and script_success metrics. With indirect scraping you have to construct additional things to generate the equivalents.
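For instance, a sketch of an alert built on those metrics might look like the following; the job name and the ten-minute threshold are invented for illustration, and script_success is the success metric that some script exporters emit:

```yaml
groups:
  - name: script-checks
    rules:
      - alert: ScriptCheckFailing
        # Fires if Prometheus can't scrape the exporter at all (up == 0),
        # or if the exporter ran the script and it failed (script_success == 0).
        expr: up{job="sntp-checks"} == 0 or script_success{job="sntp-checks"} == 0
        for: 10m
```

With cron-based publishing you'd instead have to invent and maintain your own freshness and success metrics to get the same signal.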

The one situation where this doesn't work well is when you want a relatively slow metric generation interval. Because you're scraping directly, you have the usual Prometheus limitation where it considers any metric more than five minutes old to be stale. If you want to do your checks and generate your metrics only once every four or five minutes or slower, you're basically stuck publishing them indirectly so that they won't regularly disappear as stale, and this means one of the cron-based methods.
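In that situation, the cron plus textfile collector pattern looks roughly like this sketch; the metric name and output path are invented for illustration, and a real version would write into node_exporter's --collector.textfile.directory:

```shell
#!/bin/sh
# Minimal sketch of the cron + textfile collector pattern. The metric
# name and output path are made up; in production OUT would live in
# node_exporter's textfile collector directory.
OUT="./example_check.prom"

# Write to a temporary file and rename it into place, so that
# node_exporter never scrapes a partially written file.
TMP=$(mktemp "$OUT.XXXXXX") || exit 1
{
  printf '# TYPE example_check_timestamp_seconds gauge\n'
  printf 'example_check_timestamp_seconds %s\n' "$(date +%s)"
} >"$TMP" && mv "$TMP" "$OUT"
```

The write-then-rename step matters here: once the script runs from cron instead of inline with a scrape, there's always a chance of a scrape landing mid-write.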
