Topic: Prometheus and Grafana

This collects most or all of the entries I've written on Prometheus and Grafana, in reverse chronological order. You can also see the overall index of entries (or the chronological index).

2021-06-29: Monitoring the status of Linux network interfaces with Prometheus
2021-06-17: In Prometheus queries, on and ignoring don't drop labels from the result
2021-05-27: Some thoughts on having set up a personal Alertmanager instance
2021-05-22: I don't know how much memory our Prometheus setup needs
2021-05-15: The size of our Prometheus setup as of May 2021
2021-04-12: Counting how many times something started or stopped failing in Prometheus
2021-04-04: Some uses for Prometheus's resets() function
2021-03-31: Understanding Prometheus' changes() function and what it can do for me
2021-03-14: I wish Prometheus had some features to deal with 'missing' metrics
2021-03-13: Prometheus and the case of the stuck metrics
2021-02-24: How convenience in Prometheus labels for alerts led me into a quiet mistake
            How (and where) Prometheus alerts get their labels
2021-02-22: How I set up testing alerts in our Prometheus environment
2021-01-10: What timestamps you get back along with Prometheus query results
2021-01-08: How to extract raw time series data from Prometheus
2020-12-30: Some ways to do a Prometheus query as of a given time
            A Prometheus wish: easy ways to evaluate a PromQL query at a given time
2020-12-15: In Prometheus, it's hard to work with when metric points happened
2020-12-02: Prometheus 2.23.0 now lets you display graphs in local time
2020-11-17: Grafana and the case of the infinite serial number
2020-10-31: A gotcha with combining single-label and multi-label Prometheus metrics
2020-10-17: A potential Prometheus issue for labeled metrics for infrequent events
2020-08-18: The Prometheus host agent can disturb Linux CPU frequency measurements
2020-08-07: How we choose our time intervals in our Grafana dashboards
2020-07-14: Link: The Anatomy of a PromQL Query
2020-06-29: How Prometheus Blackbox's TLS certificate metrics would have reacted to AddTrust's root expiry
2020-06-25: What Prometheus Blackbox's TLS certificate expiry metrics are checking
2020-06-05: Why we put alert start and end times in our Prometheus alert messages
2020-06-04: Formatting alert start and end times in Prometheus Alertmanager messages
2020-05-22: Working out how frequently your ICMP pings fail in Prometheus
2020-03-30: Notes on Grafana 'value groups' for dashboard variables
2020-03-28: The Prometheus host agent's CPU utilization metrics can be a bit weird
2020-03-19: Make sure to keep useful labels in your Prometheus alert rules
2020-02-29: OpenBSD versus Prometheus (and Go)
2020-02-27: Some alert inhibition rules we use in Prometheus Alertmanager
2020-02-26: The magic settings to make a bar graph in Grafana
2020-01-26: How big our Prometheus setup is (as of January 2020)
2020-01-05: Why I prefer the script exporter for exposing script metrics to Prometheus
2020-01-04: Three ways to expose script-created metrics in Prometheus
2019-12-30: The history and background of us using Prometheus
2019-12-29: Prometheus and Grafana after a year (more or less)
2019-12-28: Our setup of Prometheus and Grafana (as of the end of 2019)
2019-12-02: You can have Grafana tables with multiple values for a single metric (with Prometheus)
            Calculating usage over time in Prometheus (and Grafana)
2019-11-30: Counting the number of distinct labels in a Prometheus metric
2019-11-25: In Prometheus, don't be afraid of high cardinality metrics if they're valuable enough
2019-10-08: How we implement reboot notifications when our machines reboot in Prometheus
2019-09-17: Finding metrics that are missing labels in Prometheus (for alert metrics)
2019-09-02: Another way to do easy configuration for lots of Prometheus Blackbox checks
2019-08-26: A lesson of (alert) scale we learned from a power failure
2019-07-28: A note on using the Go Prometheus client package to exposed labeled metrics
2019-06-28: Using Prometheus's statsd exporter to let scripts make metrics updates
2019-06-02: Exploring the start time of Prometheus alerts via ALERTS_FOR_STATE
2019-05-20: Understanding how to pull in labels from other metrics in Prometheus
2019-05-03: Some implications of using offset instead of delta() in Prometheus
2019-04-28: A gotcha with stale metrics and *_over_time() in Prometheus
2019-04-26: Brief notes on making Prometheus instant queries with curl
2019-04-21: My view on upgrading Prometheus (and Grafana) on an ongoing basis
2019-04-18: A pattern for dealing with missing metrics in Prometheus in simple cases
2019-04-13: Remembering that Prometheus expressions act as filters
2019-03-24: Prometheus's delta() function can be inferior to subtraction with offset
2019-03-18: Prometheus subqueries pick time points in a surprising way
2019-03-12: An easy optimization for restricted multi-metric queries in Prometheus
2019-03-11: Testing Prometheus alert conditions through subqueries
2019-03-10: What the default query step is for Prometheus subqueries
2019-03-06: Using Prometheus subqueries to look for spikes in rates
2019-02-27: Using Prometheus subqueries to do calculations over time ranges
2019-02-17: Some notes on heatmaps and histograms in Prometheus and Grafana
2019-01-23: A little surprise with Prometheus scrape intervals, timeouts, and alerts
2018-12-14: Why our Grafana URLs always require HTTP Basic Authentication
2018-12-12: One situation where you absolutely can't use irate() in Prometheus
2018-12-03: Linux disk IO stats in Prometheus
2018-11-25: How we monitor our Prometheus setup itself
2018-11-20: When Prometheus Alertmanager will tell you about resolved alerts
2018-11-11: Easy configuration for lots of Prometheus Blackbox checks
2018-11-10: Why Prometheus turns out not be our ideal alerting system
2018-11-09: Getting CPU utilization breakdowns efficiently in Prometheus
2018-11-05: rate() versus irate() in Prometheus (and Grafana)
2018-10-28: How I'm visualizing health check history in Grafana
2018-10-22: Using group_* vector matching in Prometheus for database lookups
2018-10-18: Some things on delays and timings for Prometheus alerts
2018-10-17: When metrics disappear on updates with Prometheus Pushgateway
2018-10-13: Getting a CPU utilization breakdown in Prometheus's query language, PromQL
            How Prometheus's query steps (aka query resolution) work
2018-10-11: Some notes on Prometheus's Blackbox exporter

Last modified: Thu May 27 23:27:38 2021
