2022-06-16
I wish Grafana dashboards and panels could have easy, natural comments
Recently I was looking at a panel in one of our Grafana dashboards and noticed that its PromQL queries used avg_over_time() when it (now) felt as if max_over_time() was what the panel should be using. It's been years since I created this panel and last touched it, and I definitely no longer remember what I was thinking at the time. Did I have a good reason that avg_over_time() was necessary, or did an average just feel more correct for the purpose of the panel at the time I created it?
We have Prometheus alert rules with similarly tangled PromQL expressions, but since Prometheus alert rules are (normally) configured in YAML text files that allow comments, most of our complicated and non-obvious alert rules have commentary about why they're that way, what options don't work, and so on. This commentary has periodically been extremely helpful for refreshing my memory about what on earth past me was thinking when he wrote the rule.
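(As a rough sketch of what this sort of commentary can look like, here is a hypothetical rule file. The group name, alert name, threshold, and the reasoning in the comments are all invented for illustration and are not one of our actual rules; node_load1 is assumed to come from the usual node exporter.)

```yaml
groups:
  - name: example-host-alerts        # hypothetical group name
    rules:
      - alert: ExampleHostHighLoad   # hypothetical alert name
        # Why max_over_time() and not avg_over_time(): in this made-up
        # scenario we care about short load spikes, and an average over
        # the window would smooth them away.
        # What doesn't work: alerting on the raw node_load1 value flaps
        # whenever a single scrape happens to land on a brief spike.
        expr: max_over_time(node_load1[10m]) > 20
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High load on {{ $labels.instance }}"
```

The point is not the rule itself but that the 'why' sits right next to the expression it explains, where future you will see it.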
Setting up Grafana panels and dashboards is normally done through their web GUI, which doesn't really offer any good way of writing this sort of commentary. In theory the GUI could offer space and options for this; in practice, such things would probably be considered clutter by designers (and they would sort of be right). Writing comments isn't anywhere near as natural in a GUI environment as it is in text.
(There are ways of provisioning Grafana from text files, but they aren't the natural way to use Grafana. One sign of this is that Grafana natively stores your dashboards and panels in a SQLite database instead of a more externally editable format.)
I don't know if I would have written a comment about this particular PromQL query even if Grafana had comments; I might have considered it obvious at the time and not done so. But at least it would have been more likely, and even if I hadn't left a comment back then, I probably would now (now that I've stubbed my toe on this).
Basically, everything that you can configure in any complex way should have comments. Grafana dashboards and panels definitely count.
(I was looking at this panel for unhappy reasons.)