In practice, Grafana has not been great at backward compatibility

May 26, 2023

We started our Prometheus and Grafana based metrics setup in late 2018. Although many of our Grafana dashboards weren't created immediately, the majority of them were probably built by the middle of 2019. Based on release history, we probably started somewhere around v6.4.0 and had many dashboards done by the time v7.0.0 came out. We're currently frozen on v8.3.11, having tried v8.4.0 and rejected it and all subsequent versions. The reason for this is fairly straightforward; from v8.4.0 onward, Grafana broke too many of our dashboards. The breakage didn't start in 8.4, to be honest. For us, things started to degrade from the change between the 7.x series and 8.0, but 8.4 was the breaking point where too much was off or not working.

(I've done experiments with Grafana v9.0 and onward, and those versions had more issues than the latest 8.x releases. In one way this isn't too surprising, since 9.0 is a new major release.)

I've encountered issues in several areas in Grafana during upgrades. Grafana's handling of null results from Prometheus queries has regressed more than once while we've been using it. Third party panels that we use have been partially degraded or sometimes completely broken. Old panel types sprouted new bugs; new panel types that were supposed to replace them had new bugs, or sometimes lacked important functionality that the old panel types had. Upgrading (especially automatically) from old panel types to their nominally equivalent new panel types didn't always carry over all of your settings (for settings the new panel type supported, which wasn't always all of them).
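Since much of this breakage follows panel types (old core panels, their replacements, and third party plugins), one way to gauge your exposure before an upgrade is to count which panel types your dashboards actually use. The following is a minimal sketch, not something from our setup. It assumes you already have each dashboard's JSON model loaded as a Python dict (for example from exported files), that panels live in a top-level "panels" list or in an older-style "rows" list, and that collapsed rows nest their panels. The function names are made up for illustration.

```python
from collections import Counter


def iter_panels(dashboard):
    """Yield every panel dict in one dashboard's JSON model."""
    # Assumption: older dashboards keep panels under a top-level "rows" list.
    for row in dashboard.get("rows", []):
        yield from row.get("panels", [])
    for panel in dashboard.get("panels", []):
        yield panel
        # Assumption: collapsed row panels carry their children in a
        # nested "panels" list.
        yield from panel.get("panels", [])


def count_panel_types(dashboards):
    """Count panel "type" values across an iterable of dashboard dicts."""
    counts = Counter()
    for dashboard in dashboards:
        for panel in iter_panels(dashboard):
            counts[panel.get("type", "unknown")] += 1
    return counts


if __name__ == "__main__":
    # A tiny hand-written dashboard, purely for illustration.
    example = {
        "title": "Example",
        "panels": [
            {"type": "graph", "title": "Load"},
            {"type": "row", "title": "Disks",
             "panels": [{"type": "graph"}, {"type": "stat"}]},
        ],
    }
    for panel_type, n in count_panel_types([example]).most_common():
        print(panel_type, n)
```

Panel types with high counts that are deprecated or come from third party plugins are the ones most worth checking by hand on a test server.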

Grafana is developed and maintained by competent people. That these backward compatibility issues happen anyway tells me that broad backward compatibility is not a priority in Grafana development. This is a perfectly fair thing; the Grafana team is free to pick their priorities (for example, not preserving compatibility for third party panels if they feel the API being used is sub-par and needs to change). But I'm free to quietly react to them, as I have by freezing on 8.3.x, the last release where things worked well enough.

I personally think that Grafana's periodic lack of good backward compatibility is not a great thing. Dashboards are not programs, and I can't imagine that many places want them to be in constant development. I suspect that there are quite a lot of places that want to design and create their dashboards and then have them just keep working until the metrics they draw on change (forcing the dashboards to change to keep up). Having to spend time on dashboards simply to keep them working as they are is not going to leave people enthused, especially if the new version doesn't work as well as the old version.

The corollary of this is that I think you should maintain a testing Grafana server, kept up to date with your primary server's dashboards, where you can apply Grafana updates to test them to see if anything you care about is broken or sufficiently different to cause you problems. You should probably also think about what might happen if you have to either freeze your version of Grafana or significantly rebuild your dashboards to cope with a new version. If you allow lots of people to build their own dashboards, perhaps you want to consider how to reach out to them to get them to test their dashboards or let them know of issues you've found and the potential need to update their dashboards.
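Keeping a test server's dashboards in step with the primary can be done in several ways. As one hedged sketch (not necessarily how any particular site does it), the script below copies dashboards over Grafana's HTTP API using only the Python standard library. The server URLs and tokens are placeholders, the function names are invented for this example, and it deliberately ignores folders, data source differences between the two servers, and paging of search results.

```python
import json
import urllib.request


def api_call(base_url, token, path, body=None):
    """Call a Grafana HTTP API path; sends a POST when body is given."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(
        base_url.rstrip("/") + path,
        data=data,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def make_import_payload(exported):
    """Turn a GET /api/dashboards/uid/<uid> result into a save request."""
    dashboard = dict(exported["dashboard"])
    # The numeric id is local to the source server; clear it and let the
    # uid identify the dashboard on the destination.
    dashboard["id"] = None
    return {"dashboard": dashboard, "overwrite": True}


def sync_dashboards(src_url, src_token, dst_url, dst_token):
    """Copy every dashboard the search API returns from src to dst."""
    # Assumption: one search call returns all dashboards (no paging).
    for hit in api_call(src_url, src_token, "/api/search?type=dash-db"):
        exported = api_call(
            src_url, src_token, f"/api/dashboards/uid/{hit['uid']}"
        )
        api_call(
            dst_url, dst_token, "/api/dashboards/db",
            make_import_payload(exported),
        )


if __name__ == "__main__":
    # Placeholder URLs and tokens; substitute your own.
    sync_dashboards(
        "https://grafana.example.org", "SOURCE_TOKEN",
        "https://grafana-test.example.org", "TEST_TOKEN",
    )
```

Because saves use overwrite, anything you changed on the test server while experimenting is replaced on the next sync, which is usually what you want for a server whose job is to mirror production.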

(I didn't bother filing bug reports about the Grafana issues that I encountered, because my experience with filing other Grafana issues was that doing so didn't produce results. I'm sure that there are many reasons for this, including that Grafana probably gets a lot of issues filed against it.)

