I have divided (and partly uninformed) views on OpenTelemetry

June 3, 2025

OpenTelemetry ('OTel') is one of the current in things in the broad metrics and monitoring space. As I understand it, it's fundamentally a set of standards (ie, specifications) for how things can emit metrics, logs, and traces; the intended purpose is (presumably) so that people writing programs can stop having to decide if they expose Prometheus format metrics, or Influx format metrics, or statsd format metrics, or so on. They expose one standard format, OpenTelemetry, and then everything (theoretically) can consume it. All of this has come on to my radar because Prometheus can increasingly ingest OpenTelemetry format metrics and we make significant use of Prometheus.

If OpenTelemetry is just another metrics format that things will produce and Prometheus will consume just as it consumes Prometheus format metrics today, that seems perfectly okay. I'm pretty indifferent to the metrics formats involved, presuming that they're straightforward to generate and I never have to drop everything and convert all of our things that generate (Prometheus format) metrics to generating OpenTelemetry metrics. This would be especially hard because OpenTelemetry seems to require either Protobuf or (complex) JSON, while the Prometheus metrics format is simple text.
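(To make the 'simple text' versus '(complex) JSON' contrast concrete, here is a rough sketch in Python. The metric and label names are made up for illustration, the helper functions are hypothetical rather than part of any library, and the JSON shape is my from-memory approximation of an OTLP/JSON gauge, so treat it as illustrative and not as a reference encoding.)

```python
import json


def _escape(value):
    # Label values in the Prometheus text format escape backslash,
    # double quote, and newline.
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def prom_line(name, labels, value):
    """Render one sample as a line of Prometheus text exposition format."""
    if labels:
        pairs = ",".join(f'{k}="{_escape(v)}"' for k, v in sorted(labels.items()))
        return f"{name}{{{pairs}}} {value}"
    return f"{name} {value}"


def otlp_json_gauge(name, labels, value, time_unix_nano):
    """Build an approximate OTLP/JSON-shaped document for the same sample.

    Assumption: this nesting (resourceMetrics -> scopeMetrics -> metrics
    -> gauge -> dataPoints) mirrors the OTLP metrics data model as I
    understand it; a real exporter would also fill in resource and scope
    information.
    """
    point = {
        "asDouble": float(value),
        "timeUnixNano": str(time_unix_nano),
        "attributes": [
            {"key": k, "value": {"stringValue": v}}
            for k, v in sorted(labels.items())
        ],
    }
    metric = {"name": name, "gauge": {"dataPoints": [point]}}
    return {"resourceMetrics": [{"scopeMetrics": [{"metrics": [metric]}]}]}


if __name__ == "__main__":
    # Hypothetical metric name and labels, purely for illustration.
    labels = {"mount": "/home", "host": "a"}
    print(prom_line("disk_free_bytes", labels, 1024))
    print(json.dumps(otlp_json_gauge("disk_free_bytes", labels, 1024, 0), indent=2))
```

The first form is something a shell script can produce with a single printf; the second more or less demands a library, which is the part that would make converting existing metrics generators painful.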

However, this is where I start getting twitchy. OpenTelemetry certainly gives off the air of being a complex ecosystem, and on top of that it also seems to be an application focused ecosystem, not a system focused one. I don't think that metrics are as highly regarded in application focused ecosystems as logs and traces are, while we care a lot about metrics and not very much about the others, at least in an OpenTelemetry context. To the extent that OpenTelemetry diverts people away from producing simple, easy to use and consume metrics, I'm going to wind up being unhappy with it. If what 'OpenTelemetry support' turns out to mean in practice is that more and more things have minimal metrics but lots of logs and traces, that will be a loss for us.

Or to put it another way, I worry that an application focused OpenTelemetry will pull the air away from the metrics focused things that I care about. I don't know how realistic this worry is. Hopefully it's not.

(Partly I'm underinformed about OpenTelemetry because, as mentioned, I often feel disconnected from the mainstream of 'observability', so I don't particularly try to keep up with it.)


Comments on this page:

By Andrew at 2025-06-04 01:17:02:

Using OpenTelemetry doesn't necessarily mean using the OTLP wire protocol. Say you're writing a Go app that has a prometheus metrics endpoint, and you use a third-party library that registers some metrics with otel. In your app you can wire the otel "prometheus exporter" to the prometheus "registry" that your /metrics serves, and the library's metrics will show up, without the library having to care that you use prom. So you get some of that benefit from it as a library rather than as a transport.

On the other hand, otel is one of those designs-by-committee where doing anything with it requires reading a ream of documentation and writing an order of magnitude more code than with straight prom or even OpenCensus, which is unfortunate.

I also find that the otel Go modules tend to be a bit unstable. Though that's mainly due to CoreDNS using the older of the stable versions of Go.

Written on 03 June 2025.


Last modified: Tue Jun 3 22:46:21 2025