blog.8-p.info

Last week, I was writing a document that uses OpenTelemetry and sent it to Michael Hausenblas and Jaana Dogan, somewhat out of the blue. Both kindly reviewed the document and corrected the misunderstandings I had.

So, now I have a better understanding of OpenTelemetry!

What is OpenTelemetry?

According to the official website:

An observability framework for cloud-native software.

OpenTelemetry is a collection of tools, APIs, and SDKs. You can use it to instrument, generate, collect, and export telemetry data (metrics, logs, and traces) for analysis in order to understand your software’s performance and behavior.

But it is not a “framework” in the sense of Rails or Django. It doesn’t take over your “main” function. “A collection of tools, APIs, and SDKs” makes much more sense, but that description covers a lot of ground.

OpenTelemetry Protocol

First, there is the OpenTelemetry Protocol (OTLP). It is a wire protocol over either gRPC or HTTP/1.1. Unlike Prometheus, where your application exposes an endpoint (in other words, listens on a TCP port) and gets scraped, OTLP is “push”. So your application sends logging, metrics, and tracing data to an OpenTelemetry endpoint somewhere, either on localhost or remote.
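To make the “push” direction concrete, here is a minimal sketch using only the Python standard library. It assumes the HTTP flavor of OTLP with JSON encoding, where traces are POSTed to a `/v1/traces` path. The `FakeOtlpReceiver` class and the `push_traces` helper are names I made up for illustration, the receiver is only a stand-in for a real endpoint such as the collector, and the payload is deliberately simplified rather than a complete OTLP document.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # (path, payload) pairs seen by the stand-in receiver


class FakeOtlpReceiver(BaseHTTPRequestHandler):
    """Hypothetical stand-in for an OTLP/HTTP endpoint (normally a collector)."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length))
        received.append((self.path, body))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(b"{}")

    def log_message(self, *args):
        pass  # keep the example quiet


def push_traces(endpoint, payload):
    """POST one JSON-encoded payload to <endpoint>/v1/traces and return the status."""
    request = urllib.request.Request(
        endpoint + "/v1/traces",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status


# Port 0 asks the OS for any free port, so nothing has to be assigned by hand.
server = HTTPServer(("127.0.0.1", 0), FakeOtlpReceiver)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Simplified, illustrative payload: not a complete OTLP trace document.
payload = {"resourceSpans": [{"scopeSpans": [{"spans": [{"name": "hello"}]}]}]}
status = push_traces(f"http://127.0.0.1:{server.server_port}", payload)

server.shutdown()
server.server_close()
print(status, received[0][0])
```

Note that the application is the HTTP client here. It never listens on a port itself, which is the opposite of the Prometheus model.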

OpenTelemetry Collector

Then there is the OpenTelemetry Collector, which is an implementation that can receive and send logging, metrics, and tracing data in various formats, including the OpenTelemetry Protocol.

Basically, the collector acts like fluentd. It is pluggable and can be configured in a lot of different ways.

Honestly speaking, having this pluggable collector under the OpenTelemetry umbrella confused me at the beginning. You could configure the collector to take tracing information from Jaeger and publish the information to Zipkin. At least at the configuration level, the OpenTelemetry Protocol is treated as just “one of them”. Don’t they want to push OpenTelemetry as a protocol?
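To show what “one of them” looks like, here is a rough sketch of the shape of such a configuration, written as a Python dict that mirrors the collector's YAML layout as I understand it (`receivers`, `exporters`, and `service.pipelines`). Whether the `jaeger` receiver and `zipkin` exporter are available depends on the collector distribution you run, the Zipkin URL is just a typical local address, and `undeclared_components` is a hypothetical helper of mine, not part of any OpenTelemetry tool.

```python
import json

# Mirrors the layout of a collector YAML config as a plain dict.
# "otlp" sits next to "jaeger" as just another receiver.
config = {
    "receivers": {
        "jaeger": {"protocols": {"grpc": {}}},
        "otlp": {"protocols": {"grpc": {}, "http": {}}},
    },
    "exporters": {
        "zipkin": {"endpoint": "http://localhost:9411/api/v2/spans"},
    },
    "service": {
        "pipelines": {
            "traces": {
                "receivers": ["jaeger", "otlp"],
                "exporters": ["zipkin"],
            },
        },
    },
}


def undeclared_components(cfg):
    """List (pipeline, kind, name) for components a pipeline uses but never declares."""
    missing = []
    for pipeline, spec in cfg["service"]["pipelines"].items():
        for kind in ("receivers", "exporters"):
            for name in spec.get(kind, []):
                if name not in cfg.get(kind, {}):
                    missing.append((pipeline, kind, name))
    return missing


missing = undeclared_components(config)
print(json.dumps(config["service"], indent=2))
```

The pipeline only refers to components by name, so swapping Jaeger for OTLP on the receiving side is a one-word change. That is probably why the protocol ends up looking like any other plugin.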

Is it a good idea to use TCP in localhost-only communications?

I don’t know. Assigning TCP ports during development is tedious, though.

What does it mean for containerd?

I don’t know either.

There is an open draft pull request to add OpenTelemetry for tracing. The author is actually my colleague. I have also opened an issue about utilizing OpenTelemetry for metrics in addition to Prometheus.

Regarding logging, I actually like the fact that systemd takes care of it. Adding OpenTelemetry logging support to every daemon seems like going backward.