OpenTelemetry for Go: Measuring overhead costs

coroot.com

129 points by openWrangler 15 days ago


sa46 - 15 days ago

Funny timing—I tried optimizing the Otel Go SDK a few weeks ago (https://github.com/open-telemetry/opentelemetry-go/issues/67...).

I suspect you could make the tracing SDK 2x faster with some cleverness. The main tricks are:

- Use a faster time.Now(). Go does a fair bit of work to convert to the Go epoch.

- Use atomics instead of a mutex. I sent a PR, but the reviewer caught correctness issues. Atomics are subtle and tricky.

- Marshal protos directly (with a hand-rolled encoder or with https://github.com/VictoriaMetrics/easyproto) instead of going through reflection.

The gold standard is how TiDB implemented tracing (https://www.pingcap.com/blog/how-we-trace-a-kv-database-with...). Since Go purposefully (and reasonably) doesn't currently provide a comparable abstraction for thread-local storage, we can't implement similar tricks like special-casing when a trace is modified on a single thread.
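
A minimal sketch of the atomics idea from the list above, using a toy span type rather than the actual otel-go internals (the real SDK span guards far more state than a single timestamp, which is where the correctness subtleties come from):

    package spansketch

    import (
        "sync/atomic"
        "time"
    )

    // span is a toy stand-in for an SDK span.
    type span struct {
        endTimeNanos atomic.Int64 // 0 means "not ended yet"
    }

    // End records the end time exactly once without taking a mutex;
    // CompareAndSwap from 0 keeps repeated End() calls harmless.
    func (s *span) End() {
        s.endTimeNanos.CompareAndSwap(0, time.Now().UnixNano())
    }

    // ended reports whether End has been called.
    func (s *span) ended() bool {
        return s.endTimeNanos.Load() != 0
    }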

reactordev - 15 days ago

Mmmmmmm, the last 8 months of my life wrapped into a blog post but with an ad on the end. Excellent. Basically the same findings as me, my team, and everyone else in the space.

Not being sarcastic at all, it’s tricky. I like that the article called out eBPF and why you would want to disable it for speed, while still recommending caution. I kept hearing “single pane of glass” marketing speak from executives and kept my mouth shut about how that isn’t feasible across the entire organization. Needless to say, they didn’t like that non-answer and so I was canned. What an engineer cares about is different from organization/business metrics, and the two were often confused.

I wrote a lot of great otel receivers though. VMware, Veracode, Hashicorp Vault, GitLab, Jenkins, Jira, and the platform itself.

jeffbee - 15 days ago

I feel like this is a lesson that unfortunately did not escape Google, even though a lot of these open systems came from Google or ex-Googlers. The overhead of tracing, logs, and metrics needs to be ultra-low. But the (mis)feature whereby a trace span can be sampled post hoc means that you cannot have a nil tracer that does nothing on unsampled traces, because it could become sampled later. And the idea that if a metric exists it must be centrally collected is totally preposterous; it makes everything far too expensive when all a developer wants is a metric that costs nothing in the steady state but can be collected when needed.
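
For illustration, the guard that head sampling enables and post-hoc sampling undermines looks roughly like this (expensiveDump is a hypothetical placeholder for costly attribute work):

    package handler

    import (
        "context"

        "go.opentelemetry.io/otel/attribute"
        "go.opentelemetry.io/otel/trace"
    )

    // expensiveDump is a hypothetical stand-in for costly work you only
    // want to do for spans that will actually be kept.
    func expensiveDump() string { return "..." }

    func handle(ctx context.Context) {
        span := trace.SpanFromContext(ctx)
        // With head sampling, unsampled spans are non-recording, so this
        // guard skips the work entirely. If a span can be promoted to
        // "sampled" after the fact, the guard no longer helps.
        if span.IsRecording() {
            span.SetAttributes(attribute.String("request.body", expensiveDump()))
        }
    }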

coxley - 15 days ago

The OTel SDK has always been much worse to use than Prometheus for metrics — including higher overhead. I prefer to only use it for tracing for that reason.

Thaxll - 15 days ago

Logging, metrics and traces are not free, especially if you turn them on for every request.

Tracing every HTTP 200 at 10k req/sec is not something you should be doing. At that rate you should sample the 200s (1% or so) and trace all the errors.
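
A rough sketch of the head-sampling half of that setup with the otel-go SDK (keeping all errors can't be decided at span start, so that part is typically handled by tail sampling in the collector):

    package main

    import (
        "go.opentelemetry.io/otel"
        sdktrace "go.opentelemetry.io/otel/sdk/trace"
    )

    func main() {
        // Keep ~1% of new traces; child spans follow the parent's decision
        // so sampled traces stay complete.
        tp := sdktrace.NewTracerProvider(
            sdktrace.WithSampler(sdktrace.ParentBased(sdktrace.TraceIDRatioBased(0.01))),
        )
        otel.SetTracerProvider(tp)
    }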

nfrankel - 14 days ago

I have a talk on OpenTelemetry that I regularly present at conferences. Afterwards, I often get the question: "But what's the performance overhead?". In general, I answer with another question: "Is it better to go fast blindfolded or slightly slower with full visibility?". Then I advise the person to do their own performance test in their specific context.

I'm very happy somebody took the time to measure it.

dmoy - 15 days ago

Not on original topic, but:

I definitely prefer having graphs put the unit at least on the axis, if not in the individual axis labels directly.

I.e. instead of having a graph titled "latency, seconds" at the top and then, way over on the left, an unlabeled axis with "5m, 10m, 15m, 20m" ticks...

I'd rather have title "latency" and either "seconds" on the left, or, given the confusion between "5m = 5 minutes" or "5m = 5 milli[seconds]", just have it explicitly labeled on each tick: 5ms, 10ms, ...

Way, way less likely to confuse someone when the units are right on the number, instead of floating way over in a different section of the graph

vanschelven - 15 days ago

The article never really explains what eBPF is -- AFAIU, it’s a kernel feature that lets you trace syscalls and network events without touching your app code. Low overhead, good for metrics, but not exactly transparent.

It’s the umpteenth OTEL-critical article on the front page of HN this month alone... I have to say I share the sentiment but probably for different reasons. My take is quite the opposite: most value is precisely at the application (code) level, so you definitely should instrument... and then focus on Errors over "general observability"[0]

[0] https://www.bugsink.com/blog/track-errors-first/

otterley - 15 days ago

Out of curiosity, does Go's built-in pprof yield different results?

The nice thing about Go is that you don't need an eBPF module to get decent profiling.

Also, CPU and memory instrumentation is built into the Linux kernel already.
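
For reference, the built-in profiling mentioned here is just an import away; this minimal sketch exposes the standard /debug/pprof endpoints on a side port:

    package main

    import (
        "log"
        "net/http"
        _ "net/http/pprof" // registers /debug/pprof/* on the default mux
    )

    func main() {
        // Then e.g.: go tool pprof http://localhost:6060/debug/pprof/profile
        log.Println(http.ListenAndServe("localhost:6060", nil))
    }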

jiggawatts - 14 days ago

A standard trick is to only turn on detailed telemetry from a subset of identical worker VMs or container instances.

Sampling is almost always sufficient for most issues, and when it’s not, you can turn on telemetry on all nodes for selected error levels or critical sections.
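
One hedged way to wire that up in Go: gate the real tracer provider behind an environment variable (DETAILED_TELEMETRY is a made-up name) so only the flagged subset of instances records everything, while the rest keep the default no-op provider:

    package main

    import (
        "os"

        "go.opentelemetry.io/otel"
        sdktrace "go.opentelemetry.io/otel/sdk/trace"
    )

    func main() {
        // Only instances started with DETAILED_TELEMETRY=1 install a real,
        // always-sampling provider (exporter setup omitted for brevity);
        // everyone else stays on the default no-op tracer.
        if os.Getenv("DETAILED_TELEMETRY") == "1" {
            tp := sdktrace.NewTracerProvider(
                sdktrace.WithSampler(sdktrace.AlwaysSample()),
            )
            otel.SetTracerProvider(tp)
        }
    }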

baalimago - 14 days ago

What's the performance drop on Prometheus?