Debugging distributed database mysteries with Rust, packet capture and Polars
questdb.io
Could the same be achieved with less work with distributed tracing?
Article author here.
Instrumenting would have been 100% just as feasible. Ironically, it would have been more work.
For context, our DB is highly optimised for ingestion (millions of rows per second), and adding any high-resolution metrics there would impact performance. They would have to be either ripped out afterwards or engineered very carefully (read "not cheap") so as not to impact performance.
This stuff took an afternoon, is reusable, and frankly, was more fun to implement :-).
I suspect there are tools out there that do this stuff, of course. The question is still the cost of finding them and learning how to use them, compared to writing a few hundred lines of code.
This is the kind of learning experience that causes you to log payload size metrics.