How the open source Caddy server uses Grafana Cloud for full-stack observability

grafana.com

36 points by m_sahaf 2 years ago · 6 comments

ttymck 2 years ago

I'd have preferred more technical details. But it's interesting to get a sense of the rather "cowboy" (in a positive sense) approach of the caddy maintainers for their "online" service offering (a build server).

  • m_sahaf (OP) 2 years ago

    > I'd have preferred more technical details.

    There isn't much technical details to it. Caddy exposes profiles by default[0], and Prometheus metrics are available as opt-in. We set up grafana-agent to collect profiles and metrics from Caddy, poked at the Grafana Cloud portal, studied the available data, and checked the charts for anomalies. Grafana Cloud made it easy for us to get started without having to build more infrastructure for them, which would also require extra energy that can be better spent on the core of our project.

    [0] https://twitter.com/MohammedSahaf/status/1760415991513637137

    • starkparker 2 years ago

      The biggest problem in the observability space is people assuming that other people know what any of this entails.

      What "isn't much technical details" to you is required information for way more people than o11y wonks (not you, but more Grafana Cloud and some OTEL champions etc.) seem to believe, especially around cloud/microservice metrics.

      Prometheus and PromQL especially are opaque and hard to start using. The bar of entry to understanding how it can solve problems is higher than it looks from the inside, especially when you have account managers in your ear making sure _you_ know.

      • m_sahaf (OP) 2 years ago

        Fair enough. There was a learning curve that we had to overcome. I guess you can blame the curse of knowledge[0] for this not being part of the blog post, or the fact that I was more focused on the results delivered by the Grafana stack than on the process. I wonder if it may be something like math, where you have to practice enough for it to click.

        [0] https://en.wikipedia.org/wiki/Curse_of_knowledge

notfunny 2 years ago

Is this a joke? You can't even use Caddy's metrics in production without a serious performance impact

https://github.com/caddyserver/caddy/issues/4644

  • m_sahaf (OP) 2 years ago

    We have it enabled in production. Checkmate.

    You're welcome to help through numerous means.
