Observability Stack – AI First?

1 points by jblake 14 days ago · 4 comments · 3 min read


So... I'm way behind. Just got into Claude this week... it's already doing most of my coding and bug fixes. Crazy stuff.

Some background on my company: Mature (14 yrs) Ruby on Rails app, Sidekiq, Redis, PG, AWS Lambda/EventBridge, React/Preact, Swift, and others. Hosted on Heroku. Very database heavy. Solo guy, owner/operator.

Current stack:

Datadog logging. I dabbled in APM, metrics, and others, but the Heroku buildpack is so bloated I had to remove it, so now it's a simple Heroku log drain to Datadog. I really miss the statsd host and APM stuff... but it took my slug size from ~250 MB to over 350 MB and made booting and deploys much slower. I'd also like to get off Heroku... some day... but I can't even fathom it.

Bugsnag for errors. Already moved this to Sentry to try out the AI stuff.

I'm a little fed up with Datadog and receiving $600 monthly bills on top of my enterprise commitment. Biggest pain point: indexing is priced per line, not per GB. That discourages me from simple logging, so I resort to putting events into a single request/job log line, then have complex chunking of messages to split them up so that Heroku log drains don't destroy the JSON.
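A minimal sketch of that kind of chunking, in Python for illustration rather than the Ruby the app actually uses. Everything here is hypothetical: the names `chunk_event` and `reassemble`, the envelope fields `chunk_id`, `seq`, `total`, and `part`, and the 4000-character budget, which assumes the drain splits lines somewhere around 10 KB. The idea is that each emitted line is valid JSON on its own, so a split can never land in the middle of a document.

```python
import json
import uuid

# Assumed budget for the payload slice carried by each line. Wrapping a slice
# in the JSON envelope can at most double it (quotes and backslashes get
# escaped), so 4000 characters stays well under an assumed ~10 KB line limit.
MAX_PART_CHARS = 4000


def chunk_event(event: dict, max_chars: int = MAX_PART_CHARS) -> list[str]:
    """Serialize one wide event and split it into self-describing JSON lines."""
    # json.dumps escapes non-ASCII by default, so one character is one byte
    # and slicing the string cannot cut a multi-byte character in half.
    payload = json.dumps(event, separators=(",", ":"))
    parts = [payload[i:i + max_chars] for i in range(0, len(payload), max_chars)]
    chunk_id = uuid.uuid4().hex
    return [
        json.dumps({"chunk_id": chunk_id, "seq": seq, "total": len(parts), "part": part})
        for seq, part in enumerate(parts)
    ]


def reassemble(lines: list[str]) -> dict:
    """Rebuild the original event from its chunks, in any arrival order."""
    chunks = sorted((json.loads(line) for line in lines), key=lambda c: c["seq"])
    if len(chunks) != chunks[0]["total"]:
        raise ValueError("missing chunks")
    return json.loads("".join(c["part"] for c in chunks))
```

Each line from `chunk_event` would be written to stdout as its own log entry. Note the tradeoff: every chunk counts as an indexed line, so this only saves money when most events fit in a single chunk.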

The complexity of configuring Datadog was a pain point, but AI has mostly solved that.

I want an all-in-one tool... it seems that would be best for AI to have all the context in one place. However, Sentry's logs product is pretty green, slow, and doesn't do much. I tried BetterStack; again, pretty basic. Pricing is similar to Datadog anyway. I do about 600 GB + 50M lines a month.
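As a rough sanity check on those volumes, the average line size can be worked out directly. This assumes both figures describe the same set of lines and that the volume is metered in decimal gigabytes; neither is stated above, so treat the result as a ballpark only.

```python
GB = 1000 ** 3  # decimal gigabytes (assumption about how volume is metered)

monthly_bytes = 600 * GB
monthly_lines = 50_000_000

# Average size of one log line, assuming the 600 GB is spread over the 50M lines.
avg_line_bytes = monthly_bytes / monthly_lines
print(f"average line size: {avg_line_bytes / 1000:.0f} KB")  # average line size: 12 KB
```

Under those assumptions a typical line is around 12 KB, which fits the pattern of packing many events into one request/job line.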

So I've come to the conclusion that Datadog logs are still so much better than anything out there. But Sentry is quite a bit better for errors (I also have other projects: iOS app, react, vanilla JS).

Any tips? I love putting time into this stuff, it makes me so happy when I have to fix something and it's all there, but at some point - like the reason for using Heroku - I have a lot of other stuff I need to focus on.

The main reason for exploring a switch is to automate root cause analysis, bug fixing, performance improvements, etc. I'm still a newbie with Claude; right now I'm just using it as a pair programmer in the Claude Code VS Code chat, sometimes the CLI.

chickensong 14 days ago

Are you asking about other 3rd party solutions, or thinking of DIY? AFAIK, Datadog is still the best/most expensive. Any provider offering AI analysis is just going to add to your cost.

> I love putting time into this stuff

> like the reason for using Heroku - I have a lot of other stuff I need to focus on

You probably need to pick a side here. Building a robust solution with even a fraction of Datadog and Sentry features isn't for the faint of heart. Unless you're into infrastructure, it might be a painful time sink.

That said, I fully promote DIY, so maybe some kind of OTel, wide log, ClickHouse, Grafana thing with a GPU for an AI sidecar VM.

  • jblake (OP) 14 days ago

    Yeah, that sounds like a lot of fun, but alas, no time for that. I think I need to stick with Datadog logs and use Sentry for errors. I was excited to try to get everything into one tool, but that doesn't seem realistic. Datadog is just so much better at logging. Then I need to figure out my AI workflows. Sentry's Seer is interesting, but it seems unnecessary: why not send everything straight to Claude?

verdverm 14 days ago

OTEL, that's the industry standard
