Datadog Is Having Issues
isdown.app
I've been a proponent of using Datadog over building inferior stuff with Grafana, Prometheus, ELK & its ilk.
But I never thought that Datadog could be down for 8+ hours on such integral parts as monitoring and logs.
I really like Grafana. Much prefer it over the chaos that is the Datadog UI.
Looks like an extended global outage covering all regions apart from GovCloud:
Still getting "500 Internal Server Error, see context.apiResponse for more details" on most monitors in event history.
Seems to have started with the web interface being down, and then got progressively worse, now affecting everything.
Anyone taking bets on the cause?
I have my money on the good old "configuration error".
Some people may be having the quietest on-call of their lives.
Also on the official Datadog status page: https://status.datadoghq.com/