Datadog raises $31M for cloud monitoring

92 points by clofresh 11 years ago · 14 comments


Datadog is great. I converted my New Relic monitoring over to Datadog without too much hassle. I was only using the backend monitoring part of New Relic so I didn't miss any of those features. The dashboards are much better, the event stream is really cool, the AWS integrations are great, and the front end is pretty speedy. And the best part, it costs about 5-10x less (which is what prompted the conversion to Datadog in the first place).

rev_bird 11 years ago

New Relic is second to none when it comes to being able to do deep dives into finding out which API call is causing slowdowns where, but Datadog aggregates data better than anything else I've ever used -- to me, the monitoring part is just a nice side-feature.
- socialist_coder 11 years ago
  
  Only if you are running a supported language / server. That feature doesn't really work at all for .NET web-api apps =(
  - willxu 11 years ago
    
    Product manager from Datadog here. Better .Net (and Windows) support is coming.
    
    socialist_coder 11 years ago
    
    To be clear, I was talking about the automatic "deep dives into finding out which API call is causing slowdowns" part of New Relic. That is a useful feature, but it doesn't work at all on .NET web-api apps so it wasn't a reason for me to stay with New Relic.

Cyranix 11 years ago

Kudos to Datadog. I'm not a current customer, but when I took it for a trial run several months ago I found it to be visually appealing, reliable, and intuitive. It didn't have some of the advanced features that you would get with Hosted Graphite or Librato at the time, but it was a very strong contender. All of us will benefit from the high-quality competition occurring in this space!

boundlessdreamz 11 years ago

What features of librato did you not find in Datadog? I'm a current librato customer and from what I see datadog is better in most aspects. Also the datadog integrations are awesome. The collectd integration is just ok

corford 11 years ago

We're using Datadog to monitor job stats on a busy RabbitMQ host and it's fantastic.

javery 11 years ago

We routinely pummel Datadog with massive amounts of data from hundreds of hosts and it does an amazing job of showing us the data we need and alerting us when something is wrong. Highly recommended.

sciurus 11 years ago

Datadog is very nice. Here's something I wrote when asked what value we were getting from it- https://gist.github.com/sciurus/3a1cd4c203891c8d33b2

# Why datadog? #

I would break it down into four pieces. Datadog is

1. providing functionality

1. we need

1. in an easy-to-use manner

1. that would be difficult to build and maintain ourselves

# 1) Functionality #

## The agent ##

It gathers system metrics, integrates with key software we use, and provides a standard interface to which our applications can send custom metrics.
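As a rough illustration of what "sending custom metrics" to a local agent can look like, here is a minimal sketch that assumes the agent accepts StatsD-style datagrams over UDP on port 8125. The metric names, tags, host, and port below are assumptions for the example, not details from our setup.

```python
import socket


def format_metric(name, value, metric_type="g", tags=None):
    """Build a StatsD-style datagram: name:value|type, plus an optional |#tag list."""
    payload = f"{name}:{value}|{metric_type}"
    if tags:
        payload += "|#" + ",".join(tags)
    return payload


def send_metric(payload, host="127.0.0.1", port=8125):
    """Fire-and-forget UDP send; host and port are assumed defaults."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload.encode("utf-8"), (host, port))


if __name__ == "__main__":
    # Hypothetical gauge for a job queue depth.
    send_metric(format_metric("jobs.queued", 42, "g", ["queue:default"]))
```

Because the send is UDP, the application does not block or fail if the agent is down, which is much of the appeal of this kind of interface.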

## Integrations ##

Datadog has prebuilt integrations to pull data from almost every important service we use.

## Events ##

Through the integrations datadog generates a consolidated event stream that we can filter and search as needed.

## Dashboards ##

Datadog lets us build dashboards that combine metrics from many different sources. We can combine and transform metrics to make them more useful. It also provides a powerful interface for interactive exploration of metrics.

## Alerting ##

Datadog has nice stream processing capabilities for generating alerts, and it can surface them in services we use like pagerduty and slack.

# 2) Need #

## The Agent ##

We don't get nearly enough insight from cloudwatch alone; we need an on-instance tool to gather system and app metrics.

## Integrations ##

There are lots of services with operational significance, but many of them don't provide a good way to access their data.

## Events ##

We would spend dramatically longer investigating problems if we had to look at each source of events in isolation. Many of our event sources don't even provide a way for us to view past events or to query them.

## Dashboards ##

Per-service and per-instance dashboards are important for investigating problems quickly. The consolidation of data from multiple sources is again a key feature.

## Alerting ##

We need to analyze trends in our metrics and alert on them.
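To make "alerting on a trend" concrete, here is a minimal sketch of one common approach: alert when the average over a sliding window crosses a threshold, rather than on any single sample. This is not how Datadog implements alerts; the class name, window size, and threshold are all hypothetical.

```python
from collections import deque


class TrendAlert:
    """Fires when the mean of the last `window` samples exceeds `threshold`."""

    def __init__(self, window, threshold):
        self.samples = deque(maxlen=window)  # old samples fall off automatically
        self.threshold = threshold

    def observe(self, value):
        """Record a sample and return True if the alert condition holds."""
        self.samples.append(value)
        if len(self.samples) < self.samples.maxlen:
            return False  # not enough data yet to judge a trend
        return sum(self.samples) / len(self.samples) > self.threshold


if __name__ == "__main__":
    alert = TrendAlert(window=3, threshold=80.0)
    for cpu in [70, 95, 72, 90, 91]:
        print(cpu, alert.observe(cpu))
```

Averaging over a window means a single spike (the 95 above) does not page anyone, while a sustained rise does.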

# 3) Ease of use #

## The agent ##

The agent is deployable via a chef cookbook datadog wrote for us. It requires minimal configuration. It knows which system and application metrics are worth gathering.

## Integrations ##

Integrating with all the data sources is literally a few clicks.

## Events ##

The interface makes searching and filtering events straightforward.

## Dashboards ##

There are prebuilt dashboards for lots of things we care about. Snazzy features like autocomplete and templating make building our own dashboards easy.

## Alerting ##

The guided steps and previewed outputs make creating alerts simple.

# 4) Hard to replicate #

Here I described a system of collectd, custom code to pull metrics from cloudwatch, custom code to pull or receive events from various sources (airbrake, cloudtrail, chef, pagerduty, jenkins, etc), influxdb, and grafana.

mrdude42 11 years ago

Very informative post!

Gigablah 11 years ago

Something worth noting: Datadog is currently the only monitoring service that provides a Dockerized agent (so you can easily stick it into a CoreOS cluster, for example).
