AWS drops price for managed Prometheus service by 84%
aws.amazon.com
Prometheus clusters, scaling, and federation can be challenging to manage at scale, but I'm a bit surprised at how relatively popular Datadog is given its price and how feature-rich Prometheus is. Does anyone here have experience with both?
Prometheus strikes me as one of the most amazing recent pieces of open source tech. The query language in particular (while hard to grok for beginners) was such an eye opener early in my career. It has its warts, but infinite kudos to the creators.
Having used Monarch at Google (basically an explicit relational algebra for a query language), the fixed-function query "language" for Datadog is truly sad. It seems that Prometheus made similar mistakes to Borgmon in terms of designing a quirky query language, but at least it has one.
Some examples of annoying behavior in Datadog:
* Zoom in on any chart with default_zero. Eventually the spaces between the points will go to zero. You have to add a rollup whenever you use default_zero to avoid this.
* In general, you have very little control over interpolation, which is a very important part of aggregating and graphing.
* exclude_null will remove any series with any null label. You can't write arbitrary filters on series labels.
* Label names and values have to be the same to match. E.g. you can't join 2 metrics where different labels have the same value.
* The GUI doesn't support a lot of the features, e.g. timeshift.
* The available functions often can't be composed. E.g. you can't min by X and then sum.
I have a ton of complaints about the GUI too (e.g. it's broken by typing too fast), but I'll save those for later.
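To make the composition complaint above concrete, here is a minimal Python sketch of what "min by X and then sum" means on labeled series. The sample data, label names, and helper functions are all hypothetical and only illustrate the two-stage aggregation. In PromQL the same shape would be written roughly as `sum(min by (cluster) (some_metric))`, with `some_metric` standing in for a real metric name.

```python
from collections import defaultdict

# Hypothetical sample data: (labels, value) pairs at a single timestamp.
samples = [
    ({"cluster": "a", "replica": "1"}, 5.0),
    ({"cluster": "a", "replica": "2"}, 3.0),
    ({"cluster": "b", "replica": "1"}, 7.0),
    ({"cluster": "b", "replica": "2"}, 9.0),
]


def aggregate_by(samples, label, fn):
    """Group samples by one label and reduce each group's values with fn."""
    groups = defaultdict(list)
    for labels, value in samples:
        groups[labels[label]].append(value)
    return [({label: key}, fn(values)) for key, values in groups.items()]


def min_then_sum(samples, label):
    """First stage: min within each group. Second stage: sum of those minima."""
    per_group_min = aggregate_by(samples, label, min)
    return sum(value for _, value in per_group_min)


print(min_then_sum(samples, "cluster"))  # min(a)=3.0, min(b)=7.0, total 10.0
```

The point is only that the output of one aggregation feeds the next, which is the kind of nesting the comment says is not always possible.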
I'm the worst kind of person to reply, because my brain was already damaged by Borgmon before Prometheus was a thing. But I've run both at relative scale, and I'll take Prometheus any day. Datadog's query language is lacking, and they've never added decent support for ratio-of-rate metrics, which are most of the metric queries you should be alerting on. Datadog is expensive, and because it's expensive you'll hold back on some of your most important and expressive metrics. Prometheus scales very far vertically, far enough that I've yet to use its sharding features, despite abusing tags badly.
Prometheus's query language is a sharp tool, and you can mess it up. It takes getting used to, and Grafana is a must for making charts navigable to executives (which is one of the top selling points of Datadog: execs can see and build dashboards for the metrics they want). I would not be likely to switch to another system. Even Monarch, Google's next generation monitoring system, has significant downsides compared to what you can do with Prometheus.
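For readers unfamiliar with the term, a ratio-of-rate metric divides the per-second rate of one counter by the rate of another, for example errors over requests. The Python sketch below is only an illustration under simplifying assumptions: the sample values are made up, the function names are hypothetical, and it ignores the counter resets and window extrapolation that a real system handles. In PromQL the equivalent would look roughly like `sum(rate(errors_total[5m])) / sum(rate(requests_total[5m]))`, where both metric names are placeholders.

```python
def counter_rate(samples):
    """Per-second increase between the first and last (timestamp, value) sample.

    Simplification: assumes a monotonically increasing counter with no resets.
    """
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("need at least two samples with increasing timestamps")
    return (v1 - v0) / (t1 - t0)


def ratio_of_rates(numerator, denominator):
    """Rate of the numerator counter divided by rate of the denominator counter."""
    denominator_rate = counter_rate(denominator)
    if denominator_rate == 0:
        return None  # no traffic in the window, so the ratio is undefined
    return counter_rate(numerator) / denominator_rate


# Hypothetical counters sampled 60 seconds apart.
errors = [(0, 100), (60, 106)]      # 6 new errors in the window
requests = [(0, 5000), (60, 5600)]  # 600 new requests in the window

print(ratio_of_rates(errors, requests))  # roughly 0.01, i.e. about 1% of requests failed
```

Alerting on this ratio rather than on the raw error count keeps the threshold meaningful as traffic grows or shrinks.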
I have really wanted to get Prometheus into production for several years and have tried a few times, but working at small startups without dedicated ops people I've found myself mucking around too much trying to get the metrics that I really care about wired up, whereas Datadog "just works". Datadog's pricing gives me a bit of indigestion, but it's pretty slick.
Yeah, I've used both in prod and was pondering the same question. On top of that, DD was ridiculously expensive and resource hungry in a Kubernetes environment. I'm biased though, because I used the Prometheus inspiration extensively at Google. For someone who doesn't use as many custom metrics (which I suspect is most orgs out there), DD is probably superior because it gets the job done (mostly) and has an OK UI.
The GitHub survey showed that Datadog is eating Prometheus market share very quickly.
Considering that one is free and that the other is not cheap, I'd like to see some data demonstrating this since common sense would dictate otherwise.
Nothing is actually free
And the customers for this are businesses. Businesses spend money. (as opposed to what a lot of developers think)
You can get into a lot of financial trouble with DD custom metrics, very quickly. We're talking easy five-figure bills in a few days. I love Datadog and appreciate how quickly their team will reach out and tell you that you have a runaway bill, but there's not a world in which I think the custom metrics pricing makes it usable. They know it too, and they have customers that do clever things to use them in bursts. Prometheus is good, and with AWS driving the price down, I'll be poking it tomorrow.
Links?
because of the Grafana licensing restriction?
Some Product Managers must be having a really bad day.
AWS is usually super secret about the profit margin on their various cloud services, but now we know that the margin on Prometheus was at least 84%...
Not necessarily. They might be doing this to get more customers on it since they're losing to Datadog.
Amazon has a long history of being willing to eat a ton of loss to kill competition, as in the well-known diapers.com case.
They have me interested. Datadog does a far better job with logs but their CM product is wildly expensive. Happy to let someone else run Prometheus for me.
I used to work at AWS and I have no first hand knowledge in this case, but would guess it's far more likely that the engineering teams figured out a way to run it cheaper than at launch, and passed that savings along to customers.
doesn't have to be, they could sustain losses and count it as a marketing expense.
How is this even legal? Afaik in the EU this kind of business practice is illegal.
American and European competition laws are very different.
In the United States, the thing that gets close attention is any attempt to coordinate to increase prices, even if that results in increased competition.
Decreasing prices to reduce competition gets a pass over here.