Security for Elasticsearch is now free
elastic.co
Free, not open source version. Obviously a reaction to Amazon's fork, not wanting to give them any code to pull into their version.
It will be interesting to see if this is enough to retain the majority of the userbase or if we'll still see a majority migrate to the 'Open Distro' fork.
Open distro is definitely creating some pressure on ES. However, there are some misconceptions on what that is.
1) Amazon did not actually fork Elasticsearch, nor does it maintain any patches against it.
2) Elasticsearch does in fact provide completely OSS distributions and docker images for their products.
3) Amazon has created several OSS plugins for Elasticsearch that they bundle with their open distro that compete directly with what Elasticsearch does in their non OSS add-ons to their product.
So, obviously Elasticsearch is responding to Amazon by ensuring there's little functional gap with the stuff you get for free.
I'd argue most new users are still better off on elastic cloud vs amazon's hosted version of their distro and should not be attempting to run this themselves. I've used both and would pick elastic cloud every time for the simple reason of being more reliable and easy to deal with (e.g. backups, upgrades, cluster topology changes, etc.). Also, it seems they are quite competitive on price/performance.
For reference, we pay about 170 Euro a month for a simple setup that takes care of all our logging (couple of GB worth of logs / day). I'd hate running blind without that. IMHO at those prices, self hosting is not worth the effort (devops time required to do it would pay for several years of hosting).
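To make the break-even reasoning above concrete, here is a rough sketch. Only the 170 Euro/month figure comes from the comment; the devops hours and hourly cost are hypothetical placeholders, so plug in your own numbers.

```python
def months_of_hosting_covered(devops_hours, hourly_cost_eur, hosting_eur_per_month):
    """How many months of hosted service the self-hosting effort would have paid for."""
    return (devops_hours * hourly_cost_eur) / hosting_eur_per_month


HOSTING_EUR_PER_MONTH = 170  # figure quoted in the comment above
DEVOPS_HOURS = 80            # hypothetical: setup plus first-year maintenance
HOURLY_COST_EUR = 75         # hypothetical: loaded cost of one devops hour

months = months_of_hosting_covered(DEVOPS_HOURS, HOURLY_COST_EUR, HOSTING_EUR_PER_MONTH)
print(f"{months:.1f} months, or about {months / 12:.1f} years of hosting")
```

Under those assumed inputs the effort would cover roughly 35 months of hosting, which is in line with the "several years" estimate.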
Regarding the docker images elastic provides, I find it odd that they are only hosted from their own servers (rather than docker hub) and I have looked all over the place to find the dockerfile they use to create those images.
It seems that they are hiding that info and it really locks you into only doing customizations that the docker image is directly built for.
I also don't like to pull images blindly. I generally fork the dockerfile source so that I can build the software from source and have a bit more control and knowledge of what I am installing.
After 1 google search: https://www.elastic.co/guide/en/elasticsearch/reference/curr...
It includes a deep link straight into their repository right at the top: https://github.com/elastic/elasticsearch/tree/7.1/distributi...
In short, their Docker build process is part of the elasticsearch repository. So, it's actually part of their normal build process and not something that happens with a separate build in some different repository. Personally, I think this is a good practice.
That's for both the OSS and non OSS images. They produce these with every build. And they probably test them too, which I think is the responsible thing to do and something I'd expect from them.
They use gradle to build their docker images from a Docker file (https://github.com/elastic/elasticsearch/blob/7.1/distributi...) that essentially untars the tar ball produced by their build. This looks pretty straightforward and free of magic steps.
So, read the source. It's all there. You can build from source or do your own thing. A variant of their Docker file where you just wget their tar ball shouldn't be that hard to do.
Dockerfiles from v6.6 onwards are available at https://github.com/elastic/dockerfiles
The features of the open distro are not enough to compare to the Elastic offerings. I think most are smart enough to see through Amazon's "generosity" and know that they (Amazon) are not a bastion of OSS.
Could you expand on the features that Open Distro misses and that Elastic offers?
You can see the OSS and Free offerings at https://www.elastic.co/subscriptions. The real question is what Open Distro is offering:
* Alerting - you can use ElastAlert
* Security - Search Guard
* SQL - https://github.com/NLPchina/elasticsearch-sql
There is very little reason (IMO) for users to choose "Open Distro" except that it comes as an AWS image.
Maybe a real reason is that it's precisely a distro which means it contains features and reasonable defaults so that users don't need to learn about them, install them, and configure them.
Basically same debate as between Linux From Scratch and a full-featured distro such as Ubuntu.
Also note that you'll want to go to the features page and hit the disclosure triangle on the 'Security' feature. This is very much a subset of their security features- no IP filtering, AD/LDAP integration, SAML or many other security oriented features.
I'm actually okay with these more enterprise features still being premium. Basic RBAC, TLS support and user management should've been core from the start for free, though.
Good catch and an important distinction.
Some background here:
https://devclass.com/2019/03/12/aws-launches-elasticsearch-d...
To be fair, the Docker Hub stats for the Open Distro for ES image don't show a very large shift away from the Elastic ES image. And I doubt Amazon really cares about that, either. This is about whether their hosted ES service remains competitive with Elastic's hosted ES service.
Totally agree. Amazon had to do something after Elastic changed their license terms specifically to stop Amazon from competing with their own hosted service.
I'm interested to see what Elastic's next move will be.
grabs popcorn
Do you have a good understanding of the legal differences? My understanding is that there is now an open source license that anyone can fork, including companies basically committing IP theft, and then a basic license that is free unless you are selling Elastic as a service? The idea is that Elastic would put all improvements into Elastic Basic, and Amazon can't use this source code in its forked version?
IMO, it doesn't seem like a near-term risk, but could Elastic ever change its basic license so it costs money for everyone?
I was under the impression that they already opened the code for X-Pack features.
Were these features licensed in such a way that you could freely use them, though? Or was it 'open' as in "you can see the code" but it's not FOSS?
x-pack is not FOSS which is why amazon can’t use it.
As I thought - thanks for confirming.
Refer to this thread from 2018-02, when the X-Pack licensing change was announced:
https://news.ycombinator.com/item?id=16487440
Summary -- vague use of the word 'open' and exclusive use of free in the beer sense, led to some significant angst about what this means for end users.
I know it's hard to make a buck with an open source business model but deciding to charge more for security related features is always so frustrating to me. It leads to a culture of insecure deployments in environments when the business is trying to save money. Differentiate on storage or number of cores or something, anything but auth/security. I'm glad they've finally reversed this.
This (while perhaps not perfect) is massive for us, it’s going to be especially useful for Kibana authentication to add readonly and write users, something we’ve wanted for a long time but haven’t been able to afford as a non-profit, charitable organisation.
I know it’s not all 100% open source, but it’s better than a nginx reverse proxy hack or similar.
Thank you Elastic for continuing to create fantastic software.
Did you apply to this program? https://www.elastic.co/elastic-search-awards/ That could help you, hopefully.
Running Elasticsearch on K8s storing 16TB of compressed logs across 6 data nodes and ~4600 shards.
We're a really happy ES customer. We're on ESv6 at the moment and it's been running amazingly for us. We've halved our storage and running costs by moving from 5 to 6.
We've always been a licensed customer and they are in front of AWS with their features (we run our k8s stack on AWS though :) )
Some free advice: reduce the number of shards! Each shard comes with some state management overhead.
The soft limit is currently at 1000 shards per node, but you should be aiming at 25-50GB of data per shard.
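A quick back-of-the-envelope check of that advice, as a sketch. It assumes the 16TB figure means 16,000 GB, and it is unclear from the thread whether the ~4600 shards include replicas, so treat the output as indicative only.

```python
import math
from typing import Tuple


def target_shard_range(total_gb: float, min_shard_gb: float = 25,
                       max_shard_gb: float = 50) -> Tuple[int, int]:
    """Return (fewest, most) shard counts that keep each shard within the size band."""
    fewest = math.ceil(total_gb / max_shard_gb)
    most = math.floor(total_gb / min_shard_gb)
    return fewest, most


TOTAL_GB = 16 * 1000    # assumption: 16TB read as 16,000 GB
DATA_NODES = 6          # from the parent comment
CURRENT_SHARDS = 4600   # from the parent comment (approximate)

fewest, most = target_shard_range(TOTAL_GB)
print(f"target shard count: {fewest} to {most}")
print(f"current shards per node: {CURRENT_SHARDS / DATA_NODES:.0f}")
print(f"current average shard size: {TOTAL_GB / CURRENT_SHARDS:.1f} GB")
```

With these inputs the target lands between 320 and 640 shards. The current setup is under the 1000-per-node soft limit, but its shards average only a few GB each, well below the 25-50GB band.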
How did ES 6 reduce your operation costs by 50%? Same CPU, disk & network bandwidth?
That's an insane number of shards, you should be closer to 500 for that amount of data and only 6 nodes.
It's due to the number of indices stored from our various data sources. Yours and another poster's comments are interesting, so we might look at ways we can reduce the shard count given the new info on overhead.
Interesting. Three hours ago someone in our Ops team shared a link to "Open Distro for Elasticsearch" [1] and it's also featured on the AWS console login page.
Is this a very rushed reaction to it? Or is this related? I would really love to have a clarification of what's happening in that space.
How were you able to miss all the drama about AWS and ElasticSearch / Mongo / Nginx here on HN?
It turned out that with the open core and premium-service models, the original company might not be the only one providing paid services or development. Which was a bit of a surprise to those original devs.
Opendistro was announced by AWS a few weeks ago. It’s their fork of ES with security features and some of the XPack functionality included.
see https://grafana.com/blog/2019/03/28/everything-you-need-to-k... and the first part of the story it links to.
opendistro has this:
https://github.com/opendistro-for-elasticsearch/security
which has feature parity with the free version elastic just released afaict.
No it doesn't - for example, LDAP/AD are paid features in ES
Too little too late? Trying to charge for TLS was a very poor move and it's made me not trust ElasticSearch...
I can understand they need to make money, but still a bit shady. Honestly though it's not something critical for my needs. Now if they could lower the resource hogging a bit, that would interest me... Maybe even pay for that.
I use ELK for Kubernetes and network device logs, and I'm very much with you -- full text search is great, but it sure can be slow, even when running on $1000/month of AWS hardware.
The conclusion that I've reached is that the whole lucene model for logs is kind of outdated. Why am I tuning Java GC params to run "grep foo /logs". I think computers today can do fine with sharded flat files, a minimal index ("which node contains logs from pod foo-2387438-2384738 at 12:34AM"), and then just scale horizontally over (log messages, searches).
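A toy sketch of what that could look like: a minimal index that only answers "which nodes hold logs for this pod in this time window", followed by a plain scan on those nodes. Everything here (the class name, the hourly bucket size, the in-memory lists standing in for flat files) is a hypothetical illustration, not how any existing product works.

```python
from collections import defaultdict

BUCKET_SECONDS = 3600  # assumption: one index entry per pod per hour


class LogIndex:
    """Minimal index: (pod, hour bucket) -> names of nodes holding those logs."""

    def __init__(self):
        self._nodes = defaultdict(set)

    def record(self, pod, timestamp, node):
        self._nodes[(pod, timestamp // BUCKET_SECONDS)].add(node)

    def lookup(self, pod, start, end):
        nodes = set()
        for bucket in range(start // BUCKET_SECONDS, end // BUCKET_SECONDS + 1):
            nodes |= self._nodes.get((pod, bucket), set())
        return nodes


def search(index, node_logs, pod, start, end, needle):
    """Grep-style scan, but only on the nodes the index points at.

    node_logs maps node name -> list of (timestamp, pod, message) tuples,
    standing in for the flat files stored on each node.
    """
    hits = []
    for node in sorted(index.lookup(pod, start, end)):
        for timestamp, line_pod, message in node_logs[node]:
            if line_pod == pod and start <= timestamp <= end and needle in message:
                hits.append(message)
    return hits
```

Usage would be along the lines of `search(index, node_logs, "foo-2387438-2384738", start, end, "foo")`. The index stays tiny because it never looks inside the messages, and searches can fan out across nodes in parallel.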
I hope my friends over at Tailscale are doing that and I can just move off ES entirely ;)
I believe Loki [1] is intended to basically run "grep foo" at scale (plus some extra niceties like labels). I haven't used it, but it seems interesting.
ELK stack user here - we actually found logstash to be our bottleneck. Changing it out for fluentd fixed our woes.
Same here, fluentd is much better, performance wise.
But then I had to give ES more RAM because it couldn't take the hammering.
In fact, increasing the throughput to ES was causing some pretty spectacular crashes, with the /var/log partition at 100% because of the verbosity of the dumps.
Logstash sucks from both operational and developing perspective. I replaced it everywhere I could by sending structured logs directly from the app or by using newer integrated beats features.
Is Tailscale building a logging product?
It is too late. I've stuck with Solr and things are good.
Why don’t you want to pay for a feature that you need? The company that pays your wages makes money from selling something. Of course you sell what people need
>Why don’t you want to pay for a feature that you need?
It's irresponsible to charge for features such as transport security, in my opinion.
Want to charge for enterprise auditing, federation, reporting and granular access control? Fine, go right ahead. But withholding basic security features like transport security and basic access control that should be core leaves a bad taste in my mouth.
How many unsecured Elasticsearch servers have been popped, leading to data breaches as a direct result of this decision?
That is manifestly unfair, the situation is someone doesn’t want to pay for a security feature so they go ahead and expose themselves, all the time they are trying to make money by using a free product.
Really unfair to point fingers at ES. And I really don't get why people feel they should be making money off someone's work but don't have to pay them. What significant OSS or free product is your company offering?
Really and genuinely confusing. Does the same approach work with your lawyer, mechanic, plumber, electricity or gas company? They do something for free, you demand more for free, otherwise you are at risk.
Mechanic: I’ll do the oil filter for free
You: you must also replace the brake pads for free otherwise the car isn’t safe for me
Mechanic: Go ...
A service like this is more like someone giving you a ride.
And if they don't have brake pads, and negligently get into a horrible wreck, one where they walk away unharmed while you are injured? You probably have a case there.
When airbags first came out, only expensive cars had them. I wouldn't be surprised if side airbags are still only found in nicer cars.
This seems entirely different though. It's more like hitchhiking. When you pay for an Uber or Lyft, there's a level of safety expectations in the car. When you pay for a black car, there's a higher level of expectations. When you don't pay anything, you are using it at your own peril. Now, this could be a bad business model or poor mousetrap for adoption. I'm not arguing with that.
A case ?
You want to sue an open source sw maker for not providing a feature for free because when you expose your ES to an insecure network without that feature you put yourself at risk ?
How about you don’t put your ES in an insecure network without buying the feature or pay someone to write the feature for you.
Your analogy is misleading and wrong, my mechanic one is better
How about this: I offer you, a stranger, a ride to a location convenient to me for free. You take the ride, then demand I drop you off at another location, otherwise you will run into the middle of the road and hurt yourself.
Data security isn't as serious of an issue as loss of limb, so there wouldn't be any legal wrongdoing in normal circumstances.
And no, your analogy doesn't make any sense. You keep talking about doing one thing for free, and refusing to do an entirely separate thing. That's very different from doing a thing for free but in a dangerous way.
And I'm not saying anything should be free anyway. Just that if you offer a service, don't make it pointlessly dangerous as an upsell tactic.
> That is manifestly unfair, the situation is someone doesn’t want to pay for a security feature so they go ahead and expose themselves, all the time they are trying to make money by using a free product.
>Really unfair to point fingers at ES. And I really don’t get why People feel they should be making money off someone’s work but don’t have to pay them. What significant os or free is your company offering
Very much disagree with all of this - not an unfair position to take at all. My open source browser supports TLS. The open source web frameworks I work with include built-in web servers that support TLS. It's inexcusable not to support basic things like this in 2019. I don't care if your software is OSS or not.
I'm unsure why "my company" is relevant here. But for what it's worth, the client I currently work with is a) an exempt educational charity, b) open sources all of their internal web applications that interact with the ELK stack.
>They do something for free you demand more for free otherwise you are at risk.
Do you honestly think Elastic would've accepted a PR that added transport security into the open source codebase? Even if it was developed entirely by someone else in good faith?
The only reason they've done anything now is because their hand was forced by Amazon. Honestly? Good. This is about as bad as when StartCom were charging for certificate revocations.
>does the same approach work with your lawyer, mechanic plumber electricity gas company
It's like a lawyer offering to represent me pro bono, and then it turning out that they're not even qualified to practice law and have jeopardised my case as a result.
Legally, sure? There's no warranty given with the software. But it's still a morally wrong thing to do.
The last time I talked to Elasticsearch about pricing, it was so extremely expensive for our use case that it was basically not a valid option for us.
I think what most people miss for these and similar services is you're paying for really good, on call, white glove Elastic support. In my experience they can often go as far as to replace having a search specific ops team. The cloud hosting isn't really where the value is.
I guess my issue is that we didn't want or need support. We just wanted x-pack features such as Auth and the Alerting plugins.
We were already hosting it fine ourselves on AWS, as we had devops people very familiar with ES. However the price they quoted us per year was insane for our cluster size for ~20 nodes.
I looked at the price of a Tesla the other day and thought it was too expensive.
doesn’t really say much, I could be in a bad financial position, Tesla could be expensive or I have a different preference to spending my money or I don’t love the environment enough ;)
Same feeling...
> The company that pays your wages makes money from selling something.
They could be using Elasticsearch for a side project. IMO, open source projects should not be unsecured.
Security shouldn't be treated as a bonus feature.
Roles, okay. Not TLS.
Security should almost always be a baseline requirement before something goes up for public sale.
Imagine if Facebook charged you $5 to reset your password.
TLS isn't like say, LDAP integration. One of those is a fancy enterprise feature you can totally charge for (and probably should), and one of those is a basic critical feature.
It would be unethical to charge $400/year to properly store user passwords as hashed instead of plaintext, wouldn't it?
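For anyone unsure what "properly" means there, a minimal sketch using only the Python standard library: a random per-user salt, a slow key-derivation function, and a constant-time comparison. The iteration count is an illustrative assumption, not a recommendation for any particular system.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumption: illustrative work factor, tune for your hardware


def hash_password(password: str):
    """Return (salt, digest); store both, never the plaintext password."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)
```

It is about a dozen lines, which is rather the point: this is baseline hygiene, not a premium feature.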
There's also a lesser known project out there: https://search-guard.com/
Paired with an OpenResty reverse proxy I was able to set up a reasonably secure cluster back when X-Pack was prohibitively expensive and the AWS offering wasn't under their BAA.
Big thanks to that team of contributors!
Some of the worst breaches of 2017-19 have been due to open ES clusters, some on AWS. This is a welcome change. I just spun our AWS ES cluster down in favor of BigQuery, but while I was setting it up, security for it was a big chore, with defaults that are in no way sane. AWS EC2 does a great job at secure defaults for auth and firewalls, RDS even more so. Why was ES left to wag in the wind out of the box?