DeWitt Clause, or can you benchmark %database% and get away with it
You can add Splunk to the list of companies with a similar clause. As a Splunk competitor it makes sales a bit harder initially (we can show our product's numbers, but nothing to compare them against), but if you can convince customers to set up a head-to-head proof-of-concept of their own, well, they tend to figure out why Splunk doesn't want you publishing benchmarks...
How do potential customers react to "These are our numbers. We would compare them with Splunk's, but their license forbids publishing benchmark results"?
Can you give your name? I'm fighting against our Splunk decommission project because the big boys tell us we can just use ELK, to which we reply that it means months of dev work to reproduce Splunk's abilities, to which they reply that human cost is invisible but license cost is a sore point for the board...
I love Splunk, it works so well after data is ingested so... who are you if you're better?
It's Gravwell (gravwell.io). Depending on exactly what you're doing in Splunk, it could be a pretty easy transition, and we've even started writing tools to migrate data out of Splunk and into Gravwell. We've got a free 14GB/day community license if you'd like to play with it on your own, or you can email sales for a POC.
Heh
The only number you need to compete with Splunk is a smaller price tag. I'd bet a lot of people would switch solutions and not look back
Can one provide a benchmarking suite that anyone can execute without posting the results of the test? Thus allowing others to run the test themselves easily but not putting you on the hook for the result?
As I understand it you can give instructions and probably even build tools to do the benchmarking, it's just that when the Splunk customer runs those tools, they are forbidden to share the results with you or anyone else.
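One way such a vendor-supplied tool could work, sketched as a toy (the `run_benchmark` harness and its names are hypothetical, not any real vendor's tooling): the harness times the customer's own workload locally and prints the numbers only to their terminal, so the vendor never publishes anything.

```python
import statistics
import time


def run_benchmark(workload, iterations=100):
    """Time a user-supplied workload callable. Results stay on the
    user's machine -- nothing is reported back to the vendor."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        workload()
        samples.append(time.perf_counter() - start)
    return {
        "iterations": iterations,
        "median_s": statistics.median(samples),
        "p95_s": sorted(samples)[int(iterations * 0.95) - 1],
    }


if __name__ == "__main__":
    # Stand-in workload; a real suite would issue queries against
    # the database under test.
    print(run_benchmark(lambda: sum(range(10_000))))
```

Whether the customer may then *share* those printed numbers is exactly what the DeWitt clause forbids; the tool itself is fine.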
One of my personal bugbears is the DeWitt Clause for Datomic, especially because knowing the performance profile of Datomic is very important for understanding whether your app will be a good fit for it given some of its peculiarities.
You're free to benchmark it yourself and not publish the results.
The performance of it depends heavily on a variety of factors which may or may not apply to you.
Like a lot of software, the devil is in the details.
Has either the DeWitt Clause or the DeWitt Embrace ever resulted in some kind of legal action?
It seems like more of a threat stance to various partners and ecosystem players than anything else.
It does result in cease and desist threats quite often. We have been on the receiving end of one.
Oh! Would love to learn more :-)
As I'm the author of the blog post in question, I could include your account there, if you'd like.
I think someone from Oracle would be more informed on that matter. JK. On a more serious note, who would dare to displease a multibillion corp with hundreds of lawyers (without being backed by a similar co & lawyers)?
These things become substantially easier when approached correctly.
In this case, never run Oracle software. Not only will it vastly improve your mood during budget season, but your developers will be less likely to stab you in your sleep, and you will never have to worry about Oracle's primary line of business: lawsuits.
And you don't care how they benchmark.
Here the Jal clause is born: you are now not allowed to speak about what Oracle's main line of business is.
Could you pirate the database, then hide behind the fifth amendment to not reveal that you're a pirate while simultaneously asserting that you never agreed to any EULA? I'm not sure what the legal rights are here.
I'm certain someone in say, China or Russia, could pirate the database and run benchmarks on it with no repercussions. Surprising that this isn't a business model for an overseas technology analyst firm.
> Surprising that this isn't a business model for an overseas technology analyst firm.
How much are you willing to pay for a legally dubious benchmark?
Does anyone ever pay for benchmarks?
Or are they web content used to lure in new contracts?
It's much simpler than that.
Person A installs database on a shared or to-be-sold computer, requires a license for the installation process to make a copy, "agrees" to EULA.
Person B then runs benchmarks on said computer, which does not require a license because no copy is being made, and publishes the results.
The only flaw in this is that Oracle will send its mafia enforcers to break your kneecaps despite not having a valid legal case. So you'll lose even if you technically can win.
The Fifth only protects the innocent. It's a fun twist of this amendment: if you are guilty, you do not have a right to keep silent.
That's true if you've been convicted and sentenced for the crime regarding which your testimony would self-incriminate, but not otherwise. Someone who has committed the crime but hasn't yet been convicted and sentenced still falls under its protection, assuming there isn't a grant of immunity from prosecution to force the testimony anyway.
Other way around, sadly:
https://en.m.wikipedia.org/wiki/Haynes_v._United_States
Convicted felons are exempt from the portion of the National Firearms Act that requires that machine guns (and other NFA items like short barreled shotguns) be registered as it would violate their 5th Amendment rights.
Huh, that is one of the most interesting supreme court decisions I think I've read. I kind of agree with it in a text-of-the-law sense.
I would be very curious to see that logic hold up in court these days. They got Al Capone on tax evasion for instance, but wouldn't paying his taxes on ill-got funds have been incriminating?
and in any sane legal system you are innocent until proven guilty.
> Oracle also inserted a clause in their terms of use that boiled down to the fact that one can’t publish benchmarks without getting an explicit approval from Oracle.
This feels horrible and would make me look away from any software that has such a clause. Then again, i use very little proprietary software, and when i do, it's mostly due to someone else choosing it for a project and me just needing to bite the bullet.
Though in regards to databases, i'm not sure why you'd fork over the cash and use something proprietary, unless you're trying to get rid of any sort of liability on your own end. Then again, i'm pretty sure that you could also find someone to offer support for your PostgreSQL or MySQL/MariaDB deployment, if you wanted to waste money (or did anything so interesting where such support would be warranted).
> Some cloud vendors permit you to benchmark their service but require reciprocity: you must make the benchmark reproducible and allow benchmarking of your own service or tool in response.
This is a bit better in comparison.
Though licenses in general puzzle me. For example, MongoDB is licensed under SSPL so anyone who offers it as a cloud service would have to open source their entire infrastructure: https://www.mongodb.com/licensing/server-side-public-license
And yet i don't think that Digital Ocean is: https://www.digitalocean.com/products/managed-databases-mong... (or maybe they offer the older non-SSPL version).
The whole enforcement angle feels like it would impact an individual who benchmarks databases instead of reading bunches of legalese more than it would impact a larger company that could "figure things out".
That looks like an official DO partnership
Having only cursory experience with Oracle databases (as in install and run some queries and that's it), is there any advantage to them over MariaDB or PostgreSQL? Better development experience, easier to tune or no tuning necessary, anything that makes it worth over the free database servers?
As someone who spent some time in Oracle land: MariaDB can't hold a candle to it, but postgresql comes close.
Some things are better in Oracle vs Postgres (and I might be dated on my Postgres knowledge): the active/active failover story of Oracle is better with RAC, and autovacuum horror stories don't exist in Oracle.
Also pro oracle: The 'enterprise' ecosystem is better. Everything enterprisey integrates with oracle, postgresql is still a toss up.
But at the end of the day, I still vastly prefer postgresql. The endless list of weird idiosyncrasies and limitations in oracle makes you always feel a bit dirty, compared to the relatively clean syntax of postgresql. In oracle land, it is common to wait 1 major version before using new features, because they are unstable when released.
And dealing with oracle support is hell with an additional bonus of pain. They take months for a simple bug fix. They won't admit a bug exists, then call you at 3AM and give you a patch written 2 years ago.
Oracle licensing is a game for advanced poker players. It will be expensive. Then you negotiate, walk away with a 40% discount, making it more expensive than competitors, and find out later it was still a bad deal. They'll interpret standard words like CPU in a slightly different way in their licenses, and finding out in an audit will cost you a lot. Licensing is a never-ending drain on your time, and you will lose their games in the end.
As someone who worked with Oracle DB for quite some time:
Stay as far away from it as possible. It only exists to milk already 'captured' companies, and all competitive niche advantages it had were slowly taken over by postgres.
About 10 or 15 years ago, OracleDB was years and years ahead of other databases. They had better replication, they had a better query optimizer, they had better storage management. OracleDB was the big thing you wanted in a business for a reason.
However, by now, MariaDB and especially PostgreSQL have caught up so much that this edge is gone and it feels like they are just siphoning money from companies who have invested in their big oracle cluster years ago. I do veto any new oracle-first or oracle-only development at work.
Interesting. As a SaaS vendor, we do not allow performance testing of the production system. Because, you know, just casually saturating production resources can become very iffy for strange and unexpected reasons. And you will always be able to saturate a system, or a subsystem of a subsystem of the system.
However, we have provided bigger customers, or customers willing to pay for it, with performance testing environments. We have, however, usually survived the curiosity phase - "just how much do I have to throw at this thing to break it?".
Looking at the language, almost all of them allow you to run benchmarks, since it's phrased as "you may not publish benchmark results"; it doesn't forbid actually running them. Never mind that MS-SQL, Oracle, etc. are not SaaS vendors, of course.
To be honest, if a cloud vendor has technical problems with someone running a few benchmarks then that would make me very wary of said cloud vendor. What's the difference between a "benchmark" and "using all resources I paid for" anyway?
For a smaller/younger SaaS: if a customer environment is suddenly running at 100% of some resource when it wasn't before, that's an important thing to alert on / investigate.
For established players it’s lost in the noise, but if it were me I’d appreciate a heads up for big changes.
Sure, a heads-up is certainly nice, but I don't think that running a (reasonable) set of benchmarks is all that out of the ordinary, or any different from just taxing the service at 100% with some periodic batch job or the like. Paying for it is even stranger IMO.
And for what it's worth, I did actually work for a few small SaaS businesses, but a few reasonable benchmarks wouldn't have been a problem.
Of course, if your benchmarks are going to take 50 hours it's a different story.
Also: I suspect a lot of these database SaaS services are a lot smaller than you might think. I know at least one of them is, anyway, because I worked there (and there's no DeWitt Clause).
These specific customers were not reasonable though. They wanted to specifically know when and how the software breaks, because they had been burned by previous vendors. Eventually sales reacted with a rather frustrated "then pay us 10 engineering days so we can setup a dedicated system for this so you can run your tests" and instead of cancelling the deal, they were like "Ok. Here you go, let's go"
And naturally, they were able to break it eventually, but they could have had all global employees trigger requests 10 times per second and the system would've held, and it recovered as soon as the load was gone. It was a silly deal, but now it's a great business partner.
An occasional spike probably won't even be noticed. Redlining it when you usually hover much lower, without ceasing, -should- trigger alarms, and get engineers going "WTF is going on?!"
Give the vendor a heads up so the engineers can sleep.
There's a really big difference between "don't performance test on our hardware that you're sharing with other tenants" and "don't performance test on our software no matter whose hardware it's running on".
Our database RonDB, by Mikael Ronstrom (ex-MySQL), is DeWitt-free and we promise to keep it that way, even though it is now a managed DB in the cloud.
If you want to benchmark for internal reasons you don't publish the results and nobody knows. If you want to make a service to the community, run your benchmarks, download Tor and publish the results anonymously. I don't see what the big deal is?
Is this only limited to marketing claims where you post it on your company's website?
The only way I can openly talk about a service's performance is by doing it illegally and you don't find that weird?
How likely is it that one takes an anonymous benchmark published by a noname researcher seriously?
At the same level as if it were from an acclaimed researcher: not only did they do the work, they also risked breaking the law to disclose it.
This isn't like the traditional case of no-name vs trusted name: due to the law you _must_ be anonymous to post this, so anonymity isn't a red flag, it's the standard.
Extremely.
Why would you assume people generally do background/credential checks on researchers?
Sure it's nice to have it on Phoronix or something, but it's by no means a deal-breaker if it isn't.
Aren't benchmarks supposed to be more or less reproducible? So just publish the results and the data?
100%
Have you met programmers?
It's mainly related to MSSQL and Horracle. Horracle will just use their legal team (which is bigger than their engineering and developer staff) to bludgeon you over benchmarks.
"This just in Oracle legal team takes down entire Tor Network"
Jokes aside, I'm surprised they're so touchy about these things. They can make plenty of money without it, and they could also save plenty of money with fewer lawyers.
If rationality was always used we wouldn't have had Putin making a gigantic, catastrophic geopolitical mistake. A little common sense goes an awful long way, but only if you choose to use it.
MSSQL is destroying its own market by its pricing (and the licensing thicket, jesus christ, I've been looking at them for 2 days now and... shudder)
Used to just use PostgreSQL on Azure just fine. I am really surprised people use MSSQL or Oracle today, honestly. You really don't need to tie yourself down to a proprietary DB anymore.
> I don't see what the big deal is?
The big deal is that it's slanderous.
Oracle isn't making an attestation about performance that the benchmark addresses; the benchmark aims to make statements about Oracle for some other reason, and that's important: rights to free speech generally end when they cause harm, and bearing the full cost of defending against false benchmarks is certainly harm.
Look at it this way: The clause aims to prevent purchases (entering into agreement with Oracle) under false pretenses. Oracle sells software to solve business problems, not you-need-a-paper-that-makes-Oracle-look-bad problems, and I think vendors are wise to protect themselves from that.
On the other hand, if you actually bought Oracle to solve a problem, and it didn't do that, you're still free to make those benchmarks and sue the shit out of Oracle with them, and this agreement can't by itself prevent the benchmarks from reaching the public record at that point.
> Is this only limited to marketing claims where you post it on your company's website?
If your company makes X and your company website contains a benchmark saying Oracle is slower than X, you're not just making a statement that you observed Oracle was slower than X, you're also making an attestation that the benchmark is a fair representation of both Oracle and X. And judge and jury are going to be wondering if it's as fair as you say, or if it's unfair as Oracle says.
Now, if you're a university and you don't make X, you might be able to argue that even if it's unfair, it was done in good-faith, and judge and jury may believe that, but Oracle will ask, if you truly believed X was fair, why didn't you get our feedback before publishing? and you'd better have a good answer to that.
On the other hand, if you choose to be anonymous, you may be able to avoid the judge and jury, but the community has to wonder who you are, whether you are motivated by a relationship to a company or product that competes with Oracle, or an impatient researcher who can't meet the standard of professional publishing. The community will wonder, but they have lots of other things to wonder about too, so they probably will not wonder for very long. So what's the point? Techdudes already know what they think of Oracle, and nobody who writes code talking about Oracle thinks that Oracle was chosen for its benchmarks, so who is this anonymous benchmark for?
And here we can see Oracle lawyers in their natural habitat, spewing bullshit rhetoric to protect them against normal usage of their proprietary software.
>The big deal is that it's slanderous.
Then you don't need to forbid benchmarking in the terms, if someone posts a slanderous/libelous benchmark then sue them for that.
Honestly, the presence of that clause screams to me "this app sucks and we'll sue you if you tell anyone how badly". That may not be the case whatsoever, but my first assumption is that they're trying to hide terrible performance.
While I would prefer if this clause was not a thing, I also understand why it exists even for great products.
It is surprisingly difficult to reproduce many workload benchmarks and quite easy to engineer a benchmark that misrepresents real-world database performance. There are tools that exist to generate optimally pathological workloads that target specific database implementations, while looking completely reasonable and innocuous. It doesn't even need to be a bad faith benchmark by a competitor, there is a high probability that the person configuring the environment does not know how to do it correctly and/or optimally.
The DeWitt Clause is a defense against the unfortunate pervasiveness of incompetent and/or bad faith benchmarking. Companies have a well-founded reason to not trust third parties to do a good job of representing the performance of their product.
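To make the "innocuous-looking but pathological" point concrete, here's a toy illustration (a plain LRU cache standing in for a database buffer pool, not any real database): a loop that cycles over exactly one more key than the cache can hold looks like an ordinary sequential scan, yet yields a 0% hit rate.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny LRU cache standing in for a database buffer pool."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def access(self, key):
        if key in self.data:
            self.data.move_to_end(key)  # mark most recently used
            return True                 # hit
        self.data[key] = None
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used
        return False                    # miss


cache = LRUCache(capacity=100)
# Cycle over 101 keys in a cache that holds 100: every key is evicted
# just before it would be needed again, so every access misses.
hits = sum(cache.access(n % 101) for n in range(1000))
print(hits)  # 0
```

Shrink the modulus to 100 (equal to capacity) and the hit rate jumps to 90%, which is why such a workload "looks reasonable" until you know the system's internal sizing.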
I understand your point, and that’s a reasonable argument. I do disagree with it, though.
Imagine a hypothetical FooDB by Bar, Inc. If Bar never put that clause in the FooDB license, then I think you’re absolutely right. People would come up with some awful-looking benchmarks that made it look bad. However, what a golden opportunity for Bar! They could step up with some free or steeply discounted consulting to help the benchmarker fix the problem and publish new, good results. They wouldn’t have to do that too many times for word to get around on sites like this: FooDB is nice and fast when you tune it correctly! That would come along with some enormous goodwill, and also the assumption that if your FooDB installation is performing poorly, then it must be your fault because all the benchmarks say it’s really fast for everyone else.
I’m not going to tell Bar what their business model should be. I have my thoughts on it, but it’s their business to run as they see fit. But if I see Bar being open and helpful with a freely-accessible tech blog telling you how to make FooDB stand up and dance, I’ll tend to believe that it’s probably an interesting product to look at. If they guard those secrets behind a wall of lawyers and sue people who speak ill of FooDB, I’ll tend to believe they’ve got something to hide. Either one of those beliefs might be completely wrong, but that’s still how I’m likely to perceive it.
I don't think this is a reasonable argument: bad faith stats and data manipulation exists in literally everything.
We should all be free to express our thoughts and backing data, and participate freely in the marketplace of ideas.
No corporation should be able to put gag orders on people especially when they are biased and have good reasons to want to control the discourse
If a pathological workload can look completely reasonable and innocuous, then what's the difference between a bad-faith benchmark and a user making an honest mistake?
Reproducible benchmarking can also inform the person/team that their configurations are suboptimal. How are you supposed to know how good your "best" is if you aren't allowed to talk about shakedowns?
All the BSL/SSPL ones shouldn't be in an "open source" section. Just change the heading to "source available" or put them with the "vendors".
Author of the blog post in question here. Let me clarify: they shouldn't be there because they're not OSI-approved, right? Just wanna get your point here.
(While I understand that BSL/SSPL lack certain liberties, I deemed it okay to mark them as "open source" for the purposes of this post.)
Not just lack of OSI approval, they're attempting to redefine the long accepted meaning of open source to include their new licenses. They want the goodwill of being "open source" without the obligations. The only sorts of licenses that have consistently been considered open source are either copyleft licenses like the GPL and do whatever the hell you want licenses like MIT and Apache. Do whatever you want... unless you're a big corporation... or unless you're part of a group the authors deem evil/immoral/unethical... etc. is a massive departure from the spirit of the term open source.
Makes sense! Will edit when I'm next to my laptop, I promise.
Thanks!
As you say in that section, "Open source licenses usually grant users permission to use open source software for any purpose." These source-available licenses (SSPL/BSL) are not open source exactly because of that: they place restrictions on users from using the software for any purpose. For example in the case of SSPL, it includes purposefully draconian language in the license that basically means that it can't be used freely to provide a "service". These kinds of "poison pills" obviously stop you from using the software for certain purposes.
> unless you're part of a group the authors deem evil/immoral/unethical.
What parts of the license mention that?
It is most likely in the "Additional Use Grant" which is tricky - because this additional use grant is distinct to each product licensed under the BSL. This additional use grant is also not that easy to find, since some licenses display it prominently, and others hide it under some additional legal fineprint[1,2].
From mariadb site: https://mariadb.com/bsl-faq-adopting/#limits
"Q: What are the usage limitations under BSL?
A: The usage is limited to non-production use, or production use within the limits of the “Additional Use Grant” defined by the vendor using BSL and specific to each BSL product."
[1] obvious Additional Use Grant for Couchbase, included clearly in https://blog.couchbase.com/couchbase-adopts-bsl-license/
[2] it is extremely difficult to find the Additional Use Grant for Mariadb products themselves. For MaxScale, which is their proxy product, it is buried in a file within the source code (which on the surface level might be a good place for it, but it's not very easy to find and I had to go through lots of legal print to get to it) : https://github.com/mariadb-corporation/MaxScale/blob/2.5/LIC... or https://github.com/mariadb-corporation/MaxScale/blob/6.3/LIC... depending on which version you are trying to use, etc.
I wasn't referring to the BSL specifically with that comment, I was referring to recent attempts to call licenses like the Hippocratic License "open source" despite being anything but. The BSL is just a different example of a similar thing.
> they shouldn't be there because they're not OSI-approved, right
No, they shouldn't be there because they aren't open source; OSI approval has nothing to do with it. (Though it's still a good indicator, similar to how FDA approval doesn't change whether/how well a drug works/is safe.)
It would be quite refreshing if we could have a story in which Oracle are the good guys for once.
I'm sure they are at least purchasing some modern-day 'indulgences' by - for instance - donating food to starving north korean elites?
If Oracle ever wants to be the good guys just once, I have an idea for them that's right in their wheelhouse. Step 1: buy grsecurity's kernel hardening patches. Step 2: put said patches in the publicly released UEK source. Step 3: wait for grsecurity to refuse to give them future patches. Step 4: sue grsecurity for imposing further restrictions on the exercise of rights granted by the GPL.
I think the weakness of your model is that just because you have the right to distribute a certain patch level does not mean you automatically have the right to distribute further patches.
Conversely, if the right to distribute is revoked, say via a GPL to closed-source license change, you still have the right to distribute any versions originally distributed under the open license.
A good example of all this is the sordid history of Berkeley DB.
Exactly right - you can get the GPL source for v1.0, for which you've bought and paid for access to the binaries, but there's nothing that says they have to sell you binaries for v1.1, and thus you don't have any right to v1.1's source, despite it being GPL.
Matthew Ruffell was keeping the last open grsecurity patchset alive as part of Dapper Linux, but it's stuck on 4.9 and never managed to get a bootable 4.13 port.
Another good thing Oracle could do, is to release a CDDL update that is GPL-compatible.
They try to be the good guy. Their free tier is quite extensive (24GB of RAM, 4 ARM vCPUs and ±2 AMD cores, and several hundred GB of storage), good enough to run quite a decent personal cluster on, probably to lure in businesses for their AWS-style cloud services, which are as ridiculously expensive as the competition.
However, just like AWS, Azure, and GCloud, their admin UI is complicated, slow, frustrating and full of invented acronyms and quirks.
I've read various stories over the years about Oracle extremely aggressively pushing high bills because they think you're using the "free" version of MySQL or VirtualBox in a way that you're supposed to pay for it. I'd be very wary running anything "free" from Oracle (as in: I wouldn't).
I had the same worries, but my Oracle account literally can't access any paid services. In fact, it's so locked down that when my trial expired (you get some tryout credit), I couldn't even pay to continue operating if I wanted to; the resources had to be recreated from scratch!
I wasn't sure if you were joking, so I googled "oracle north korea". This is the first result:
https://www.upi.com/Top_News/World-News/2020/10/07/North-Kor...
It's a strange feeling to find myself agreeing with the North Korean government.
Oracle Virtualbox is pretty good, and free for personal use. They also make patches available for their Oracle Linux kernel - the UEK, and in a better format than RedHat.
They are aggressive about the "personal use" thing. I used to work for an ISP. Apparently our customers would download Virtualbox, Oracle would pull the IPs out of their logs, find us, and then email us and ask us to buy a license. (More threatening than asking nicely, IIRC.) We informed them that we're an ISP and those IP addresses are our customers', not our office. And no, we didn't give them the contact information of our customers.
I wonder how many of their sales reps have gone after Spectrum and Comcast since the whole work from home thing started.
Do not make the mistake of anthropomorphizing them...
They've done a lot of good stuff with Java.
Could this article comparing DeWitt Clauses be considered benchmarking?
No, it's just an analysis of the licenses, not of the software itself.
Or benchmarketing?