Removed gem breaks Rails ActiveStorage
The reason this is happening is not obvious without reading https://github.com/minad/mimemagic/issues/97
> I've historically been the maintainer of shared-mime-info for around 15 years, and script/freedesktop.org.xml looks like it's a copy of the database shipped with shared-mime-info, which is released under the GPL, with shared-mime-info's translators work merged in, and the GPL header removed.
> The license that you're shipping mimemagic under (MIT) isn't compatible with shared-mime-info's.
Seems like quite a reasonable request, even if folk don’t like the results.
..and to be clear, I’m quite sure that rolling back to the commit before the license change does exactly nothing to address the issue.
You don’t magically get your MIT license back by forking before the license change was added, that’s not how it works.
If the previous version contains GPL code, it’s GPL. It doesn’t matter if you slap an MIT license file on it, or used it in “good faith” presuming it was MIT license.
One thing I am not sure is why such a radical action was taken so quickly without thinking carefully first? It's not like a lawsuit was threatened or something. The original request in https://github.com/minad/mimemagic/issues/97 that you linked to was very polite and professional.
1) A time extension to remove the GPLed code could be politely requested. I know that the copyright belongs to all contributors but getting on good terms with the maintainer could be a solid first step. I think just opening a PR with that file deleted (and tests failing) could have been interpreted as a willingness to comply with the request in good faith.
2) A request to relicense the XML file in question under LGPL could have been sent to the original project (could be a problem without CLAs, but still worth a try). Then the library could have been relicensed under LGPL.
3) Gem users could have been notified. Some prominent people from those projects could have helped with (1) and joined a kind request (2) to the original project.
At least that's how we'd (try to) handle it on our project under Eclipse Foundation (though we used to have GPL code scanning for releases in the first place until very recently) if such a situation arose. Anyway, talking to people first before doing something quickly is often a good idea.
> One thing I am not sure is why such a radical action was taken so quickly without thinking carefully first?
I think this can be answered by considering the following:
> At least that's how we'd (try to) handle it on our project under Eclipse Foundation…
Looking at https://github.com/minad/mimemagic, I did not get the impression that the software was backed by any organization, let alone one on the scale of the Eclipse Foundation. If indeed the software is essentially one person, imagine it from their perspective.
> Anyway, talking to people first before doing something quickly is often a good idea.
Assuming that this was not intentional (see Hanlon's razor), you could quickly have groups like the gpl-violations.org project taking notice, and things snowballing from there. I'm not calling out gpl-violations.org specifically here; instead, I'm noting that there are other people who _would_ do something quickly.
Another thing to note is that u/minad is (per their GitHub profile) in Germany. That will also affect their opinion on things related to licensing.
> One thing I am not sure is why such a radical action was taken so quickly without thinking carefully first? It's not like a lawsuit was threatened or something.
Once you've been informed of a violation, you have a legal duty to act, no? Regardless of whether counter-action is immediately threatened. (Not a lawyer, not legal advice)
At the end of the day, it's people involved, and people have the capacity for understanding and empathy.
A safe course of action would be for the maintainer to respond with a message like "thank you for bringing this to my attention. Many products and services depend on this package and would be disrupted by any immediate action. I will bring this to their attention and work with them to remove the dependency as swiftly as possible and then remove all available versions of this package from where they are hosted."
If someone brings lawyers to the table due to lack of immediate action, maybe then we can proceed to a more immediate, if disruptive, course. But no need to rush there if there's no external pressure to act that fast.
I completely agree with you in principle. However, if there are potential damages involved, it's hard to argue that you're not increasing your exposure by delaying or deferring the correction. (Again IANAL and this ain't legal advice.) Lawsuits aren't to be taken on a whim. Even if you ultimately prevail, the affair can change your life, and not for the better. So I can't blame anybody who wants to skip the lawyer and minimize their exposure, even if doing so angers a large number of developers—to whom they have no formal obligation.
The question is, what act do you take? It's possible to negotiate and get a grace period to get into compliance, for instance.
> If the previous version contains GPL code, it’s GPL. It doesn’t matter if you slap an MIT license file on it, or used it in “good faith” presuming it was MIT license.
This depends.
Rails used a gem by a different developer, a gem that had its own MIT license. The Rails project and all others using Rails cannot be expected to have known that the license was invalid, so usually the GPL does not count for their usage back then.
You can in general never retroactively change a license, so their usage back then was certainly valid. You can [be forced to] stop using a license and re-license future versions of an artefact, and also possibly have to stop distributing the old versions. But that's on the gem's author, not Rails, and would likely not even impact future usage of the old, already obtained versions.
If the original author wanted to claim damages under GPL from Rails, he would have to do so via the gem's author. And even then: What damages? And would the projects have had to know? None and no is the likely answer, barring judicial incompetence or corruption like in the Oracle-API case.
It would further be complicated by the file in question being a database file. You typically cannot license databases in a meaningful way under GPL. Even if you could, reading a GPL'd database has no chance of carrying GPL code obligations over to the consuming program.
As always with those questions, this might depend on your specific jurisdiction. Also, it means in no way that it is not the ethically right thing to swap the dependency to one that does not have this issue.
PS: Also consider that in most uses of Rails, GPL or MIT does not change much, as accessing a server running GPL software does not trigger GPL's distribution clause (you want the AGPL for that). This already limits the impact here. The Github thread has comments in the direction of all Rails projects having to be open source now if the license changed to GPL. Not only can the license of old versions not change, this is also not the effect GPL would have.
"You can in general never retroactively change a license, so their usage back then was certainly valid."
No, it wasn't. It was reasonable, but not valid.
They were using copyrighted code without permission from the copyright holder, relying on a false claim. The false claim gave them no right to use the copyrighted code, and will not protect them if the copyright holder sues them. However the fact that they were acting in good faith and had no idea is likely to help them when it comes to damages. And furthermore if they got sued, then they would have the right to sue the author of the gem whose false claim got them in trouble.
> If the original author wanted to claim damages under GPL from Rails, he would have to do so via the gem's author. And even then: What damages?
I have no idea why you think that the copyright holder would have to go to the gem's author to sue about a copyright violation. Furthermore damages are not the only thing you can sue for. See https://www.lib.purdue.edu/uco/CopyrightBasics/penalties.htm... for a list. That statutory minimum of $200 per infringement can add up really fast when you're generating copies electronically.
I think the MIT license only claims that the code I wrote is provided under MIT (that's why you also have to include a NOTICES file listing other library licenses in addition to the LICENSE file). It's not like they put an MIT header and their name on that XML file.
> then they would have the right to sue the author of the gem whose false claim got them in trouble
I think this is where a useless all-caps text comes in handy:
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND [...] AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Noninfringement is mentioned right there. It literally says that I DO NOT promise you that my code that I license to you under MIT (in good faith, ofc) does not infringe anyone's rights.
You have a point, but it is not as absolute as the license suggests.
The problem is that a license is overridden by local law if there is a conflict. For example, suppose that there is a law saying there is an implied warranty that goods sold are yours to sell, and you sold a stolen good "as is". In that case the law wins and the buyer can still sue you for having sold them stolen goods.
And as https://www.klemchuk.com/legal-insights/warranty-against-inf... explains, a common local law is an implied warranty against infringement on others' intellectual property. Which a copyright violation would qualify as.
As always, I am not a lawyer, and this is not legal advice. If something like this arises in practice, you should consult a lawyer familiar with the laws of the venue that the case will be decided in to find out whether any laws apply, and to what extent the generic liability disclaimer won't actually provide protection.
> They were using copyrighted code without permission from the copyright holder, relying on a false claim.
Again, that does not seem to have been the case here.
> I have no idea why you think that the copyright holder would have to go to the gem's author to sue about a copyright violation.
1. It depends where you are, which jurisdiction gets applied. Might explain the different expectation. 2. It'd be the gem author that created an unlicensed derivative work, not anyone else directly. Have fun claiming damages, copyright infringement or anything for indirect usage in such a good faith situation. I really think that wouldn't fly, but again, might depend where you are.
> Again, that does not seem to have been the case here.
Again? Not sure where you said it. But the copyright holders in question are the authors of shared-mime-info, and they certainly never gave permission for their work to be used by Rails in the way that it was.
> It depends where you are, which jurisdiction gets applied. Might explain the different expectation.
I'm in the USA. But I'm pretty sure that what I said is generically true.
> It'd be the gem author that created an unlicensed derivative work, not anyone else directly.
Copyright is triggered by downloading unlicensed copies. And lots of people other than the gem author did that.
An unlicensed derivative was created by anyone who used Rails and wrote code that did mime detection - for example they were handling uploaded files.
It is an open question whether these cases are worth litigating, and what would be decided in court. They might well decide that there isn't enough creative work in the compilation for the file in question to have copyright protection at all. But in the meantime it would be a generally good idea to treat the issue seriously, and to accept that lots of people are potentially liable here. (Even if, in all probability, none will suffer more than a temporary inconvenience as the dependency is removed.)
> Again? Not sure where you said it
Here, it was in the comment (and not an edit :) ):
> It would further be complicated by the file in question being a database file. You typically cannot license databases in a meaningful way under GPL. Even if you could, reading a GPL'd database has no chance of carrying GPL code obligations over to the consuming program.
But I actually just wrote again because I made that point in another subthread, no criticism implied.
Not so fast in that claim.
First of all the infringing file is https://github.com/minad/mimemagic/blob/master/script/freede.... Sure, it is in XML. But it contains a tremendous amount of free-form text, specific sets of pattern matching rules for the data types, and so on. It is a compilation of sometimes original research on the best ways to detect file types. Ruby has other mime libraries. The reason why this one was chosen is that its detection algorithms make better choices. And the reason that they make better choices is that they copied the decision rules from a GPLed project.
But even if it were a simple compilation, it still is not guaranteed that there is no copyright. See https://en.wikipedia.org/wiki/Copyright_in_compilation for an introductory article on what can and can't be copyrighted about a compilation. And one of the elements that matters is creativity in the selection of the material. A set of rules with a lot of "look for this" while leaving out various reasonable thats that don't work so well shows considerable creativity.
That said, a judge may decide otherwise. You never know until a judge decides. But I would not presume that there is no copyright interest to be had here.
> Rails used a gem by a different developer, a gem that had its own MIT license. The Rails project and all others using Rails cannot be expected to have known that the license was invalid, so usually the GPL does not count for their usage back then.
> You can in general never retroactively change a license, so their usage back then was certainly valid.
I would ask a lawyer about that. As it has been explained to me, the original author didn't have the right to distribute it under the MIT license, so they (rails) never had a valid license. It's similar with images, even when you grab it off flickr or another page and it specifies a license you like, that does not mean that whoever posted it there actually had the right to do that, and if they didn't, you can get sued.
You are right, that's a shaky part and where insecurity is coming from - and sure, get a lawyer if you want more certainty. Depends where you are anyway. Better answers to that are already here. Just one thing:
> the original author didn't have the right to distribute it under the MIT license, so they (rails) never had a valid license.
Thing is, if it's really about a database file that was not copyrightable, the gem author did have the right to distribute it. That's a happy circumstance of this specific case, making all of this less severe either way.
How is one supposed to reasonably know, when downloading a package from a public repository, that the included license is authoritative? Are we supposed to research every package we use, and scour all software in existence to maybe trace back true ownership to someplace else? Seems like an auditing nightmare.
You can't. If you're notified then you need to promptly fix the issue with the complainant. When it comes to being sued for damages you can point at the fact that there was no reasonable way for you to know that the license you trusted was invalid, and at the author, who was presumably negligent. If you've cooperated fully and mitigated it quickly, that should protect you. Ignorance in this case is an excuse when it is reasonable and defensible ignorance, and not negligence on your part.
> Seems like an auditing nightmare.
Yes and that's why large companies are often extremely reluctant to take in 3rd party code without auditing and estimating the risk.
In fact they even sell insurance for this, and companies that want you to use their software can offer indemnity protection with the same effect.
"What if somebody sues me because my use of your software constitutes a violation of their intellectual property rights?" – "Don't worry, we will protect you. Since you pay so much money and are a valued customer of XYZcorp, we don't want you to worry about such things. You'll be covered by our umbrella policy."
This conversation certainly happens (although it almost certainly wouldn't have happened between any of Rails' "customers" and the Rails core team).
This has not been my experience. Getting the work done fast is prioritized more highly than the (small) compliance risk. Unless the company wants to pay you to invent a bespoke in-house version of React.
For using open source stuff while working on your machine there are often pre-approved licenses. But for production use, and even more so for software being distributed, at any serious place I have seen there is paperwork. (Sometimes of better quality, sometimes more of a rubber-stamp process.)
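To make the "pre-approved licenses" idea concrete, here is a minimal Ruby sketch of a first-pass check against an allowlist. The allowlist contents and the `flag_unapproved` and `installed_gem_licenses` helpers are made up for illustration; only `Gem::Specification` and its `name` and `licenses` attributes come from RubyGems. Note the obvious limit: this only reads the license each gem declares, so a gem that declares MIT while bundling GPL-derived data (as was the case here) would pass.

```ruby
# Hypothetical allowlist; a real one would come from whoever signs off
# on licenses in your organization.
APPROVED_LICENSES = %w[MIT BSD-2-Clause BSD-3-Clause Apache-2.0].freeze

# Takes [name, licenses] pairs and returns the names of gems whose declared
# licenses are missing or include anything outside the allowlist.
def flag_unapproved(gems, approved = APPROVED_LICENSES)
  gems.reject { |_name, licenses|
    !licenses.empty? && licenses.all? { |license| approved.include?(license) }
  }.map(&:first)
end

# Declared licenses of the gems installed in the current environment,
# as reported by RubyGems metadata (not verified against the gem contents).
def installed_gem_licenses
  Gem::Specification.map { |spec| [spec.name, spec.licenses] }
end
```

Calling `flag_unapproved(installed_gem_licenses)` would list the gems that need a closer look; anything beyond that (checking bundled files, tracing provenance) is the paperwork part.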
I guess this is subjective (though maybe not legally), but this lookup table of extensions to mimetypes doesn't feel like GPL "software". It's just a description of other software's conventions using the GPLed source as a reference: https://github.com/minad/mimemagic/blob/master/lib/mimemagic...
To create a non-GPL version, you would have to do what? Research extensions without letting your eyes see this GPLed list?
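For anyone who hasn't opened the file, the kind of data under discussion looks roughly like the Ruby sketch below: an extension table plus byte-signature rules. This is a hand-written illustration using three publicly documented signatures (PNG, GIF, PDF); the constant and method names are made up, and none of it is copied from mimemagic or shared-mime-info.

```ruby
# Illustrative only: a tiny hand-written table, not taken from mimemagic
# or shared-mime-info.
EXTENSIONS = {
  "png" => "image/png",
  "gif" => "image/gif",
  "pdf" => "application/pdf"
}.freeze

# Each rule is [byte offset, expected bytes, MIME type].
MAGIC = [
  [0, "\x89PNG\r\n\x1A\n".b, "image/png"],
  [0, "GIF8".b, "image/gif"],
  [0, "%PDF-".b, "application/pdf"]
].freeze

# Looks up the MIME type by file extension, case-insensitively.
def mime_by_extension(filename)
  EXTENSIONS[File.extname(filename).delete_prefix(".").downcase]
end

# Looks up the MIME type by comparing leading bytes against each rule.
def mime_by_magic(data)
  bytes = data.b
  _, _, type = MAGIC.find { |offset, sig, _| bytes[offset, sig.bytesize] == sig }
  type
end
```

The extension table is close to a list of facts; the signature rules are where choices get made about which patterns to check and in what order, which is the part the copyright argument in this thread turns on.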
> this lookup table of extensions to mimetypes doesn't feel like GPL "software".
Neither copyright nor the GPL is limited to software; collections of data are copyrightable as well, and thus they can fall under the GPL too.
> To create a non-GPL version, you would have to do what? Research extensions without letting your eyes see this GPLed list?
Yes.
Data in and of itself cannot be copyrighted under U.S. copyright law: a "creative arrangement" of it can be, but given that the data in question was generated from an XML file, I don't think you could make a claim that the arrangement was copied.
? I don't follow - it seems pretty clear that this falls under the GPL from my reading of what is copyrightable, ie. they can't copy your compilation of the data.
Your reading of what is copyrightable may not be entirely in accord with the US Supreme Court's, which has ruled that mere compilations of factual data (a telephone directory in the case that set the precedent) are not copyrightable.
https://en.wikipedia.org/wiki/Feist_Publications,_Inc.,_v._R....
If someone is trying to apply the GPL to stuff which isn't legally copyrightable in the first place (as may be the case here), then their copyright isn't enforceable, and neither is the GPL.
There are definitely jurisdictions where this has been found not to be the case. In Australia the protection of databases is incredibly murky[1].
Telstra, the privatized government telecom company, failed to protect the White Pages data from another company that, if I remember correctly, had put it on CD (ah those were the days).
[1]: https://www.mondaq.com/australia/copyright/290668/can-a-data...
> collections of data are copyrightable as well
Collections of data are sometimes copyrightable. Depending on the jurisdiction, it may depend on the details of the collection.
I agree. Lists of facts are not eligible for copyright in the US, most famously phone books. It’s debatable as to whether or not this file is pure facts but I’m having a hard time seeing it as a “creative expression”.
Or use another source that is non-GPL - that's proposed here: https://github.com/rails/rails/issues/41750#issuecomment-805...
In a twist of irony, the software for which the copyright claim breaking rails was made is hosted on the free edition of gitlab, which is based on rails.
And according to the Twitter bio of the individual who brought this up, he's affiliated with Red Hat, which is also affected [^1].
[^1]: https://github.com/RedHatInsights/compliance-backend/pull/79...
> You don’t magically get your MIT license back by forking before the license change was added, that’s not how it works.
And all RoR apps don't magically become open source because one of the dependencies got contaminated by GPL. It's up to the courts to decide, not parties who don't have standing.
> "You don’t magically get your MIT license back by forking before the license change"
Am I understanding this correctly? Say that for 15 years you have an MIT code base with only MIT code. Then yesterday, you add a few lines of GPL code. Then today, you remove 100% of the GPL code you just added in order to revert your codebase to being only MIT code ... it's no longer "MIT"? The GPL has now tainted the entire existing codebase?
No, in this hypothetical case you would be fine. The mimemagic case is more like this: 15 years ago you had an MIT codebase that contained hidden GPL-derived code. One day you are notified about that violation and relicense the whole codebase to GPL, but the code that was already in the repository before has never been truly MIT; it was always violating the GPL. To fix this you can either relicense the whole codebase to GPL (what mimemagic chose) or remove the GPL code from the codebase, stop distributing older versions, and continue your project as MIT.
Going back to an older version of the code does not change anything, as now everybody is aware of the violation, so you don't even have plausible deniability. You would need to fork from the commit before the GPL code was added, but this is impossible for mimemagic as it has contained this code since day one.
No.
This is what I personally dislike about the GPL: it is a viral license, without any 'cure'. The more aggressive licenses especially feel like a ransom. Either accept the GPL, or spend the rest of time trying to quarantine the GPL from your projects.
This is a huge unilateral attempt to make FOSS a certain way. And I don't think this kind of unilateral action does anything but set bad blood.
IANAL but I don't think it works like that. Using GPL code in a non-GPL project doesn't mean your code is automatically licensed under the GPL, it just means you're violating the license of the original code. How that shakes out (whether you have to re-license your project or just remove the offending code) ultimately depends on the two parties and the court system.
This is the sort of thing that makes some people really wary of the GPL and other "viral" licenses, and I don't think you can blame them. The "blame" for this falls on someone for throwing GPL'd code into an MIT project, but the headache drops onto a whole bunch more people down the line. It seems other commenters think this will probably be alright, but I bet this is a lot of corporate types' worst nightmare, that some underling added some segment of GPL code to their product, and now the entire thing is "technically" GPL.
One can only imagine if it was AGPL instead of GPL, and how people would debate if they should send source requests to all the sites running on rails ;-)
> but I bet this is a lot of corporate types' worst nightmare, that some underling added some segment of GPL code to their product, and now the entire thing is "technically" GPL.
IANAL, but I'm pretty sure this is _not_ how it works. Your code doesn't magically "become" licensed under GPL if you use GPL code. Your code is now in _violation_ of the GPL and one way of fixing it is to re-license your code. Another way is to eliminate the dependency.
However, if you decide to re-license to GPL then you may still have to pay damages for the time you were violating GPL.
In practice I can't imagine that a court would make anyone pay anything for this incident.
> Another way is to eliminate the dependency
That'll resolve the violation for future releases. However, all previous releases are still infringing.
For a violating company who really doesn't want to open source their project, their best bet would probably be to (remove the dependency and) pay damages for previous infringement.
You'd hope damages in a case like this would be small given it went unnoticed for so long. Considering the shared-mime-info project itself is not commercial software, there probably wasn't significant damage to the project or the authors.
> you may still have to pay damages
This is probably the first time I've seen damages mentioned in relation to GPL violations. Did anyone try to enforce this?
IIRC, there was someone who had written some networking code in Linux and independently started suing hardware vendors for GPL violations. The Software Freedom Conservancy said that he was doing more harm than good, and said that if someone is violating the GPL, lawsuits should first only require compliance with the GPL, then seek punitive damages if they fail to comply.
There's a site here https://gpl-violations.org/news/ which has some cases where there have been legal actions related to GPL violations.
There is at least one case[0] I can find. Probably it is exceedingly rare simply because companies are much more likely to settle, especially in the cheapest way possible i.e. stop distributing the tainted software.
[0]: https://wiki.fsfe.org/Migrated/GPL%20Enforcement%20Cases#Bus...
To be clear, stop distributing is often "good enough" but technically damages could still be sought for copyright violation.
It's pretty clear that distributing without a license (or in violation of one) is copyright infringement, and that's subject to damages.
However, most non-egregious copyright infringement cases are more about stopping future infringement than damages. So I'd be surprised to see much GPL enforcement with damages.
> and now the entire thing is "technically" GPL.
The "thing" doesn't become GPL, though.
They are in breach of the license, it's a major headache, and re-licensing the thing as GPL may be one way forward.
That's not an automatism, though, and no court would declare the thing GPL.
You may pay hefty "fictitious" licensing fees and (punitive) damages, you may have to stop distributing your thing, but you're not losing control.
> You may pay hefty "fictitious" licensing fees and (punitive) damages,
Except in cases like this you likely won't.
As it's clearly a mistake that you fixed ASAP, it's unlikely you'd have to pay more than small punitive damages.
As for license fees and non-punitive damages, it's a bit trickier, but it boils down to the damage done. As these libraries are only distributed under the GPL and are non-essential (they can easily be replaced), you will have a hard time showing that any damage was done or that the software could be sold for any non-negligible amount of money. And if no damage was done and there is no reasonable case for selling the software (i.e. for a non-negligible fictitious license cost), you can guess how the ruling will end.
If you had committed the violation intentionally or knowingly, and/or the software were essential and not easily replaceable, saving you a lot of money and/or giving you other benefits, things would be different.
But this isn't really the case in this case as far as I can tell.
I don’t think this situation is inherently different from buying a proprietary library, and discovering that the vendor stole code from the Windows kernel. Or a musician buying a sample, and discovering it was copied from a Disney movie.
You’re responsible for the stuff you use. You should audit it as well as you can—but realize that crap always happens.
It's a lot less likely that Windows kernel code or Disney music is going to be included by mistake, so your potential exposure is much less. In the case of the Windows kernel, it's a lot less likely that anyone is even going to have it because even the leaks of Windows code are distributed to orders of magnitude fewer people than GPL code.
The Windows kernel was a random example and probably not the best one. I don’t think it’s so crazy to think an employee at a vendor would copy-paste some code they wrote for a previous vendor.
My point generalizes, though.
As a rule, proprietary code isn't distributed widely, so there are few opportunities to include it, and as a rule, the harsher restrictions on distributing it make people less likely to not notice that they're not supposed to distribute it. It's much more likely that incompatibly licensed GPL code would be widely distributed and that it would be included by mistake.
Sure, you described a scenario where this can happen to proprietary code. It's not impossible, just less likely.
That’s fair. I guess I just don’t see this as a failing of the GPL. If I want to share some code so others can read it (for education/interest/research/whatever) I should be able to do so, while reserving all rights to reuse the code if I so choose.
So what is a good license for "everybody can use this 100% free of charge but please don't change one line and call it yours"? What about a company like Amazon copying your codebase, throwing millions at it and then leaving you in the dust?
MIT seems far too permissive now and I'm looking for a default license for my projects.
I have been a big fan of the Mozilla Public License 2.0 [1]. I find it is the best combination of "if you use this and improve or modify, those changes need to go to the original code" while not restricting overall usage.
IMO there really isn't anything you can do to prevent people from making a product out of your work if it is open source, but what you can do is make sure that if someone makes improvements to your work, those improvements need to be publicly available under the MPL2.0 license as well.
This has the effect that if someone wants to make a product by just 'adding one line', that line needs to be published and you could add it upstream, making it publicly available again (thus making it harder to make a product solely from your code).
Isn't that the same as lgpl?
> What about a company like Amazon copying your codebase, throwing millions at it and then leaving you in the dust?
Well, they can do that with the GPL, which is what spawned the AGPL, which didn't fix the problem either, hence the licenses from MongoDB and similar companies.
"everybody can use this 100% free of charge but please don't change one line and call it yours"
Well, the BSD licenses say that redistribution and use in source and binary forms are permitted provided that the above copyright notice and this paragraph are duplicated in all such forms.
I know some GPL advocates tend to feel they can remove the copyright statement, but BSD code is BSD code and requires your copyright statement to be preserved.
Different generations of the BSD and Apache licenses have had attribution clauses of varying strictness (older ones were stricter). Neither is copyleft (like GPL/CDDL), forcing future improvements to be open sourced. CDDL may be interesting as a way of preventing improvements to a code base from going "dark" but still being non-copyleft compatible. (Hello Sun/Oracle). Another commenter mentioned the MPL, which CDDL was based on for Sun's needs; that is also worth looking at.
> This is the sort of thing that makes some people really wary of the GPL and other "viral" licenses
True, though the people most concerned about GPL & related licenses are usually commercial users, and commercial licenses that include code access are no less "viral" than the GPL.
Exactly the same thing happens with non open source, proprietary code which leaks into open projects.
> makes some people really wary of the GPL and other "viral" licenses
It's worse than that, surely, as in this case avoiding GPL doesn't prevent the problem. From a medium-paranoid legal perspective, this sounds like it would "prove" that even non-GPL code isn't safe, thus discouraging the use of any open source software [edit: dependencies]
> One can only imagine if it was AGPL instead of GPL
Right, that seems like the only saving grace that avoids this being a potentially apocalyptic event.
Why would closed source software be safe? Say I copy shared-mime-info completely, compile it, sell it to you as MimeWizardPRO2000, you include it as part of your closed source web framework and sell that. You're still distributing GPL code without making your source available.
I think it's different if you are re-using source (with GPL notices) or binaries (which don't have them)
> the GPL and other "viral" licenses
I really hope someone writes an article with the title "what color is your license?"
> GPL and other "viral" licenses
“When others hurt me, I try to defend myself. But some tell me that this makes them sick. They tell me that I should permit people to rob me of my work. They tell me that I should never try to defend myself.
They tell me that I should stop using the GNU General Public License, a license that vaccinates me against hurt. Instead, I should adopt a license that permits other people to rob me with impunity. They want me to adopt a license that forbids me from fighting back. They want me to give up my right to benefit from a derivative of my own work, a right I possess under current copyright law.
Of course, the language is a little less feverish than this. Usually, I myself am not called “infectious”. Rather, the legal defense that I use is called “infectious”. The license I choose is called “viral”.
In every day language, words such as “infect” and “virus” describe disease. The rhetoric is metaphorical. A legal tool is not a disease organism; but it is popular to think of the law as an illness, so the metaphor has impact.
The people who want to rob me use language that says I make them sick when I stop them from robbing me. They do not want to draw attention to the so-called “disease” that makes them ill: my health and my rights, and the health and rights of other people. Instead, they choose metaphor to twist people's thinking. They do not want anyone to think that I am a good citizen for stopping crime. They want the metaphor to fool others into thinking that I am a disease agent.
The GNU General Public License protects me. The connotation of “virus” and “infect” is that my choice of defense gives an illness to those who want to rob me. I want freedom from their robbery; but they want the power to hurt me. They get sick when they cannot hurt me.
To use another health and illness-related metaphor, the GNU General Public License vaccinates me; it protects me from theft.
Note that the theft about which I am talking is entirely legal in some situations: if you license your work under a modified BSD license, or a similar license, then others may legally take your work, make fixes or improvements to it, and forbid you from using that code. I personally dislike this arrangement, but it exists.”
— Robert J. Chassell, Viral Code and Vaccination, https://www.gnu.org/philosophy/vaccination.html
1. Is a database like that even copyrightable, especially in the US?
> United States: Uncreative collections of facts are outside of Congressional authority under the Copyright Clause (Article I, § 8, cl. 8) of the United States Constitution, therefore no database right exists in the United States. Originality is the sine qua non of copyright in the United States (see Feist Publications v. Rural Telephone Service). https://en.wikipedia.org/wiki/Database_right#United_States
2. I'm skeptical that using a GPLed database makes this library a derivative work of the GPLed database, though the "distribute as a part of the whole" clause still applies
> These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works
> But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it.
> 1. Is a database like that even copyrightable, especially in the US?
Yes, collections of data are very much copyrightable, especially in the US.
This is not just a list of mime-types. It is a list of mime-types and instructions on how to detect those mime-types.
I would have interpreted simple patterns (e.g. value x at offset y) as non copyrightable facts about the file format.
Complex patterns could be problematic though, since you could argue they are original programs.
See the Olson Timezone database[1] as another example of "simple patterns" that are very much copyrightable.
The act of curating a collection of what may be "simple facts" creates a copyrightable work.
A farmer's almanac of seasons and weather patterns is copyrightable, even though the bare facts that it tabulates are not.
[1]:(https://en.wikipedia.org/wiki/Tz_database#2011_lawsuit)
> Olson Timezone database
That lawsuit was dismissed, in fact the article you linked says as much.
I stand corrected -- I was working from memory and didn't spot that development. Thanks!
I'll have to do a bit more digging to see if my original point still holds, even though the example I used to illustrate it doesn't.
E.g. see https://www.dmlp.org/legal-guide/works-not-covered-copyright -
> there may be situations in which a compilation of facts may be protected if the creator of the original publication selected, coordinated, or arranged the facts in an original way. For example, a sports almanac may arrange baseball scores in a creative way, a genealogy chart may arrange birth dates in an original way, or a cookbook may arrange ingredients in a creative and original way as part of its recipes. In each of those instances, the creator of the work would have a copyright in the creative arrangement of the facts, but not the facts themselves.
Though https://www.copyright.gov/circs/circ33.pdf says about recipes,
> the Office cannot register recipes consisting of a set of ingredients and a process for preparing a dish. In contrast, a recipe that creatively explains or depicts how or why to perform a particular activity may be copyrightable. A registration for a recipe may cover the written description or explanation of a process that appears in the work, as well as any photographs or illustrations that are owned by the applicant
So I'm not clear where the boundary actually is on this one.
Is that bit about recipes why so many recipes online come with a story?
I took a closer look at this database and library.
The actual patterns are very simple and standardized:
* The base case is checking whether a certain byte string can be found within a given offset range
* Patterns form a tree where all patterns from the root to one leaf need to match, which amounts to a restricted form of expressing "AND" and "OR" expressions
So it looks like there is very little space for originality in expressing these patterns.
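To make the two pattern rules above concrete, here is a minimal Ruby sketch of such a matcher. It is not mimemagic's code and does not read the database; `MagicRule`, `rule_matches?` and the WAV rule are names and data made up for illustration.

```ruby
# A rule matches when `value` occurs at some start position within `offsets`.
# `children` refine the rule: if there are any, at least one child must also
# match (AND along a branch of the tree, OR between siblings).
MagicRule = Struct.new(:offsets, :value, :children)

def rule_matches?(rule, bytes)
  hit = rule.offsets.any? do |start|
    bytes.byteslice(start, rule.value.bytesize) == rule.value
  end
  return false unless hit

  rule.children.empty? || rule.children.any? { |child| rule_matches?(child, bytes) }
end

# Hand-written example rule (an assumption, not copied from the database):
# "RIFF" at offset 0 AND "WAVE" at offset 8.
WAV_RULE = MagicRule.new(0..0, "RIFF".b, [MagicRule.new(8..8, "WAVE".b, [])])

header = "RIFF\x24\x00\x00\x00WAVEfmt ".b
puts rule_matches?(WAV_RULE, header)            # true
puts rule_matches?(WAV_RULE, "RIFFxxxxAVI ".b)  # false
```

The input is expected to be a binary string (hence the `.b` calls), so offsets count bytes rather than characters.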
* It doesn't appear to be a curated database, but rather aims for completeness (i.e. the selection or arrangement shouldn't be covered by copyright)
* Mime types and extensions are also very simple facts which can't be expressed in an original way
* The human friendly format allows a bit more freedom, but is still quite limited
IANAL, but I'd guess this database is not copyrightable in the US, but protected in the EU since it recognizes database rights.
Anytime you publish something, it is copyrighted. The data within may not be, but my presentation of it in a certain database certainly is.
When is my work protected?
Your work is under copyright protection the moment it is created and fixed in a tangible form that it is perceptible either directly or with the aid of a machine or device.
The relevant question is "What does copyright protect?"
> Copyright, a form of intellectual property law, protects original works of authorship including literary, dramatic, musical, and artistic works, such as poetry, novels, movies, songs, computer software, and architecture. Copyright does not protect facts, ideas, systems, or methods of operation, although it may protect the way these things are expressed. See Circular 1, Copyright Basics, section "What Works Are Protected."
You need to argue that such a database is an original work, and not merely an uncreative collection of facts. I would at least consider simple patterns uncreative facts, but complex patterns might be considered copyrightable original works.
Both of your claims are incorrect, at least in the US. A work must pass a certain level of creative expression to be eligible for copyright, and collections of plain facts, e.g. a phone book, are famously not copyrightable.
This was handled quite poorly -- a different course of action could have avoided all the chaos while also resolving the GPL violation.
As the GPL FAQ states:
> If a programming language interpreter has a license that is incompatible with the GPL, can I run GPL-covered programs on it? (#InterpreterIncompat)
> When the interpreter just interprets a language, the answer is yes. The interpreted program, to the interpreter, is just data; the GPL doesn't restrict what tools you process the program with.
In mimemagic's case, similar logic could apply:
* mimemagic could redistribute the GPL licensed freedesktop.org.xml file. This redistributed file would retain the original GPL license and its terms.
* mimemagic could then read the freedesktop.org.xml file at run time and generate whatever data structures it needs. mimemagic would continue to be MIT licensed without violating the GPL license.
The problem is that mimemagic includes Ruby code generated from the GPL licensed XML file, and it could be argued that this makes part of mimemagic a derivative of a GPL licensed work. They just needed to stop doing that.
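As a rough sketch of the "read the XML at run time" idea (not how mimemagic or any fork actually does it), the table can be built in memory when the library loads instead of shipping generated Ruby. `SAMPLE_XML` is a tiny hand-written stand-in for freedesktop.org.xml, and `build_extension_table` is a hypothetical helper name.

```ruby
require 'rexml/document'

# Hand-written stand-in for freedesktop.org.xml (illustration only).
SAMPLE_XML = <<~XML
  <mime-info xmlns="http://www.freedesktop.org/standards/shared-mime-info">
    <mime-type type="image/png">
      <glob pattern="*.png"/>
    </mime-type>
    <mime-type type="text/plain">
      <glob pattern="*.txt"/>
      <glob pattern="*.asc"/>
    </mime-type>
  </mime-info>
XML

# Build an extension => MIME type hash from the XML text at run time.
def build_extension_table(xml_text)
  table = {}
  doc = REXML::Document.new(xml_text)
  doc.root.each_element do |mime_type|
    next unless mime_type.name == 'mime-type'

    mime_type.each_element do |child|
      next unless child.name == 'glob'

      pattern = child.attributes['pattern']
      # This sketch only handles the simple "*.ext" glob form.
      next unless pattern && pattern.start_with?('*.')

      table[pattern[2..-1]] = mime_type.attributes['type']
    end
  end
  table
end

table = build_extension_table(SAMPLE_XML)
puts table['png']  # image/png
puts table['asc']  # text/plain
```

Whether this arrangement actually avoids the derivative-work question is the legal point being argued in this thread; the code only shows that nothing generated from the file needs to be checked into the gem.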
Of course I can't point this out to the repository owner now that the repo has been archived and thus commenting is now disabled.
That approach is roughly being taken in this fork: https://github.com/jellybob/mimemagic/issues/1
With the difference that the gem will by default download the XML file at runtime, with the option of using a local copy specified by an environment variable. I guess they are operating under the belief that including any GPL file taints the library, or perhaps they're just playing it safe.
Yes, this approach can work, unless the system which Rails/mimemagic is deployed to has restricted network settings. Under a restricted network setting, I believe the only solution which will work is to use a different package offering similar behavior, under a non-GPL license, or to re-implement the existing behavior using the freedesktop.org.xml as an input/output specification, rather than a source for derived code.
Presumably one could download the xml as part of the gem installation process (e.g. using mkmf as if it were a C extension, but there are probably simpler ways), so doing a local install at container build time would store the data with the gems in the Docker image (or local bundle if not using Docker).
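For the "local copy specified by an environment variable" part, a minimal sketch could look like the following. The variable name `MIME_DATABASE_PATH`, the helper `locate_mime_database` and the candidate paths are assumptions for illustration (the fork may use different ones), and the network download step is deliberately left out.

```ruby
# Hypothetical environment variable name; an assumption, not the fork's API.
MIME_DB_ENV_VAR = 'MIME_DATABASE_PATH'

# Typical install locations for shared-mime-info; adjust for your platform.
DEFAULT_MIME_DB_CANDIDATES = [
  '/usr/share/mime/packages/freedesktop.org.xml',
  '/usr/local/share/mime/packages/freedesktop.org.xml'
].freeze

# Return the first existing database file: the explicit override first,
# then the well-known system paths. Raises if none is found.
def locate_mime_database(env = ENV, candidates = DEFAULT_MIME_DB_CANDIDATES)
  paths = [env[MIME_DB_ENV_VAR], *candidates].compact.reject(&:empty?)
  found = paths.find { |path| File.file?(path) }
  found or raise "no MIME database found; set #{MIME_DB_ENV_VAR} or install shared-mime-info"
end
```

Failing loudly at load time seems preferable on a restricted network, since a silent fallback to downloading would just hang or time out there.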
But a mime database seems an awful lot like an uncopyrightable list of facts.
Downloading and using the XML file as the source for some transformation would produce what is considered a derivative work, which would thereby be covered under the GPL. By accessing the file contents through an "abstract interface" this limitation could be avoided.
As I understand it, because the XML file contains instructions with regard to how and where to read files for the purpose of discovering their MIME type, those instructions are copyrightable. I could be wrong, as I am not a lawyer.
It's interesting how many commenters on the various issues around the license change seem to think that software licensing is an inconvenience, rather than a serious legal question.
Well, when viewed from the perspective of legal realism, a lot of software licensing is a joke.
I'm an open source developer - but even if Oracle had violated my license terms and I had indisputable proof of it, I wouldn't take them to court.
Arguing about the differences between GPL3 and WTFPL in a hypothetical court case is about as meaningful and productive as arguing about the differences between a chainsaw and a katana in a hypothetical zombie apocalypse.
>I'm an open source developer - but even if Oracle had violated my license terms and I had indisputable proof of it, I wouldn't take them to court.
Why do you use a license with those terms, then?
Court cases over license violations are not hypothetical. Perhaps your stance is that licenses are frivolous, but there are plenty of people in software who don't share it. And those people, given "indisputable proof" of a significant license violation, would happily take you to court (with FSF support) to force compliance, if necessary.
No, but situations where it would make sense for me to pursue a court case over a license violation are hypothetical.
Look at Oracle vs Google - Multibillion dollar companies, getting advice from the absolute top legal experts, yet they still can't agree on what is and isn't allowed by law. And getting an answer for that has taken over a decade and an eyewatering amount of money.
Now imagine I'm a Finnish developer living in South Korea who released code under an American-written license, and a Russian company infringed on it.
It's inconceivable that I'd choose to take huge personal risk and expense, sacrificing years of my free time, pursuing litigation over something I was trying to give away for free anyway.
That's not to say people can't do this stuff if they enjoy it - by all means, collect some katanas if that's your idea of fun!
Why not just use a license that explicitly says you won't use legal recourse, like the Unlicense or 0BSD, ie. release it to the public domain?
Even if you use MIT, you are saying "I will litigate against you if you do not comply with my demand that you include this license when you use this software."
Saying "software licensing is a joke" and "hypothetical zombie apocalypse" may be provocative and/or funny, but it distracts attention from the underlying logic. In my view, when conversations start going down this path, they become less substantive and interesting, because the meaning becomes muddled.
I try to always remember:
* One person writes a comment one time. N people read it. N >> 1. Therefore strive to be clear.
* "Comments should get more thoughtful and substantive, not less, as a topic gets more divisive." https://news.ycombinator.com/newsguidelines.html
> I'm an open source developer - but even if Oracle had violated my license terms and I had indisputable proof of it, I wouldn't take them to court.
that's one of the good reasons to assign copyright to a larger entity (Apache foundation, FSF, or whatever): they'll fight to defend the license when it is violated with means you do not have.
So you're saying the legal system is stacked in favor of the large, so we shouldn't bother with laws at all?
If you broke Oracle's license terms they'll be suing you.
Large organizations often are risk averse with regards to legal matters. They don't want to be sued for misusing a license. The threat of litigation has a real effect, even if one particular individual is unlikely to bring a case.
Note that using a GPL dependency on servers is always allowed: "except executing it on a computer or modifying a private copy." Most Ruby on Rails projects are executed only on own servers. Smells like flamebait.
There are projects that redistribute the source code, such as Gitlab, and for those projects this is a significant problem.
How does yanking work for rubygems?
In Rust a yanked version can still be downloaded when compiling (you have a lock-file referencing it), but isn't chosen when adding it as a new (transitive) dependency to your application. So yanking shouldn't break any existing applications.
(Though since this is about a copyright violation, a DMCA notice against the package registry could result in a hard removal, and not just a yanked package.)
https://blog.rubygems.org/2015/04/13/permadelete-on-yank.htm...
Summary: Before 2015, yanking didn't delete anything, but you could contact support to have it removed. They ended up getting tons of support requests and therefore changed it to a permadelete.
A yanked gem won't be downloaded for a `bundle install` or anything of that sort. Aside from a record that it once existed it's basically gone.
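The difference between the two yank behaviours described above can be modelled in a few lines of Ruby. This is a toy model, not the real Cargo or Bundler resolver; `Release`, the two `resolve_*` helpers, the version constraint and the registry contents are all made up for illustration.

```ruby
require 'rubygems'

Release = Struct.new(:version, :yanked)

# Cargo-style, as described for Rust above: a yanked release is skipped for
# new resolutions but is still served to a lockfile that already names it.
def resolve_cargo_style(releases, requirement, locked: nil)
  if locked
    pinned = releases.find { |r| r.version == locked }
    return pinned.version if pinned
  end
  releases.reject(&:yanked).map(&:version).select { |v| requirement.satisfied_by?(v) }.max
end

# RubyGems-style since the 2015 change: a yanked release is gone, so a
# lockfile that names it cannot be satisfied (nil stands for "install fails").
def resolve_rubygems_style(releases, requirement, locked: nil)
  available = releases.reject(&:yanked).map(&:version)
  return (available.include?(locked) ? locked : nil) if locked

  available.select { |v| requirement.satisfied_by?(v) }.max
end

# Illustrative registry state: one yanked release, one replacement.
YANK_DEMO_REGISTRY = [
  Release.new(Gem::Version.new('0.3.5'), true),
  Release.new(Gem::Version.new('0.3.6'), false)
].freeze
YANK_DEMO_REQUIREMENT = Gem::Requirement.new('~> 0.3.2')
YANK_DEMO_LOCKED = Gem::Version.new('0.3.5')

puts resolve_cargo_style(YANK_DEMO_REGISTRY, YANK_DEMO_REQUIREMENT, locked: YANK_DEMO_LOCKED)            # 0.3.5
puts resolve_rubygems_style(YANK_DEMO_REGISTRY, YANK_DEMO_REQUIREMENT, locked: YANK_DEMO_LOCKED).inspect # nil
puts resolve_rubygems_style(YANK_DEMO_REGISTRY, YANK_DEMO_REQUIREMENT)                                   # 0.3.6
```

In the second model an existing lockfile breaks until someone re-resolves, which is the situation the replies below describe.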
I'm kind of surprised that nobody is talking more about this right now.
Everyone with a Gemfile.lock that does a `bundle install` as part of autoscaling (without having vendored gems or a rubygems mirror which doesn't obey yanks) is now broken, potentially in production.
This is true, and important, but:
You should never depend on GitHub or RubyGems for deployments.
If your deployment failed today due to this gem yank, it has exposed a bug in your systems that you should fix.
EDIT: I should not speak in such absolutes. "Never" is a big word and clearly this does not apply in all cases! Depending on third-parties for deployments is a risk -- but might be tolerable, if a multi-hour outage would not be devastating.
I addressed that VERY specifically in my comment:
> (without having vendored gems or a rubygems mirror which doesn't obey yanks)
The problem is that the author of the gem just forced a firedrill down everyone's throats today. Doesn't matter if they wanted to or not.
And in prior incidents admins who have taken the precaution of setting up rubygems mirroring and thought they were being responsible were embarrassed to discover that the gem yank was propagated to their own mirror.
Which is a lack of testing, but again, those deficiencies happen, and this is really forcing a firedrill on everyone, without any notification. And the author who did the yank was likely completely unaware of the blast radius of what that action would entail.
What's the solution? Having a mirror/archive of some kind of the gems I use?
Something along those lines, yes. Mirror/archive/caching obviously requires setup and maintenance.
Vendoring gems works well if you (and coworkers) develop and deploy on the same platform.
A minimal approach might be to keep local copies/clones of all gems. If things blow up, you can always build and vendor any missing dependencies, and then redeploy. You'd need to keep a local environment available that matches your deployment env, for building native gems.
GitHub and RubyGems are very reliable, although of course not 100%. It's more common (but still rare!) that an individual gem owner will do something odd, or remove an artifact. Often, you can wait the issue out, or spend a few hours constructing a workaround.
But sometimes you cannot wait. And sometimes you don't get the chance to decide -- your deployed and running code will suddenly fail because an application in AWS or GCE needs to scale up with new instances, or your existing instances auto-update, or otherwise replace themselves.
If that would be a serious problem, it makes sense to invest time into reducing third-party deployment dependencies.
This just bit me.
The first thing that I noticed was that some people are not understanding the GPL. It's far more impactful to Rails than the vast majority of web applications built using Rails. The use of GPL'd files means that the gem itself has to be released under the GPL. Since the gem is now under the GPL, dependencies are also under the GPL. That would include Rails. However, even if Rails was under the GPL, organizations could still build closed-source web applications using Rails since network access is not distribution. That's the whole point of the AGPL.
However, it does raise a lot of questions about when someone is allowed to yank a gem (or any library, really). It's been a while since I took a deep dive, but I was under the general impression that there was some leeway around not breaking the world when rectifying license issues. I would think that releasing new versions under the correct license and giving everyone notice and time (30 days?) to update would be fine for most copyright holders. I'd suspect that most open source developers wouldn't want to break the world. The sudden yanking with no warning caused builds to fail everywhere.
The absolute worst thing, though, was that changing a license should not be a minor (or a major) version number increase. It should be a patch. The breaking was simply because Rails is pinned to 0.3.x, but the first release under the new license was 0.4.x. Fortunately, the author released a 0.3.6 patch with the correct license, so it's just a matter of a bundle update to get the latest version. But if he hadn't, Rails would have had to release a new version and anyone on legacy/unsupported Rails versions would be hosed if they had to rebuild and redeploy.
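The version arithmetic behind that can be checked with RubyGems' own classes. The exact constraint `~> 0.3.2` is an assumption for illustration (the comment only says "pinned to 0.3.x"), and `allowed_by_pin?` is a hypothetical helper name.

```ruby
require 'rubygems'

# Assumed pessimistic constraint: >= 0.3.2 and < 0.4.
ASSUMED_PIN = Gem::Requirement.new('~> 0.3.2')

def allowed_by_pin?(version)
  ASSUMED_PIN.satisfied_by?(Gem::Version.new(version))
end

puts allowed_by_pin?('0.3.6')  # true
puts allowed_by_pin?('0.4.0')  # false
```

So a 0.3.6 release can be picked up by a plain `bundle update`, while 0.4.0 is invisible to anything carrying that pin.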
This is a really good reason to stand up your own artifact repository and put all of your third-party dependencies in it, especially if you're a business.
> The absolute worst thing, though, was that changing a license should not be a minor (or a major) version number increase.
The license didn't change. It was always already GPL, due to the usage of GPL-licensed code, regardless of what the metadata said. The change just made the metadata correctly reflect reality.
[EDIT: I should clarify that technically mimemagic wasn't already GPL, but the only legal way to use it was by satisfying your obligations under the GPL, making it effectively GPL. The author did relicense his own code to be GPL instead of MIT.]
To me it seems like making your downstreams aware of that ASAP is pretty important, since this has important legal implications for them as well. Yanking the old versions and releasing an update with an incompatible version number is a way to do that, albeit one that's quite disruptive.
Yeah. That's a better way of putting it. The author didn't opt to change the license. He corrected a licensing error.
I do agree that making the downstream users aware is important, I just don't agree that immediately yanking is the right solution. Putting out a new version would have been nice. Adding a post-install message to the new versions would have been good to start to get the word out. Not sure how far to take it, but opening issues with dependencies (RubyGems provides this information) would have also been nice, giving the major dependencies a good notice before yanking.
After the "left-pad" fiasco, and a similar event on the Ruby side, I started vendoring my dependencies as standard practice. I have not been sorry yet, in fact I feel vindicated in that approach.
Whoever has missed the event: https://www.theregister.com/2016/03/23/npm_left_pad_chaos/
Vendoring in ruby land is a double edged sword. It is much safer as you said. However if you _do_ vendor, be sure to be running containerized first. Otherwise you will be in a very frustrating spot of having to handle all sorts of native gem issues when trying to run on various computers during dev/test/prod.
Yes this is a real problem. We primarily use docker which solves the issue, but there are people that hate docker and want to run native. For the mac users that doesn't go too well.
I've lost countless work days to figuring out gem build issues on mac when everyone else on the team was running on linux/vagrant.
Vendoring is a good first step, too. As long as you have a local copy of all the dependencies, you're better off than needing to go pull them from the Internet every time you want them and risk having them gone. Potentially worse is having the same version but with modifications.
we get a form of this with our two-stage image building process -- the first stage installs all dependencies and we only update it when dependencies change
> The use of GPL'd files means that the gem itself has to be released under the GPL. Since the gem is now under the GPL, dependencies are also under the GPL.
No, that's not true. You can dual-license dependent software under GPL and MIT. The GPL merely requires a license at least as permissive as it.
> The GPL merely requires a license at least as permissive as it.
No, it requires a license that's at least as permissive as it AND that imposes the same obligations (i.e. source distribution, etc.) on the licensee.
Dual-licensing dependent software under the GPL and MIT only ensures that you can rip out the GPL dependency, and then use the (formerly) dependent software under MIT. The whole package is still GPL and imposes the same obligations on derivatives of the package.
Yes, that's what I'm saying.
You can dual-license if you own the full copyright ownership but if you include GPLed stuff (and don't have the full copyright ownership) you'll have to GPL the result.
As for "at least as permissive": it allows no further restrictions, but it adds a bunch of restrictions itself. And there's no other license that doesn't add restrictions. MIT requires you to reproduce the MIT license, which is an extra restriction. The FSF attempts to excuse these restrictions under the "attribution" clause of the GPL, but it is not clear to me that this is valid, and it has not been tested by any court.
I am fairly sure MIT's license is considered an "appropriate legal notice."
It's like left-pad all over again.
I wonder how much software will be unbuildable in 10 years time, due to dependencies that can no longer be downloaded. Is there an archive.org for packages?
At least this dependency makes sense: mime type parsing is nontrivial and something you'd logically want to leverage a library for. I can't comprehend how somebody could ever have said "I need left padding. I wonder if there's a library for that somewhere?"
This is why I commit vendor directories.
I don't mind if CI ignores it but it's nice to have a fallback that ensures the project is buildable at all times.
I really do wonder about the long term sustainability of package systems. The oldest business software, think COBOL, still works because it can still run the way it did when it was created. Will I be able to say the same about my software in 50+ years?
Always vendor software instead of relying on public repositories.
TL;DR
This unfortunate chain of events is rooted in licensing violation: https://github.com/minad/mimemagic/issues/97
Mimemagic got its MIME tables source generated from the `freedesktop.org.xml` file, which is licensed under GPL2, and the resulting source was released under the permissive MIT license. All mimemagic versions prior to 0.3.6 violated the GPL2 license.
The author of mimemagic couldn't change the pre-0.3.6 versions so they simply deleted them.
Unfortunately "the fix" has broken the dependent projects, which now have to either:
1) upgrade to GPL2 compatible mimemagic 0.3.6 or 0.4.0, which conflicts with MIT licensed projects like Rails or
2) build/use another MIME resolving library which has a permissive license or
3) fork mimemagic under MIT and implement dynamic loading of `freedesktop.org.xml` which wouldn't violate the license.
Since the xml file is not included in the source, and was just a reference for a rb source file's lookup table, it just feels weird that 3 fixes the violation.
I think the pedantic interpretation of the GPL “depends on” clause is that burning a content-hash of a GPLed release of a work into your work, such that your work retrieves and installs the GPLed work-release by its content-hash (or retrieves the work-release by name + version and then verifies it by content hash, as a Bundler Gemfile.lock does), is “depending on” the GPLed release of the upstream work. Due to the explicitness of the reference, the only release that the downstream project could be depending on, is a GPLed release. (Remember, GPLed code releases need to embed the GPL license somewhere, so there’s no possibility of a byte-for-byte identical dep being created by coincidence that isn’t GPLed.)
Meanwhile, just saying “I’ll take whatever is in the environment at [path]” is a more plugin-like approach: a GPLed database could be placed there, but a differently-licensed database could be there instead. Because you’re not making any explicit reference to any particular release of any particular work, you aren’t infected by the copyright/licensing of the particular work/release that happens to be there.
It’s a lot like the case-law of the DMCA’s “tool used for breaking copyright” clause: if the tool has features that exclusively help to break copyright, with no other uses, then it’s in violation of the DMCA; while if all features of the tool have other potential use-cases, then it doesn’t.
In both cases, it’s a question of whether there’s a “reasonable doubt” on what exactly the project was aiming to achieve / link against. If the project is explicit and removes all doubt, then it’s in violation.
Then you add a step for adding the content hash as an environment variable in the installation instructions, and include the actual content hash as an _example_ (wink wink) in the documentation.
I don't think Bundler/Bundler-like project lockfiles work that way; lockfiles generally need to be static artifacts checked into source control, so that their transitive dependencies can be resolved and the whole tree of dependencies can be inter-constrained. Swapping the dep out for a different one would require you to re-lock everything.
And, even taking one step back and not having a lockfile, and instead using the dependency version-constraints spec file (the Gemfile) — constraint specs are still generally not really dynamic formats. "Runtime", for them, is compile-time; and usually you can't execute code in them, because the runtime needs to load and resolve them before any of your library's code gets to run. If your app depends on the Rails gem, Rails doesn't get any opportunity during dependency-resolution to run code that decides what its transitive dependencies will be.
(One exception to this general rule in package ecosystems, is Python, due to the existence of setup.py files. This exception is why `pipenv install` in a large Python project takes upwards of 15 minutes: nothing can be parallelized — or ever truly locked down to a specific version — because each dependency gets to run arbitrary, not-guaranteed-deterministic code during installation to decide what its own transitive dependencies will be.)
You could probably create some sort of shim library that dynamically downloads your actual library at runtime — first rewriting its transitive deps, and then loading it as a dep through low-level use of the runtime packaging machinery... but at that point it's a lot easier to just actually load the database itself by dynamic reference.
IANAL but it explicitly states it is derivative from the XML file: https://github.com/minad/mimemagic/blob/master/lib/mimemagic...
Looks like there is a proposal to completely replace the gem: https://github.com/rails/rails/pull/41751/files
Is there any precedent to what happens, or could happen, if a project changes licence like this in a patch release? Is there any provision for mistakes like this in the GPL, or is everything that has ever used this package now considered "fair game" for classing as GPL and making source requests?
(although I imagine rails being a web framework probably protects anything using rails and only serving the end results publicly, this sounds like the sort of nightmare scenario that would make legal departments nervous about open source)
GPL licensing of derived works is not automatic. Instead, distributing under incompatible terms is copyright infringement.
It may be possible to remedy this infringement by releasing the source code under the GPL, but it also may not (e.g. source code contains un-relicenceable code from a third-party), in which case the only remedy is to not distribute the program at all.
Ah, right, so it doesn't make it automatically GPL2 unless they want to continue to distribute it - and presumably only the original GPL2 licence-holder(s) are in the position to raise the issue of past infringement-via-distribution.
And so presumably, unless Rails was actively distributing bundles with it, they would not be counted as "distributing" this GPL dependency.
So it sounds like the saving grace here is exactly the sort of hole that the AGPL is designed to close?
Somewhat, but the thing is, even if they sue, as long as the offender (Rails? a Rails user?) stops distributing the software until they have replaced the code in question, the amount of damages you can sue for can be surprisingly small.
If it were an "essential" component like, e.g., Linux, it would be a bit of a different matter. But as long as it's a non-essential, easy-to-replace library, the cost of suing might noticeably outweigh any money you can get out of it.
This is not a given thing, but at least it's not unlikely as far as I can tell.
IMHO using GPL for anything but full blown Applications, System Components or very large/complex/tricky libraries is kinda pointless.
And for them GPL isn't good enough in the current ecosystem, so you might need to go with AGPL or SSPL, both of which are noticeably less liked than the GPL by many.
IANAL, so take this with a grain of salt, but legal action is very rare under the GPL, and it's also expensive. I also think this would be a tough case to win. I wouldn't worry about it, at least not currently.
I expect this whole discussion could have IANAL prepended to it - but it seems like an interesting question - what the licence strictly implies, beyond "It probably doesn't matter as probably nobody will take legal action against you".
Or is lack of enforcement, in the end case, the only thing that matters - making most discussions about open licences and in-depth consideration meaningless?
If Rails is now considered GPL because of this dependency, does this mean that GitHub Enterprise is now GPL?
Rails isn't considered GPL because of this dependency. It is in violation of the GPL¹, which is copyright infringement. Releasing the violating software under the GPL is one way to stop that infringement, but that's not an automatic legal mechanism.
If a copyright owner decides to pursue a GPL violation, they could get damages² and enforce that the infringement stops (i.e. cease using the GPL-licensed software). It's incredibly unlikely any judge would force anybody to release source code.
¹ Actually, Rails itself isn't even in violation, because the project satisfies all the obligations the GPL imposes. GitHub would be in violation.
² In this case, where infringement wasn't intentional, they'd probably get almost nothing provided that the defendant stopped infringing when they learned of it.
> rails is now considered as GPL
no
> mean that GitHub Enterprise is now GPL?
even less so
---
Rails was in a license violating situation, which doesn't make it GPL at all.
Then the outcome of a legal case against someone who is knowingly using Rails, which unknowingly pulls in a GPL-licensed dependency, might be less clear-cut than you might think.
Lastly, depending on the version of the GPL and other factors, such as interpretations that are not clear-cut, you might be able to argue that a company building a service using Rails wouldn't need to make the service GPL even if they use GPL software to do so (if that GPL software is in the backend only, not if it's in the UI). The reason is that the service is not distributed by them; it stays internal even though it is communicating with a website (HTML, CSS, JS; not server-side rendering) which was distributed to the user.
The reason I mentioned GitHub Enterprise instead of the GitHub website is that the former is not a service. It is software distributed to the end user.
Based on your comments, it seems that the existing releases of GitHub Enterprise are in violation of the GPL due to the transitive dependency.
I guess, yes.
But then an interesting question is how transitive copyright violations apply (because this is what a GPL violation legally is: you lose the license to use it, nothing more and nothing less).
The reason I'm wondering about this is that the situation here is similar to a car manufacturer buying, let's say, a seat to be put into the car, where the seat producer used some, e.g., screws which violate copyright.
Would it be possible that the car manufacturer is held responsible for the copyright violation committed by the seat producer? Unlikely, I guess?
Would it still have some consequences? Surely, but likely negligible:
Violating the GPL doesn't make any code become GPL (a common misconception), and copyright infringement laws are often based on the monetary damage done by the infringement. And let's be honest: how much damage is done when the product is not sold, only given away for free, and has competition which is also given away for free with even fewer constraints?
Can someone please explain how it is possible to license a database of such sort in the first place? Pretty much all file types have some documentation on how to identify them by reading specific bytes, it's not like the folks from freedesktop invented those methods. On top of that, having the DB licensed under GPL would mean that every line of it is also under GPL, thus forcing the same GPL to all libraries out there that do even a simple PNG check using a magical byte check?
I'm really curious to understand how this licensing works.
Reminder that GPLv3 gives you 30 days to "cure the violation," while the GPLv2 Linus Torvalds prefers immediately creates a copyright violation.
Note that GPLv3 still immediately creates a copyright violation. It just states that if you cure the violation within 30 days, the license is reinstated. Under GPLv2 you forfeit your license immediately and in perpetuity if you infringe on it.
So, this gem uses the mime database provided by freedesktop.org when the gem could have got the database from http://www.iana.org/assignments/media-types/media-types.xhtm... which wouldn't be GPL? What manipulation is done by freedesktop.org?
The gem is basically a database of mime type, file extension, and magic bytes. The last two are not included in the linked iana database.
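To make "mime type, file extension, and magic bytes" concrete, here is a minimal sketch of that kind of lookup table. It is not mimemagic's actual data or API: the `SIGNATURES` table and both function names are hypothetical, and the few entries are well-known public file signatures rather than anything taken from the freedesktop.org database.

```python
from typing import Optional

# Hypothetical toy table: (MIME type, file extensions, magic bytes at offset 0).
# Entries are widely documented signatures, chosen only for illustration.
SIGNATURES = [
    ("image/png", ["png"], b"\x89PNG\r\n\x1a\n"),
    ("image/gif", ["gif"], b"GIF89a"),
    ("image/gif", ["gif"], b"GIF87a"),
    ("application/pdf", ["pdf"], b"%PDF-"),
]


def mime_by_magic(data: bytes) -> Optional[str]:
    """Return the MIME type whose magic bytes are a prefix of `data`, or None."""
    for mime, _exts, magic in SIGNATURES:
        if data.startswith(magic):
            return mime
    return None


def mime_by_extension(filename: str) -> Optional[str]:
    """Return the MIME type registered for the file's extension, or None."""
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    for mime, exts, _magic in SIGNATURES:
        if ext in exts:
            return mime
    return None
```

A real database is far larger and also matches at offsets other than zero, which is part of why the definitions linked elsewhere in this thread look more complicated.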
Where did freedesktop.org get the magic bytes? I assume (probably stupidly) that some of that has to be in a file command on some BSD.
Here's what the magic bytes look like in FreeBSD, as an example I've linked to the definition of PNG. You can see it's a fair bit more complicated but it does have mime and extension data.
https://github.com/freebsd/freebsd-src/blob/master/contrib/f...
vendor your dependencies people
Or you know... just cache them.
If your CI or deploys broke because of this, it basically means you're constantly re-installing all your dependencies from scratch, which is totally silly.
Configure your CI & other tools to cache the bundler directory between builds and not only you'll be protected from this, you'll also make your systems faster.
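One common way to do this is to key the cached directory on a hash of the lockfile, so the cache is reused until the dependencies actually change. A minimal sketch, assuming your CI lets you supply an arbitrary string as the cache key; the function name and the `bundle` prefix are hypothetical:

```python
import hashlib
from pathlib import Path


def bundler_cache_key(lockfile: Path, prefix: str = "bundle") -> str:
    """Derive a cache key that changes only when the lockfile's contents change."""
    digest = hashlib.sha256(lockfile.read_bytes()).hexdigest()
    # A shortened digest keeps the key readable; this is a cache key, not a security check.
    return f"{prefix}-{digest[:16]}"
```

With a key like this, a yanked gem only bites when the lockfile changes or the cache is evicted, which buys time rather than fixing anything.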
There can be other wrinkles. Builds of $dayjob's Rails app failed even though we had the gem itself cached locally all over, because entries for the BSD gem versions had vanished from the rubygems.org metadata, and we weren't caching that. There are multiple workarounds, but we may not be done debating them before rails-core decides on a replacement.
What's even more silly is implying that caching your dependencies is some kind of a fix here. So you'll be able to deploy for a few more days, then what?
Then the 0.3.6 version was released, and we could update because the license isn't much of a problem in this case.
Alternatively, we could have continued to deploy for a few more days until Rails core shipped a new version of Active Storage without that dependency.
It's effectively the same as vendoring, except we're not blowing up our repositories with 10 years of dependencies.
I should clarify... Vendoring is a terrible solution. Caching is a general good practice but not a solution.
All in all, this kind of thing happens once in a blue moon...
I don't claim it's a long-term solution, or that you should run off your cache for years.
But having that cache means you have the time to get a clean solution rather than having to rush something because nobody can work anymore and you can't deploy fixes to production either.
These yanking issues are really problematic, but if they interrupted your workflow, then there's something wrong with your build pipeline.
It allows you to deploy for a few more days. That fixes the "I cannot deploy" problem if my customer needs an urgent fix.
Sure, you will be just as much in violation of the GPL as without it (even if you don't deploy, you will violate the GPL), but that is another problem which needs to be addressed.
> That fixes the "I cannot deploy" problem if my customer needs an urgent fix.
Only if the issue is fixed upstream before your cache expires. Obviously, breaking Rails gets things fixed quickly, but what if it was a less-actively-maintained gem?
Please, no.
Vendoring gems solves lots of problems: GitHub/RubyGems outages, yanked gems, credentials sharing in CI/CD, and as a bonus, deployments are quicker.
Negative side effects: You need to update your vendor cache periodically, your repo increases in size, and native gems have problems if you develop on a different platform than you deploy on.
Outrageous move to just yank the gem and break builds everywhere.
It is not nice, but aside from hobbyists, everyone who seriously develops software caches all dependencies in their own repository, like Nexus etc.
It's similar to backups, if you don't have one your data must be worthless.
> aside from hobbyists everyone who seriously develops software caches all dependencies in a own repository like nexus etc.
This is a bold claim to make, and one that isn’t supported by my personal observations. Many ‘serious’ software developers have no such intermediate repository for their dependencies.
We did and so this didn't cause us a major issue today.
At my last job we had the same.
And the one before that.
This mitigation of a risk that affects business continuity is something that all senior level people need to take seriously at any company, small or large.
This is flamebait.
TLDR: the mimemagic gem was MIT licensed, but an issue was opened where it was reported that mimemagic is using a GPLv2 source file. Legally (IANAL) this forces mimemagic to become GPLv2. The mimemagic gem was changed to GPLv2.
However rails depends on mimemagic, and that means rails needs to be GPLv2, which is obviously a big problem. The discussion around this is taking place in the github repo for rails because mimemagic was archived for some reason (at least temporarily).
Looking through the linked comments & issues, it will be interesting to see how many people blindly adopt (or are forced into) the GPLv2 license. I wonder how big the R for spreading is? All from an XML file for mimetype info.
I give my boss a hard time about our dependency management system because it is relatively unknown[0], but licensing is built into it from the ground up. You can't import any dependency (no matter how buried) without assigning a license to it.
This lets us confidently know, via software, the open and closed source licenses in our code base.
Licensing is one of those out of band concerns that doesn't burn you until it does.
> You can't import any dependency (no matter how buried) without assigning a license to it.
That wouldn't help here. Mimemagic declared itself to be MIT, and only turned out to be GPL because it embedded a file derived from GPL sources. That file didn't even have a license header specifying it as GPL.
Anyone importing it would mark it as MIT.
EDIT: Mimemagic didn't even turn out to be GPL, it turned out to be infringing on the GPL, and the author solved that by relicensing it to GPL.
Fair point. I guess software couldn't have helped this issue.
That's a good idea generally, but it wouldn't have saved you from this issue. The gem had an MIT license, and the offending file was copied in, not sourced through a dependency.
Depends on the process. If assigning a license means that there is a review of the dependency before use, this is normally seen.
Gotcha, fair point. I should have read deeper; I only read the Rails issue, but should have dove into the mimemagic one. My bad.
You're correctly getting downvoted for your thinly-veiled advertisement because it's beside the point. The gem was labeled as "license: MIT" all the time, but that label was just factually wrong. Garbage in, garbage out.
People can of course downvote me for whatever reason they want. I do object to you calling this an advertisement. Sure, it links to my employer's website, but it is an OSS build library that addresses licensing issues.
Yes, GIGO applies, but what about the underlying Gem? Was the XML file relicensed? (Haven't dug into that issue 97 referenced in the GH thread.)
Edit: from sibling comments, looks like this build system wouldn't have caught the underlying issue, which was a GPL licensed, unmarked file copied in.
The product in question being open-source or not has nothing to do with whether it's advertising. I would have been okay with it if your product genuinely did something to solve the issue at hand, but as you admit, it does not, so all that's left is "this is a good time to mention my product".
The build tool I linked to would not have solved the issue. I should have read further into the GH repos to understand how the Gem was contaminated. That's my fault.
I have thoughts on the applicability of the build tool mentioned and its relation to licensing issues in general, but I feel I'm repeating myself. I'm not sure continuing this conversation will be productive, so I'm gonna stop.
How do you handle transitive dependencies?
When you are importing a jar file (this is a java build tool) you have to specify the provenance of every dependency, and any dependency of those dependencies.
It's a real pain, but the pain is all upfront.
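The rule described above (no dependency, however deeply buried, without an assigned license) can be sketched as a walk over the dependency graph. This is not the build tool's actual implementation; the function name and the shape of the `deps` and `licenses` mappings are assumptions made for illustration.

```python
from typing import Dict, List, Set


def unlicensed_dependencies(
    root: str,
    deps: Dict[str, List[str]],
    licenses: Dict[str, str],
) -> List[str]:
    """Return every package reachable from `root` that has no declared license."""
    seen: Set[str] = set()
    missing: List[str] = []
    stack = [root]
    while stack:
        pkg = stack.pop()
        if pkg in seen:
            continue  # already visited; also guards against cycles
        seen.add(pkg)
        if pkg not in licenses:
            missing.append(pkg)
        # Follow transitive dependencies; leaf packages have no entry in `deps`.
        stack.extend(deps.get(pkg, []))
    return sorted(missing)
```

Note that a check like this only verifies that a license was declared, not that the declaration is correct: a package labeled MIT that embeds an unmarked GPL file passes cleanly, which is the point made elsewhere in this thread.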
Technically everyone using Rails right now may be in violation of the GPL. It doesn't matter that the version of the gem being used claims to be MIT, that's not how licensing/copyright works.
Github Enterprise licensees could try hit-up GitHub for source code!
EDIT: License in question is GPL, not Affero GPL. So github.com is not covered. However, Github Enterprise is.
In all likelihood, Github wouldn't comply, as Github Enterprise licensees have no such license/clause in effect with Github. It'd then be down to the shared-mime-info's copyright holders to take the matter to court.
Would be an interesting court case.
Some people in the Github issue commented that the XML "database" in question could be used under fair use. That'd be the logical defense. There have been many court cases where the "copyrighted material" is a representation of facts, as opposed to a "creative work", and thus has not been eligible for copyright protection.
It's probably also worth noting that "ignorance" is rarely an accepted defense in court.
> everyone using Rails right now is in violation of the GPL.
Not if your use of Rails is limited to your own/your company's own servers, which I imagine most Rails users are. Please don't fall for the flamebait.
If they were using GPLv3, they would have an entire month (30 days) to cure the violation.
GitHub Enterprise is in violation as it is distributed to third-parties.
Thanks, edited for clarity. In this case it's GPLv2, so no "cure the violation" clause.
However, even if this were GPLv3, and you were to "cure the violation" that only reinstates your license i.e. the GPLv3 license. Replacing a dependency won't make previously infringing releases any less infringing.