Settings

Theme

MySQL Bug #11472: Triggers not executed following foreign key updates/deletes

bugs.mysql.com

102 points by gyosko 11 years ago · 57 comments

Reader

pilif 11 years ago

if you rely on triggers for data integrity, then I guess this is one more case where MySQL is letting you down and allowing your data to get corrupted unless you're really careful.

While it's not always good design to use triggers for this, sometimes, it's a valid reason to use them for integrity checking or enforcing. Having `on delete` triggers not run for some delete's is violating the principle of least surprise.

When feature break this way, people start to distrust them and best practices get adopted that discourage using them, killing the features all together.

Using a database that does not have this misfeature, both triggers and foreign keys are perfectly safe to use, but because the one database that is the most widely used has issues like the one described here, you often hear the recommendation to not use triggers or even foreign keys.

Yes, you can potentially move the logic normally contained in either of them into the application, but as you deal with more concurrency and as you start accumulating bugs, sooner or later your data will be messy and you will need to add cleanup-scripts, or, heaven forbid, work around messy data in your application logic (don't. do. that. it's the path to the dark side of unmaintainability).

Or, of course, use a database system whose features work correctly and the principle of least surprise is in effect.

PostgreSQL is one of these, by the way.

  • SFjulie1 11 years ago

    Apple uses MVC with sqlite for the contacts. The view is cocoa, but every events int the GUI are calling triggers in sqlite.

    It makes apple's application simple stupid and efficient.

    And they have triggers on foreign keys... with sqlite.

    ...

    It is not postgres in one of it, it is much more mysql is not a correct rdbms. None of the other databases have this flaw.

    Mysql is to RDBMS what mongo is to NoSQL, nowhere short of achieving anything that matters correctly.

    EDIT: or better: Mysql is a realistic in the Hollywood way implementation of a RDBMS

    • spotman 11 years ago

      mysql is good for very high transaction rate platforms, that mostly rely on simple features. for this, it generally outperforms most of its competition in the traditional RDBMS space, and why companies like twitter and facebook use it. mysql likes simple, well designed queries.

      postgres usually outperforms mysql for more complex operations, but is not as fast as mysql for simple ones.

      sqlite is really only appropriate for things like mobile or desktop applications or things with generally low concurrency requirements.

      i would be careful throwing the baby out with the bath water in your dismissal of mysql, but thats just my opinion:)

      can you shed some more detail on that apple uses triggers with sqlite? that is interesting to me, and would like to learn more.

      • SFjulie1 10 years ago

        For apple: jailbreak go to home dir apt-get install sqllitex.y apt-get install bash find list all the files / dir find the contacts read the man of sqllite and have it tell gracefully the content of the table Something like this: http://beauty-of-imagination.blogspot.ca/2014/03/backuping-m...

        Well it is true mysql is fast ... when it is not a problem.

        They sacrificed correctness (stuff about big O notation in worst case) for benchmarks... and it works ... amazingly well as long as you are not under heavy load or give up on integrity, or relationship, or correctness.

        Amazing. As long as you store data that need not to be transactional, or relational, or needing integrity mysql is fast. When data matters, or load is heavy mysql is just not there. It is chaotic, inconsistent, unicode retarded... full of pits... and resource greedy.

        Sure, if you are a masochist, or if you fear to loose your job and need to elevate pain to the rank of art, then mysql is alright.

        I have been sysadmin as long as developer and on both sides of the track mysql is insanely not consistent. (sysadmin: 400 config parameters, replication that fear any butterfly flapping its wings near a network cable, dev: collation behaviour, (clumsy) LDAP integration for auth, inconsistencies...).

        I even prefer MSsql to mysql. I have been working almost only in linux environment since 2000, and I still think MySQL is crap. 10 years of horror stories with mysql made me hate it as much as taking LSD before trying to fill your taxes.

        • spotman 10 years ago

          There is no SQL triggers in the article you post here. Confused at what your trying to say here. Maybe you can elaborate?

          • SFjulie1 10 years ago

            well if you ask politely sqlite to show the tables and paginates, then you have the trigger.

            For a short vademecum I was not going to add noise in the post.

      • contingencies 11 years ago

        sqlite is really only appropriate for things like mobile or desktop applications or things with generally low concurrency requirements.

        I think what you mean here is low concurrent write requirements.

        (Edit in response to below: Isn't sqlite file oriented and read-only opens eschew locking? Thus replication via network filesystems should be adequate, no?)

        • spotman 11 years ago

          Yes, agreed, if we are talking about straight concurrency only.

          Other things like replication, are also not there, so it's not what I would choose for a website database, but if you can serve all your reads from one server, and don't care about high availability, then go for it.

  • kailuowang 11 years ago

    > Yes, you can potentially move the logic normally contained in either of them into the application, but as you deal with more concurrency and as you start accumulating bugs, sooner or later your data will be messy and you will need to add cleanup-scripts

    I agree with most of what you said, but in many many use cases data integrity can be maintained in application code (a statically typed system helps), it simply can't be true that all those applications running on NoSql database will have messy data and/or cleanup-scripts.

    • spotman 11 years ago

      furthermore, foreign keys are not performant at mass scale on both pg or mysql or any traditional RDBMS. if your trying to do several thousand ops per second, the vector of index contention greatly goes up if there is FKs littered all over your schema.

      nosql or sql large platforms almost always have some amount of work always ongoing to clean, shuttle, and maintain data integrity.

      that doesn't make me not a fan of them; for lower transaction rate applications ( which most probably are ), they are a small time investment to have some sanity checking. but it's a right tool for the job thing.

      this is yet another reason why people dislike FKs in best practice stuff.

      Edit: Hi down-voters. Curious to hear why I'm being down voted and possibly be offer a rebuttal?

      • ak39 11 years ago

        Foreign keys are not convenience artifacts. They are essential for enforcing data correctness. One doesn't simply opt out of their usage simply because one is "not a fan" of foreign keys.

        • delinka 11 years ago

          I've experienced several highly-performant, correct databases with concurrent clients from around the globe ... and they had no enforcement of foreign keys.

          Yes, foreign keys were essential to data correctness. No, they were not being enforced in production. Stew on that just a bit. Not enforced in production.

          During client application development, development databases enforced the constraint. Any errors resulting in foreign key mismatches disqualified the client app for release. The applications were required to be aware of the constraint and to learn to work within it. Production gained the speed benefits of not having to enforce the foreign keys.

          You can indeed eat your cake and have it, too, iff you are disciplined.

          • ThinkBeat 11 years ago

            This might be the case if you have one and only one application interacting with the database, and your unit tests include each and every permutation/edge case.

            It is very common to have many applications, from different vendors and from different time periods interacting with a database.

            It often also happens that data can be modified directly in the database, not via the application. I agree this is not best practice but in larger enterprises it happens.

            Once you start implementing foreign keys in the client you are adding a lot of complexity to a problem that most mature relational database system do well enough to be trusted by a lot of traditional financial institutions.

            • delinka 11 years ago

              These kinds of access are not proper discipline. And a lack of proper discipline is the reason to enforce constraints in production. As for "adding complexity to client apps" - how could you possibly create an app to interact with a database and not know about the structure of the database? This is beyond irresponsible. At the least, should an institution desire to allow third-party access to their database, then that access should be provided through a gateway with the intelligence to enforce constraints without relying on the database server.

            • needusername 10 years ago

              > It often also happens that data can be modified directly in the database, not via the application.

              Yup, 3rd level support "winging it" in production.

              And also migration scripts.

          • contingencies 11 years ago

            Great tale! Much respect to the development and operations teams here. Out of interest, what kind of purpose was the database?

            • delinka 11 years ago

              It was in a financial institution. It held every transaction in the financial enterprise.

          • spotman 11 years ago

            Nicely put. 100% agreed.

        • spotman 11 years ago

          What if it is the difference between your schema working under high transaction rates or not? Or, the difference between buying much more database hardware and having smaller shards to avoid index contention. (which, is $$)

          Facebooks mysql architecture (at least used to, likely still is) based on this: https://backchannel.org/blog/friendfeed-schemaless-mysql

          Note the simple schema, and lack of FKs.

          FKs are rarely present in extremely high transactions per second systems that operate on the same few tables. This is due to index contention and locking.

          I say this with lots of experience working on different production systems that see hundreds of thousands of transactions per second. I have yet to see one arrive at these kind of numbers using foreign keys, unless it's something like a giant shared hosting platform, that is operating on hundreds and hundreds of different tables. (therefore, less index contention)

          Often, in these shops, data correctness is not validated in realtime in this way, but often in a way that is not in the critical path of answering queries. (more like, eventually consistent). Some places have entire teams for this.

          To your point on correctness, my experience ranges a lot in areas where correctness can be eventually consistent. For something like healthcare or banking, you would rather spend the money on way more hardware, because you can't afford correctness to be off. However, with things like globally scaled social apps, this is just not the case.

          Most people tend to still use Oracle in those situations. Which is legions slower , and legions more expensive than mysql or pgsql.

          • ak39 11 years ago

            >To your point on correctness, my experience ranges a > lot in areas where correctness can be eventually consistent.

            "Eventually consistent"? Explain. (Do you mean in an ACID transaction?)

            • spotman 11 years ago

              Let me explain better. As eventually consistent is probably the wrong term for what I mean.

              In things like a large social media app where the reality is no FKs, and 100k queries-per-second, for instance, you would write into your application logic how to deal with child rows and parent rows, and not rely on FKs to raise an exception, or perform cascading deletes and things.

              So if ( when ) there is either a bug, hiccup, or a variety of other reasons that can as you mention make things question correctness (or specifically: referential integrity, if we are going to nitpick), these are usually cleaned up out of band.

              So you may have a table you have to slowly iterate through later and remove rows in which their parent rows no longer exist anymore, as an example.

              In something like the friend feed schema used in the previous example, read up on how that works, as you will see they sort of turn SQL on its head a bit, to make it more flexible and deal with these shortcomings.

              • Pamar 11 years ago

                On the other hand, if you deal with money, or physical products (like a car) or anything slightly more "important" than your "likes" count on a social site, eventual consistency would get you laughed out of the room pretty quickly.

                • contingencies 11 years ago

                  In fact a lot of financial stuff basically works on eventual consistency, they just don't call it that, wear suits and carry serious expressions. What is double-entry bookkeeping? What is an audit? What is end-of-day? What is end-of-financial-year? What is invoice due date? What is multi-currency accounting? What is a market order on an exchange before you learn what it was sold for? What is a credit card transaction that can be disputed? What is a tax return? ... etc.

                • spotman 11 years ago

                  Agreed overall, see my comment above which addresses this. These types of systems typically will never see as many transactions/sec as their “likes” counterparts, for obvious reasons.

                  Having said that, I have worked on some credit card infrastructure that does get up there, and it too, does not have FKs. Having or not having FKs does not magically mean your data will be accurate or in-accurate. ACID, as well as traditional database transactions, still apply, and keep your data integrity high. In conjunction with well designed application logic, and proper planning, FKs are not needed to maintain accurate data whatsoever.

                  Also, there is commerce related things (Amazon) that work on truly eventually consistent data stores, like DynamoDB. Don’t see amazon getting laughed at much. (they use it for their shopping cart, source: http://www.allthingsdistributed.com/2007/10/amazons_dynamo.h... , this paper is also a great read fwiw)

                  Anyways, I think we are getting a bit into the weeds here, as the OPs link is about a mysql bug. Cheers.

                  • jeltz 11 years ago

                    Well, the shopping carts do not contain data which must be correct so your example does not contradict Pamar.

                    • spotman 11 years ago

                      I agree with you in spirit, but it's simply inaccurate to suggest that a shopping cart does not deal with mission critical data.

                      the fact is, Amazon does use Dynamo for a mission critical financial purpose. If Dynamo lost data Amazon would not use it for arguably one of the key pieces of their success.

                      the article I linked covers how they do this, and how it stays accurate, scalable, and robust.

                      anyways cheers, I hope we have not hijacked this thread about MySQL bugs too much.

                      • Pamar 10 years ago

                        I am afraid we are comparing apples to oranges.

                        First of all, any single shopping cart is typically accessed by one single process at a time: the user's browser. I can of course open two or more browser sessions, but this is not a typical use case, it definitely has an upper limit (how many browser tabs can I use concurrently?) and most importantly, lost or duplicated transactions are easily spotted and corrected (you do get plenty of chances to revise your order before paying).

                        A bank account can be "hit" by different sources at the same time (just think of a company bank account, not just a personal one) and most importantly, duplicated or lost transactions are much harder to prove and rectify.

                        • spotman 10 years ago

                          The browser tabs have an upper limit sure, because on the client/browser side, it just might be unfeasible to open more than 100 tabs realistically.

                          But, I can assure you, you could open 1000 tabs, and dynamo will be fine, and not lose your data.

                          You should really read the article I linked, especially the part about vector clocks, it discusses a common approach for dealing with duplicate data, conflicts, and resolution, at scale.

      • Jweb_Guru 10 years ago

        I am afraid your premise is incorrect. Foreign keys can be enforced (particularly on insert-only or insert-mostly workloads) with very little contention, even in a distributed system: http://www.bailis.org/papers/ramp-sigmod2014.pdf

        And lest you think there is some hidden constant factor here that destroys real performance at the cost of "scalability," the techniques from that paper were used to completely destroy the TPC-C benchmark: http://arxiv.org/pdf/1402.2237.pdf

        Admittedly this stuff is not yet implemented in traditional RDBMSes, but there is absolutely no reason why it could not be and I fully expect it to in the future. Certainly, if you are having performance problems now, this knowledge doesn't help. But I'd venture to guess (from personal experience) that most people removing foreign keys are doing it preemptively, rather than evaluating whether it is actually causing performance problems.

Twirrim 11 years ago

It would be really awesome if MySQL crew could spend a few months (half a year?) working on paper-cuts:

https://bugs.mysql.com/search.php?search_for=&status=Active&...

That's all the active bugs against MySQL. Note that this 10 year old bug isn't the oldest.

This is the oldest, from March 2003: https://bugs.mysql.com/bug.php?id=199 Note someone submitted a fix for that over a year ago (and verified under the OCA in November) but it still hasn't shipped. This is a bug that should never have taken 11 years to fix.

https://bugs.mysql.com/bug.php?id=3052 This one is great. ROLLBACK in a stored procedure doesn't close the cursor.

MySQL, for all its strengths, often feels like there is a bunch of typical geeks running the show. Lots of focus on the 'sexy' new features, not so much focus on keeping the lights on.

morgo 11 years ago

This is caused by a layering problem in MySQL itself:

- Triggers are at the SQL Layer

- Foreign keys are implemented natively by InnoDB at the storage engine layer

It is not as easy to fix as it sounds.

  • kstrauser 11 years ago

    It's probably not, but I'd rather not have the feature at all than think that it's there and working but actually not doing what I've asked it to. The current short-term fix would be to make it raise an error when you try to create such a trigger, so at least no one would be using it for now.

riffraff 11 years ago

Firefox's issue #106400 (osx keychain integration) will turn 14 this year, AppEngine's #3091 (servlet 3.0 support) has already turned 5.

I guess every project as a few "we'll fix it some day" tickets.

  • jessaustin 11 years ago

    Somehow those seem less important. Maybe it's because I cut my teeth on Sybase and Oracle rather than noSQL, but if triggers don't work, I would really hesitate to call that a RDBMS.

  • needusername 10 years ago

    I don't think it's fair to compare MySQL to GAE/J. MySQL is actively developed by a company, marked and sold with support.

xyby 11 years ago

The question is if one should "fix" this at all. If you change the behavior from "foreign keys do not activate triggers" to "foreign keys activate triggers", everyone who uses triggers will have to audit their applications for potential problems. And for large applications, that can be a lot of work.

MariaDB documents "foreign keys do not activate triggers" as the standard behavior:

https://mariadb.com/kb/en/mariadb/trigger-limitations/

Personally, I am grateful for software that changes as seldom as possible. I don't want to spend time on "updating" my application because something down in the stack changed.

  • LoSboccacc 11 years ago

    Well what does SQL standard say about that?

    Application can just not update, so there is no need for windows level of compatibility management

    Personally never jumped on the mysql train and can say I'm really glad for that.

mianos 11 years ago

Mysql has its uses. Keeping a list of people you don't know on a social network is a great use case. Recording cat video URLS is another. Keeping track of money is not a good use case for mysql. Simple as that. My favourite is this one is the correlated subquery bad query plan selection: http://bugs.mysql.com/bug.php?id=9090 "It's not a bug, it's a feature". You can always find a way to work around this stuff, or you can use postgres or another db that considers thi stuff important.

zzzeek 11 years ago

triggers...bah. how about CHECK constraints? (http://stackoverflow.com/a/2115641/34549) The irony is that the workaround for CHECK constraints being silently ignored is to use a trigger.

icelancer 11 years ago

Classic:

[30 Jun 2005 19:04] Dmitry Lenev We will fix this in 5.1

cremno 11 years ago

In a few days #20786 will have its 9th birthday.

https://www.youtube.com/watch?v=oAiVsbXVP6k

https://bugs.mysql.com/bug.php?id=20786

reisub 11 years ago

Does anyone know if this is fixed in MariaDB?

walrus 11 years ago

I don't understand some of the people commenting on the bug report. If they really want it fixed, then they should fix it. Making rude comments isn't going to make it go any faster.

  • nightski 11 years ago

    My guess is MySQL has quite a few users who are not technically capable of fixing a deep and involved bug in the heart of a database engine.

    • protomyth 11 years ago

      and I would imagine some of the people filing bugs are people who are paying for support given MySQL and, later, Oracle's business model

  • lucb1e 11 years ago

    As someone who had issues with a 4+ year old bug in FileZilla, can confirm. First time I developed anything serious in C++ and it took me only two evenings to develop a patch. Then some back and forth on the bugtracker, but it made it in soon after.

    (For those interested, it's the ability to create a new, empty file on the server.)

    • breakingcups 10 years ago

      Surely the two issues aren't of the same complexity.

      • lucb1e 10 years ago

        I don't know how the trigger system is structured under the hood, but I could imagine it being somewhat comparative if you can properly call them. At least one of the commenters could have tried something or asked for pointers where to find it, that triggers others to actually work on it or at least help them work on it.

  • infamouscow 11 years ago

    Exactly.

    One of the major reasons to pick open-source software over a proprietary software is you can fix bugs that the vendor or developers won't.

_lflx 11 years ago

See you all in another 10 years.

Keyboard Shortcuts

j
Next item
k
Previous item
o / Enter
Open selected item
?
Show this help
Esc
Close modal / clear selection