How TokuMX was Born

tokutek.com

13 points by dataviz 12 years ago · 7 comments

rogerbinns 12 years ago

I'm looking forward to when TokuMX is "ready", and especially hope it gives MongoDB the kick they deserve.

I did try TokuMX over a month ago and it was a dismal failure. It used considerably less space (good), imported data quicker (good) but failed at runtime after a few hours claiming issues with locking. Our code doesn't use locking and was running exactly what runs against MongoDB just fine.

  • zardosht 12 years ago

    Roger,

    I work at Tokutek (and wrote the post above). I'm sorry you ran into issues trying out TokuMX. I assure you, we are "ready", as we have users running in production.

    Nevertheless, you ran into problems and that is unfortunate. If you have details, can you please share them with the tokumx-user google group? We might be able to help. I suspect the transition to using a transactional system like TokuMX where entire statements are transactional is resulting in some "gotchas", but that is just an educated guess.

    -Zardosht

    • rogerbinns 12 years ago

      I mean ready in the sense that pointing code that worked flawlessly against MongoDB to TokuMX then just works flawlessly too.

      I uninstalled Toku and went back to MongoDB so I can't provide any further testing. (The mongorestore takes days.)

      I can tell you what code was running at the time. It reads events sorted by user id and timestamp, and then discovers session boundaries in that. A new session object (in a different collection) is written out with all the events as a subdocument list. (In rarer cases an existing session object is updated.) This was happening in 8 separate processes all in Python/pymongo. There are no statements running that affect more than one document, nor any need for transactions.
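
For readers following along, the workload described above can be sketched roughly as follows. This is an illustration only, not the poster's code: the comment does not say how session boundaries are detected, so the 30-minute inactivity gap, the field names `user_id` and `ts`, and the function name `build_sessions` are all assumptions. The sketch works on plain dicts and leaves the database reads and writes out.

```python
from datetime import datetime, timedelta
from itertools import groupby
from operator import itemgetter

# Assumption: a session ends after 30 minutes of inactivity.
# The thread does not state the real boundary rule.
SESSION_GAP = timedelta(minutes=30)


def build_sessions(events, gap=SESSION_GAP):
    """Group events into sessions.

    `events` must already be sorted by (user_id, ts), as in the thread.
    Each returned session embeds its events as a list, mirroring the
    "subdocument list" described in the comment.
    """
    sessions = []
    for user_id, user_events in groupby(events, key=itemgetter("user_id")):
        current = None
        for ev in user_events:
            # Start a new session for the first event of a user, or when
            # the gap since the previous event exceeds the threshold.
            if current is None or ev["ts"] - current["end"] > gap:
                current = {
                    "user_id": user_id,
                    "start": ev["ts"],
                    "end": ev["ts"],
                    "events": [],
                }
                sessions.append(current)
            current["events"].append(ev)
            current["end"] = ev["ts"]
    return sessions
```

Each resulting dict would then be written as one document to the sessions collection, which is why no statement touches more than one document.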

      • leif 12 years ago

        If you were using upserts I expect you were having problems due to the optimizer retrying all possible plans (including table scan) periodically. This is reflected in https://github.com/Tokutek/mongo/issues/796 and is fixed in 1.4.0. If you'd like to try another evaluation, get in touch with us and we can help you track down whatever problems you see.

        Not all mongodb code will optimally use tokumx without any changes. Concurrency is hard and mongodb encourages some patterns that are bad for any concurrent database. For example, count() for an entire collection is not, and could never be, as cheap in a concurrent database like tokumx as it is in mongodb.

        • rogerbinns 12 years ago

          Thanks for the offer, but the mongorestore times (against MongoDB) being over a week makes this too risky.

          The code making changes was insert (mostly) with a few upserts, but the latter were by _id. My hypothesis as to the cause is that tokumx adds implicit transactions and then there are some arbitrary restrictions around those transactions (e.g. how many outstanding at once, timeouts in lock acquisition) and after a few hours one of those was hit. The error message was something about being unable to start a transaction.

          > Not all mongodb code will optimally use tokumx without any changes

          The goal wasn't to be optimal or anything like that. It was initially about space consumption (where you did really well) and verifying the same client code ran correctly. We have two setups so one would run toku and one mongodb and data processing results compared.

          • leif 12 years ago

            Ok. Well, you said you were waiting for it to be ready, and I think it is. We'll be here when you get a week free to tinker.

jontobs 12 years ago

Very informative! Sounds like great technology!