TiDB: A Raft-based HTAP Database [pdf]

vldb.org

50 points by Lilian_Lee 5 years ago · 7 comments

Lilian_LeeOP 5 years ago

Here's a further reference: https://pingcap.com/blog/how-tidb-htap-makes-truly-hybrid-wo... This post covers the design details of TiDB's HTAP architecture, including the real-time updatable columnar engine, the multi-Raft replication strategy, and smart selection.

  • sitkack 5 years ago

    These hybrid databases are really exciting. I also really like how mature PingCAP is from a technical perspective. On the benchmarking page, this paragraph really stood out:

    > If you don't have reproducible, fair benchmarking, you can be blinded by your own hubris. Our contributors and maintainers depend on benchmarks to ensure that as we strive to improve TiDB, we don't negatively impact its performance. To us, not having benchmarks is like not having logging or metrics.

CodesInChaos 5 years ago

One important limitation is that TiDB only offers snapshot isolation, not serializability.

  • nextaccountic 5 years ago

    Is serializability desirable in a distributed database?

    • CodesInChaos 5 years ago

      It's just as desirable as in a single-node database, since it avoids certain anomalies such as write skew. Snapshot isolation is strong enough for many operations, but it adds the risk that you run into problematic anomalies you did not anticipate.

      > In a write skew anomaly, two transactions (T1 and T2) concurrently read an overlapping data set (e.g. values V1 and V2), concurrently make disjoint updates (e.g. T1 updates V1, T2 updates V2), and finally concurrently commit, neither having seen the update performed by the other. Were the system serializable, such an anomaly would be impossible, as either T1 or T2 would have to occur “first”, and be visible to the other. In contrast, snapshot isolation permits write skew anomalies.

      TiDB has a workaround, but I don't know how well that works in practice.

      > In TiDB, you can use SELECT … FOR UPDATE statement to avoid write skew anomaly. In this case, TiDB will use locks to serialize writes together with MVCC to gain some of the performance gains and still support the stronger “serializability” level of isolation.

      Serializability is more expensive, of course, since the database needs to track what was read and not only what was modified. It is still possible to achieve in a distributed database; FoundationDB is one example that supports it.

      https://tikv.org/deep-dive/distributed-transaction/isolation...
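
For illustration, the write skew described above can be reproduced with a toy in-memory model. This is only a sketch under stated assumptions: `Store`, `Txn`, `go_off_call`, and `run` are hypothetical names, the commit-time version check stands in for whatever a real database does, and nothing here is TiDB's or FoundationDB's actual implementation. Plain snapshot isolation validates only the keys a transaction wrote. Marking reads with `for_update=True` (loosely modelling SELECT … FOR UPDATE) or tracking every read (loosely modelling a serializable system) adds the read keys to that validation.

```python
class Conflict(Exception):
    """Raised when a transaction fails commit-time validation."""


class Store:
    """Toy versioned store: each key maps to (value, commit_version)."""

    def __init__(self, data):
        self.version = 0
        self.data = {k: (v, 0) for k, v in data.items()}

    def begin(self, track_reads=False):
        return Txn(self, track_reads)


class Txn:
    def __init__(self, store, track_reads):
        self.store = store
        self.start = store.version
        # Every read is served from the snapshot taken at start.
        self.snapshot = {k: v for k, (v, _) in store.data.items()}
        self.writes = {}
        self.validated_reads = set()  # reads that must be unchanged at commit
        self.track_reads = track_reads

    def read(self, key, for_update=False):
        if for_update or self.track_reads:
            self.validated_reads.add(key)
        return self.writes.get(key, self.snapshot[key])

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Abort if any validated key was committed by someone else after we began.
        for key in set(self.writes) | self.validated_reads:
            if self.store.data[key][1] > self.start:
                raise Conflict(key)
        self.store.version += 1
        for key, value in self.writes.items():
            self.store.data[key] = (value, self.store.version)


def go_off_call(txn, me, other, for_update=False):
    # Application invariant (assumed for this example): one doctor stays on call.
    if txn.read(me, for_update) and txn.read(other, for_update):
        txn.write(me, False)


def run(for_update=False, track_reads=False):
    store = Store({"alice": True, "bob": True})
    t1 = store.begin(track_reads)
    t2 = store.begin(track_reads)
    go_off_call(t1, "alice", "bob", for_update)  # T1 reads both, updates alice
    go_off_call(t2, "bob", "alice", for_update)  # T2 reads both, updates bob
    t1.commit()
    try:
        t2.commit()
        second = "committed"
    except Conflict:
        second = "aborted"
    on_call = sum(value for value, _ in store.data.values())
    return second, on_call


print(run())                  # plain snapshot isolation
print(run(for_update=True))   # reads marked "for update"
print(run(track_reads=True))  # all reads validated
```

In the first run both transactions commit and nobody is left on call, which is the write skew. In the other two runs the second transaction aborts because a key it read was changed after its snapshot, so one doctor stays on call. The extra bookkeeping in `validated_reads` is the read tracking that makes serializability more expensive.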

balboah 5 years ago

I got the TiDB coffee mug; it looks better than the MongoDB mug.

  • 082349872349872 5 years ago

    Tandem (who sold high-availability via redundant hardware) coffee mugs had two handles. What are some other companies whose mugs reflected their value proposition?
