Flexible Paxos: Quorum intersection revisited

89 points by mgrosvenor 10 years ago · 14 comments

Reader

rusanu 10 years ago

TL;DR summary (to my understanding, no sane human can ever claim it can summarize Paxos):

The claim is that, once a leader is elected (ie. Q1), is no longer necessary to attain a majority quorum for actually accepting writes (ie. Q2). A minority quorum can accept writes, provided the minority contains at least one node that participated in the leader election. By increasing the leader election quorum number to higher than N/2+1 (as to have a sufficient number of nodes that participated in the Q1 election), the cluster can then operate much faster because writes require only minority quorum. The drawback is that it no longer tolerates N/2-1 failures, as N/2-1 failures leaves too few electors to choose a new leader in Q1.

NB. the Paxos terminology uses terms like 'decide a value', but practically in clusters this is equivalent to 'accept writes' so I used that instead for easier comprehension.

stelfer 10 years ago

Haven't read it carefully, so I might have missed the reference, but Barbara Liskov has described a very similar optimization for Viewstamped Replication[1]
[1] http://www.pmg.csail.mit.edu/papers/vr-to-bft.pdf
btrask 10 years ago

If your summary is accurate (haven't read the paper yet), I don't think this works, because if you don't have a proper quorum, you can't know that the leader is still valid at the time of an event. It might've been re-elected in the meanwhile. The only way to know is to "check in" with all the other nodes.
I recently proposed this idea (informally) and had to retract it: https://bentrask.com/?q=hash://sha256/b40971e7b30324fdda15ce...
Disclaimer: totally not an expert.
- jlouis 10 years ago
  
  The nice thing about the paper is that it is transparent. It includes a TLA+ specification for the claims that they make.
  In turn, you have a concrete model checked implementation you can talk about or use a basis for understanding where either their or your proposed idea either holds or fails.
  Note that a model can specify the wrong correctness criterion in practice. So you may have a "proof" which works, yet the proposition proven is wrong.
  In general, when working on distributed systems, we usually want some kind of formal criterion of correctness. The failure modes of such system are quite hard to get right, and hence retractions of claims are plenty. Sadly, we have too little literature on how to on-board people on model checkers such as TLA+, proof assistants such as Coq, Isabelle/HOL or NuPRL, and QuickCheck systems such as Haskell or Erlang QuickCheck, Clojure's core.spec (I believe), etc.
  - rusanu 10 years ago
    
    Book recommendation: Specifying Systems: The TLA+ Language and Tools for Hardware and Software Engineers [0]. I did read it, but unfortunately I can't say I actually understood it as in to be able to use it day-to-day.
    [0] https://www.amazon.ca/Specifying-Systems-Language-Hardware-E...
  - jpitz 10 years ago
    
    It is appropriate that Lamport's tool is being used to prove the correctness to an improvement of Lamport's algorithm.
jpitz 10 years ago

Lamport can probably claim to summarize it as well as anyone: http://research.microsoft.com/en-us/um/people/lamport/pubs/l...
amelius 10 years ago

How much speed improvement would such an algorithm give in practical circumstances (for instance assuming that network outages are rare)?
- jlouis 10 years ago
  
  Good question. Results such as these can be of a form where they currently don't provide much benefit, but yet moves the knowledge of Paxos-like consensus forward. That is, you explore where the system has flexibility in its very nature. In turn, this knowledge can lead to faster systems down the track. Sometimes by employing a vastly different approach due to the knowledge. Good research can be indirect.

rusanu 10 years ago

Related discussion of the 'A More Flexible Paxos' blog quoted in the paper: https://news.ycombinator.com/item?id=12292590

rollulus 10 years ago

A friendlier story here: http://hh360.user.srcf.net/blog/2016/08/majority-agreement-i...

TimFogarty 10 years ago

There's an easy to understand visualisation of distributed consensus (specifically the Raft algorithm) here: http://thesecretlivesofdata.com/raft/

arthursilva 10 years ago

Not a paper, but I liked this related article https://news.ycombinator.com/item?id=11813180

Settings

Flexible Paxos: Quorum intersection revisited

Keyboard Shortcuts