Go At Heroku (Doozer)
Doozer is a really interesting project. We run a Go-based lockserver at Tinkercad but we haven't got around to adding redundant storage. Doozer, on the other hand, is tightly focused on just the storage part, which is usually much less application-specific. So for us this seems like a good fit.
I've been talking with Keith and Blake since before Christmas and their thinking is solid. If you are interested in running a consistent store, this is probably one of your best bets, unless you happen to work at Google and have access to Chubby (I'm an ex-Googler).
Very interesting post. I'd advise you to go read Keith Rarick's introduction to Doozer (http://xph.us/2011/04/13/introducing-doozer.html). I think this could gain traction as a credible HA data store. As stated, Paxos is at the core of Doozer; it's also used by Google's lock service Chubby, which in turn is used by systems like GFS, Bigtable and MapReduce.
It's not clear to me how, or if, failure detection is to be integrated with Doozer. Keith has said on the thread accompanying the announcement blog post that Doozer doesn't have the 'baggage' of sessions (which are used in ZooKeeper to manage timeout failure and the removal of certain kinds of data which can be used to model lock revocation).
Without some way of knowing whether a process has failed, it's hard to do leader election, locks, or other synchronisation patterns. It would be a reasonable design choice to do failure detection completely out of band from the sequentially consistent store, but I'd like to understand their architecture better.
Just curious, how familiar are you with Paxos? I'm asking because failure detection is pretty much isomorphic with distributed consensus and generally considered hard. I.e., how do you differentiate between down and slow?
Extremely familiar - see my articles at http://the-paper-trail.org/blog/?p=173 and http://the-paper-trail.org/blog/?p=190 for some tutorials I wrote on the subject.
You're correct that failure detection and consensus are very deeply related, in that a strong failure detector is 'sufficient' for consensus.
But my point is about client failure detection, not failure detection between servers (which must have some kind of timeout system; that's ok - you just sacrifice liveness in a few pathological cases rather than sacrificing correctness). If I am to implement leader election with Doozer, does Doozer provide any tools to help us with deciding when to elect a new leader? There's no reason it should, but ZooKeeper, for example, does have that in its arsenal.
Doozer doesn't, AFAIK, expose consensus as a primitive; that's not its model. So the fact that it uses Paxos, or ZAB, or 2PC or whatever doesn't make a difference to its clients.
Replied to you via email.
Please see: http://xph.us/2011/04/13/introducing-doozer.html
That was the introductory blog post to which I referred - note the similar discussion in the comments.
It's nice to see this getting some attention. Great job, Keith and Blake! I'm hoping to use Doozer in our infrastructure at some point in the future.
How is this different from ZooKeeper in terms of features offered or possible use-cases?