Implement two-phase commit (2pc) by levkk · Pull Request #419 · pgdogdev/pgdog


Changelog

Two-phase commit

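Also known as 2pc or 2-phase commit. Postgres supports it natively, but prepared transactions are disabled by default (max_prepared_transactions is 0). Enable them by setting max_prepared_transactions to a non-zero value, then restart Postgres for the change to take effect: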

ALTER SYSTEM SET max_prepared_transactions TO 1000;

Once enabled in Postgres, enable 2pc in PgDog as well:

[general]
two_phase_commit = true

How it works

2pc splits the COMMIT command into 2 statements:

  1. PREPARE TRANSACTION '<unique name>'
  2. COMMIT PREPARED '<unique name>'

When the client sends COMMIT to us, we intercept it and run statement 1 on all shards. If that succeeds, we run statement 2 on all shards. If that succeeds as well, the transaction is finished and we send CommandComplete and ReadyForQuery to the client.
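The happy path above can be sketched in a few lines of Python. This is an illustrative model only, not PgDog's implementation: `FakeShard`, `two_phase_commit`, and the `sketch_` naming scheme for the unique name are all assumptions made for the example, and the shards are in-memory objects rather than real Postgres connections.

```python
import uuid


class FakeShard:
    """In-memory stand-in for a shard connection (hypothetical, for illustration)."""

    def __init__(self, name):
        self.name = name
        self.log = []  # statements "executed" on this shard, in order

    def execute(self, sql):
        self.log.append(sql)


def two_phase_commit(shards):
    """Run statement 1 on every shard, then statement 2 on every shard."""
    # The '<unique name>'; this naming scheme is an assumption, not PgDog's.
    txn = f"sketch_{uuid.uuid4().hex}"
    for shard in shards:  # phase 1: prepare everywhere
        shard.execute(f"PREPARE TRANSACTION '{txn}'")
    for shard in shards:  # phase 2: reached only if phase 1 raised no error
        shard.execute(f"COMMIT PREPARED '{txn}'")
    return txn


shards = [FakeShard("shard_0"), FakeShard("shard_1")]
name = two_phase_commit(shards)
for shard in shards:
    print(shard.name, shard.log)
```

Each shard ends up with the same two statements in its log, in the same order. No shard starts phase 2 before every shard has finished phase 1.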

Why is this necessary?

Separating the COMMIT statement into two protects against single-shard failures. If statement 1 fails on any of the shards, PgDog automatically rolls back the transaction on all shards by running ROLLBACK PREPARED '<unique name>'.

If statement 2 fails for any of the shards, PgDog will automatically retry it asynchronously until it succeeds. This ensures that the transaction is eventually committed on all shards, so the database is eventually consistent even if errors occur during this process.
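A minimal sketch of these two failure paths, assuming hypothetical names (`FlakyShard`, `commit_with_recovery`) and in-memory shards. The retry here is a simple synchronous loop, whereas PgDog retries asynchronously. Errors from `ROLLBACK PREPARED` are ignored as a simplification.

```python
class FlakyShard:
    """In-memory shard that fails a chosen statement a fixed number of times (hypothetical)."""

    def __init__(self, fail_on=None, failures=0):
        self.fail_on = fail_on    # statement prefix that should fail
        self.failures = failures  # how many times it fails before succeeding
        self.log = []             # statements that succeeded, in order

    def execute(self, sql):
        if self.fail_on and sql.startswith(self.fail_on) and self.failures > 0:
            self.failures -= 1
            raise RuntimeError(f"simulated failure: {sql}")
        self.log.append(sql)


def commit_with_recovery(shards, txn):
    """Return "committed" or "rolled back", depending on which phase failed."""
    try:
        for shard in shards:
            shard.execute(f"PREPARE TRANSACTION '{txn}'")
    except RuntimeError:
        # Statement 1 failed somewhere: roll back on all shards.
        for shard in shards:
            try:
                shard.execute(f"ROLLBACK PREPARED '{txn}'")
            except RuntimeError:
                pass  # simplification: rollback errors are ignored in this sketch
        return "rolled back"

    # Statement 1 succeeded everywhere: retry statement 2 until every shard commits.
    pending = list(shards)
    while pending:
        shard = pending.pop(0)
        try:
            shard.execute(f"COMMIT PREPARED '{txn}'")
        except RuntimeError:
            pending.append(shard)  # try this shard again later
    return "committed"


ok = [FlakyShard(), FlakyShard(fail_on="COMMIT PREPARED", failures=2)]
result_ok = commit_with_recovery(ok, "txn_1")    # second shard commits on its third attempt

bad = [FlakyShard(), FlakyShard(fail_on="PREPARE TRANSACTION", failures=1)]
result_bad = commit_with_recovery(bad, "txn_2")  # phase 1 fails, so everything is rolled back
```

The asymmetry is the point. A failure before any shard has committed can be undone everywhere, while a failure after phase 1 can only be resolved by moving forward.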

During normal operation, the delay between statement 1 and statement 2 is very short. Even so, other clients may see a partially updated database during that window.

Automatic 2pc for cross-shard writes

2pc is automatically executed for single-statement write transactions. For example, if your client sends a cross-shard update, like:

UPDATE users SET admin = true WHERE created_at < NOW()

PgDog will transform this, under the hood, into 4 statements:

  1. BEGIN
  2. UPDATE users SET admin = true WHERE created_at < NOW()
  3. PREPARE TRANSACTION '<unique name>'
  4. COMMIT PREPARED '<unique name>'

This ensures that all writes use 2pc even if a transaction hasn't been explicitly started.
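As a rough illustration of this rewrite, the sketch below expands a single statement into the four-step sequence, using the statement forms from "How it works". The function name `wrap_in_two_phase_commit` and the transaction name are hypothetical. PgDog does this internally, and its actual code is not shown here.

```python
def wrap_in_two_phase_commit(statement, txn):
    """Expand one write statement into an explicit transaction finished with 2pc."""
    return [
        "BEGIN",
        statement,
        f"PREPARE TRANSACTION '{txn}'",
        f"COMMIT PREPARED '{txn}'",
    ]


statements = wrap_in_two_phase_commit(
    "UPDATE users SET admin = true WHERE created_at < NOW()",
    "example_txn",  # placeholder for the '<unique name>'
)
for sql in statements:
    print(sql)
```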

Disable automatic 2pc

If you don't need 2pc for single-statement transactions, either because you don't care about consistency or because you can safely retry writes in case of failure (idempotent statements), you can disable automatic 2pc in the config:

[general]
two_phase_commit_auto = false