The Slotted Counter Pattern

planetscale.com

92 points by hollylawly 3 years ago · 33 comments

tantalor 3 years ago

Basically the same as "Sharding counters" (2008)

https://download.huihoo.com/google/gdgdevkit/DVD1/developers...

Also in Brett Slatkin's "Building Scalable Web Apps with App Engine" (2008)

https://youtu.be/Oh9_t5W6MTE?t=1181

  • ris 3 years ago

    Was going to post this. "Sharding" seems like a better term for communicating the idea.

    • zimpenfish 3 years ago

      I dunno - "sharding" would generally imply (to me, at least) that you're spreading the counters amongst different databases (or tables, in a smaller context). All the counters here are in the same table, which runs counter to the usual sense of "sharding".

  • iv42 3 years ago

    Also, it's basically identical to the pattern described as Counter Tables in "High Performance MySQL" (3rd edition), published in 2012.

_vvhw 3 years ago

Balance tracking seems to be the Achilles' heel of SQL databases.

For example, my experience has been that in the world of payments, the nature of money transactions is to debit perhaps many different accounts on the one side, but credit only a handful of accounts on the other, so that group commit can't fully amortize fsync.

Performance panaceas mostly come down to sharding, but sharding doesn't work well where you need strict updates to balances.

At work, we saw this play out several times in different systems, and decided to do something about it. We took the ledger of an open-source payments switch called Mojaloop, and extracted it as a distributed financial accounting database called TigerBeetle, designed to track financial transactions at scale.

The key performance insight was to dial up group commit. We batch up balance updates so that a single DB query can do on the order of 10k balance updates. We then fsync the batch with a single write before commit, moving the performance needle out from 1k-10k TPS to 1m TPS.

This is the advantage of a purpose-built database that's designed for counting at scale.

More information, including our design decisions, is in the repo here: https://github.com/coilhq/tigerbeetle
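
As a rough illustration of the batching idea described above, expressed in plain SQL (TigerBeetle itself is not a SQL database, and the accounts table and values here are hypothetical): folding many balance deltas into a single statement means the batch can be made durable with one fsync instead of one per update.

  -- Apply a whole batch of balance deltas in one statement (MySQL-flavored).
  UPDATE accounts a
  JOIN (
    SELECT 101 AS account_id, -500 AS delta
    UNION ALL SELECT 102,  500
    UNION ALL SELECT 103, -250
    -- ...thousands more rows per batch
  ) d ON a.id = d.account_id
  SET a.balance = a.balance + d.delta;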

elsurudo 3 years ago

Makes sense, but I can't help but feel it's a solution at the wrong abstraction level. It's a shame the DB can't figure this out for you.

  • yccs27 3 years ago

    Yeah, it seems like it would be possible for the DB engine to aggregate all these increments into one update. If you have two increments by one each in the queue, why not make it a single increment by two? I'm not sure, though, how much computing power it would need to figure that out...

    • LesZedCB 3 years ago

      Wouldn't that break the atomicity and isolation rules of ACID?

      • samatman 3 years ago

        Not necessarily. If both updates are in a single transaction then it's valid for the query planner to batch them, although that seems unlikely in the use case this table layout is designed for.

      • MauranKilom 3 years ago

        Not any more than the slotted counter pattern...

        • LesZedCB 3 years ago

          how does the shape of the data impact the storage application's safety contract?

  • cryptonector 3 years ago

    The DB could figure it out IF it had a monoid counter type to use as the column's type.

yccs27 3 years ago

Interesting. This basically implements a parallel algorithm for summation/counting, by calculating a partial sum in each slot and then merging the results on every read. This approach could be applied more generally to values in a commutative monoid, and there are probably other parallel algorithms that could be implemented in a database in a similar way.
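
For concreteness, a minimal sketch of the pattern in MySQL-flavored SQL (the table layout and counter name are illustrative): each write increments a randomly chosen slot, and reads merge the partial sums.

  -- One row per (counter, slot) pair.
  CREATE TABLE counters (
    name  VARCHAR(64) NOT NULL,
    slot  INT NOT NULL,
    count BIGINT NOT NULL DEFAULT 0,
    PRIMARY KEY (name, slot)
  );

  -- Writers pick a random slot, so concurrent increments usually land on
  -- different rows and don't contend for the same row lock.
  INSERT INTO counters (name, slot, count)
  VALUES ('page_views', FLOOR(RAND() * 16), 1)
  ON DUPLICATE KEY UPDATE count = count + 1;

  -- Reads merge the partial sums across all slots.
  SELECT SUM(count) AS total FROM counters WHERE name = 'page_views';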

ukd1 3 years ago

Cool, but this method will still hit locks some percentage of the time, regardless of whether there are slots not currently in a transaction. In Postgres you can probably do this with slots using SKIP LOCKED, though in practice I believe you have to deal with the case where everything is locked by falling back to waiting for a lock.

UPDATE counters SET count = count + 1 WHERE name = ? AND slot = (SELECT slot FROM counters WHERE name = ? FOR UPDATE SKIP LOCKED LIMIT 1)

  • robocat 3 years ago

    How does that avoid race conditions?

    SKIP LOCKED doesn’t seem to be designed for that purpose: https://www.enterprisedb.com/blog/what-skip-locked-postgresq...

    • ukd1 3 years ago

      Yep, it's meant for queues - a kinda popular one I maintain for Ruby (QueueClassic) uses it. However, you can use it outside of that - this is basically a queue to update a column; note the caveat in my original post. If all the slots are locked, I believe SKIP LOCKED will simply find nothing and be fine with that; you can deal with this, but I didn't in my example SQL.
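
      For illustration, one hedged way to add that fallback in Postgres (reusing the counters/name/slot names from the example SQL above; the 16-slot count is an assumption): if every slot is locked, fall back to a random slot and wait on its lock.

        UPDATE counters
        SET count = count + 1
        WHERE name = ?
          AND slot = COALESCE(
            -- Prefer any slot that is not currently locked.
            (SELECT slot FROM counters
             WHERE name = ?
             FOR UPDATE SKIP LOCKED
             LIMIT 1),
            -- All slots locked: pick one at random and wait for its lock.
            (FLOOR(RANDOM() * 16))::int
          );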

    • deepsun 3 years ago

      In SQL, `count = count + 1` does not have race conditions. The discussion is about contention (bad performance), not about race conditions (incorrect results).

  • hiptobecubic 3 years ago

    I would definitely complain about this in code review without some performance testing showing that 1) it's actually faster than waiting for a lock and 2) it's better than just increasing the number of slots.

    • ukd1 3 years ago

      I'd be doing the same in the case of the post; it's guaranteed to hit locks some percentage of the time. It should be better than a single slot, though.

      The whole point of the article was that counter rows locked by another (possibly idle-in-transaction) session cause bottlenecks; skipping locked rows with SKIP LOCKED, given enough slots, eliminates this. Selecting a slot at random will sometimes pick a locked one, causing a wait.

wcarss 3 years ago

At the extreme end of this, you could also append a new row for every event and count them. If the number of rows would grow too large over time, you could similarly aggregate them occasionally and clear the "scratch" table.
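
A minimal sketch of that variant in MySQL-flavored SQL (table and counter names are made up): the hot path is a pure append, so writers never contend on a shared row, and a periodic job can fold the scratch rows into a totals table and delete what it counted.

  -- "Scratch" table: one row per event.
  CREATE TABLE counter_events (
    id   BIGINT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(64) NOT NULL,
    KEY (name)
  );

  -- Hot path: append-only, no shared row to lock.
  INSERT INTO counter_events (name) VALUES ('page_views');

  -- Reads (or the periodic compaction job) aggregate the scratch rows.
  SELECT COUNT(*) FROM counter_events WHERE name = 'page_views';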

  • teej 3 years ago

    This "Slotted Counter" approach optimizes for a write-contention constraint, specifically row locks. From 10 seconds of Googling it seems InnoDB has other locks it uses on INSERT, I'd first check if moving the write contention to gap locks actually helps or not.

    One side benefit of this approach is that getting the final aggregate is cheap, whereas compacting an append-only log table might not be.

    • wcarss 3 years ago

      Thanks for taking the time to look into it, that's an interesting point. According to the manual [1], InnoDB's insert locking should not prevent other inserts from executing, it only takes an exclusive row lock on the inserted row. I agree that measuring would likely be smart.

      This makes some intuitive sense, though: general purpose databases are expected to be _pretty good_ at handling the case of "add new data" with no other specific conditions, e.g. on other rows or tables' existing data.

      I also agree with your last point. Running count() all day on this wouldn't be great, and compaction would take real time. I assumed that most high throughput write scenarios for something like an event count or view count can be a few minutes (or hours, or days) out of date, at which point read caching would be my first stop before clever summation algorithms, which are still pretty cool.

      1 - https://dev.mysql.com/doc/refman/8.0/en/insert.html

idk1 3 years ago

I accidentally ended up at a similar solution once.

I had the same issue, and I fixed it by adding an associated 'count_table' row for each hit and deleting the row once it had later been added to the final count, which actually fixed the issue. Then I refactored it so each user or IP had its own 'count_table' row. It meant the final total count lagged a bit, by 60 seconds or so, until the count_table rows had been counted up and deleted; that was the downside, but it was totally acceptable.

I wish I'd thought of this; it's much better and simpler, I think, haha.

asadawadia 3 years ago

The fundamental problem with counters is that they require a read-modify-write cycle, which is quite harsh on the DB. A better approach is to take advantage of the fact that counters are cumulative: keep only the delta events and 'merge' them on reads.

Or just buffer the counters in Redis and flush them out.

I have a counters API that does precisely this

Docs are here (Cmd+F: 'Counters API now live'): https://blog.aawadia.dev/api/

nyanpasu64 3 years ago

This feels like the database locking version of https://travisdowns.github.io/blog/2020/07/06/concurrency-co..., which instead covers in-process concurrency speed.

dima_vm 3 years ago

If we need slotted counters, we're pretty close to the boundary of "right tool for the job", and we need to consider using a time-series database.

For one use-case it might be OK, although if we expect more counters in the future, it's easier to make the change now than to refactor later.

sonicgear1 3 years ago

Wouldn't querying the count be slow using a WHERE clause?

  • LesZedCB 3 years ago

    It was indexed, which should help, but I assume the reads are far less common than the inserts. Reads could even be scheduled/automated and stored in a cache table if they need to be faster and it's OK for them to be a little stale.

  • Andys 3 years ago

    There are only dozens of rows (at most ~100).
