An MVCC-like columnar table on S3 with constant-time deletes

shayon.dev

20 points by shayonj 3 months ago · 10 comments

DenisM 3 months ago

I still don’t see how this is different from Iceberg. You don’t need a catalog to use it, and atomic replace of metadata.json plus deletion vectors seems to be exactly the same thing.
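For readers unfamiliar with the term, here is a minimal sketch of what applying a deletion vector at read time amounts to. It assumes a hypothetical in-memory representation (a set of deleted row positions per data file) and is not Iceberg's or the article's actual on-disk format.

```python
from typing import Dict, Iterable, Iterator, Set


def live_rows(rows: Iterable[Dict], deleted_positions: Set[int]) -> Iterator[Dict]:
    """Yield only rows whose position is absent from the deletion vector.

    The data file itself is never modified; a delete only adds a position
    to `deleted_positions` (hypothetical name for the deletion vector).
    """
    for position, row in enumerate(rows):
        if position not in deleted_positions:
            yield row


# Hypothetical immutable data file and its deletion vector.
data_file = [{"id": 1}, {"id": 2}, {"id": 3}]
deletion_vector = {1}  # row at position 1 has been deleted

visible = list(live_rows(data_file, deletion_vector))
```

The delete itself is a small metadata write, while every read pays the cost of the filter until the file is rewritten.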

codedokode 3 months ago

S3 is an HTTP API, so does that mean this database would be very slow? Especially if they use immutability and create copies of large files?

  • shayonj (OP) 3 months ago

    Yeah, the post mentions this in a few places: compared to OLTP or similar workloads, this will definitely be slow.

ohnoesjmr 3 months ago

The sequence diagram seems to have a mistake, the second writer somehow seems to know to create v124, only having observed v122.

  • deepsun 3 months ago

    Fun fact -- try to search for "124" there.

    For some reason they thought hard-positioned top-to-bottom SVG is somehow better than adding "white-space: pre" once in CSS ¯\_(ツ)_/¯
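The version-numbering concern raised at the top of this sub-thread can be illustrated with a small optimistic-commit sketch. It assumes the object store offers some put-if-absent primitive. `VersionStore`, `put_if_absent`, and `commit` are hypothetical names for illustration, not the article's code or a real S3 client API.

```python
class VersionStore:
    """Hypothetical in-memory stand-in for an object store with put-if-absent."""

    def __init__(self):
        self._objects = {}

    def latest(self):
        # Highest committed version, or 0 if nothing has been written yet.
        return max(self._objects, default=0)

    def put_if_absent(self, version, manifest):
        # Succeeds only if no object exists for this version yet.
        if version in self._objects:
            return False
        self._objects[version] = manifest
        return True


def commit(store, manifest, max_retries=5):
    """Try to publish `manifest` as the next version, re-reading on conflict."""
    for _ in range(max_retries):
        observed = store.latest()
        candidate = observed + 1
        if store.put_if_absent(candidate, manifest):
            return candidate
    raise RuntimeError("too many conflicting writers")


store = VersionStore()
store.put_if_absent(122, "base")

# Both writers observed v122 and race to create v123.
first_won = store.put_if_absent(123, "writer A")
second_won = store.put_if_absent(123, "writer B")

# The loser has to re-read (now seeing v123) before it can target v124.
retried_version = commit(store, "writer B")
```

Under this scheme a writer that has only observed v122 can attempt v123 and nothing else; it reaches v124 only after a failed attempt and a fresh read.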

akdor1154 3 months ago

I know Iceberg has this same issue, but you state that deleting this way (recording tombstones) is sufficient for GDPR compliance. Is it really? The 'deleted' data is still trivially readable.

  • hodgesrm 3 months ago

    It's OK provided there's a garbage collection procedure. But the write-up seems to regard this as optional.

    > Deletes accumulate in tombstone files over time. Eventually we would want to coalesce 100 small tombstone files into one and/or rewrite data files if a row group has >50% rows deleted, resulting in further compaction.

    The bigger problem for me is that tombstones that remove rows can make reads quite inefficient, because they reduce the usefulness of min-max and bloom filter indexes. They can also hurt vectorized query execution if you have to apply predicates within row groups. Finally, there are degenerate cases where the tombstones would be bigger than the compressed columns themselves.

    Any assertion that this would be performant needs to be backed up by code. ClickHouse took many years to implement so-called lightweight deletes. It's a hard problem to solve in a performant way.
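The compaction triggers in the quoted passage (coalesce at 100 small tombstone files, rewrite when a row group has more than 50% of its rows deleted) could be expressed roughly as below. This is a sketch under assumptions: `RowGroupStats`, `should_coalesce_tombstones`, and `should_rewrite` are hypothetical names, and the thresholds are simply the numbers from the quote, not measured values.

```python
from dataclasses import dataclass


@dataclass
class RowGroupStats:
    """Hypothetical per-row-group counters a compactor might track."""
    total_rows: int
    deleted_rows: int


def should_coalesce_tombstones(tombstone_file_count, threshold=100):
    # Merge small tombstone files once their count reaches the threshold.
    return tombstone_file_count >= threshold


def should_rewrite(group, max_deleted_fraction=0.5):
    # Rewrite the data file when more than half of a row group is deleted.
    if group.total_rows == 0:
        return False
    return group.deleted_rows / group.total_rows > max_deleted_fraction


mostly_deleted = RowGroupStats(total_rows=1000, deleted_rows=501)
half_deleted = RowGroupStats(total_rows=1000, deleted_rows=500)
```

A policy like this only decides when to compact; whether the tombstoned bytes are actually removed (the GDPR question above) depends on the rewrite and garbage collection really running.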

jerrysievert 3 months ago

Given that it’s Parquet, deletes are nice, but what about inserts?
