An MVCC-like columnar table on S3 with constant-time deletes

shayon.dev

45 points by shayonj 3 months ago · 9 comments

simonw 3 months ago

This is a really clever design.

The cost estimates are particularly notable: if they're right that's a cost of about $3/day for 6TB/day of written data, 2TB/day of deletes and 50K read queries.

Storing all those TBs of data in S3 is where the real cost lies. I think it costs $5520 to store 8TB*30 = 240TB in S3, and if you retain all data your monthly cost goes up by $5520 every month.
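The storage arithmetic above can be checked with a short sketch. The flat rate of $0.023 per GB-month and the 1000 GB per TB conversion are assumptions chosen because they reproduce the $5520 figure; actual S3 pricing is tiered and varies by region. The function names are hypothetical.

```python
# Assumed flat S3 Standard rate in USD per GB-month (real pricing is tiered and regional).
ASSUMED_USD_PER_GB_MONTH = 0.023
# Decimal terabytes; this is the conversion under which the thread's figure works out.
GB_PER_TB = 1000


def monthly_storage_cost(tb_stored: float) -> float:
    """Approximate monthly bill in USD for holding tb_stored terabytes."""
    return tb_stored * GB_PER_TB * ASSUMED_USD_PER_GB_MONTH


def cumulative_bills(tb_per_day: float, months: int, days_per_month: int = 30) -> list[float]:
    """Monthly bill for each successive month if no data is ever expired.

    Simplification: each month is billed on the volume held at month end.
    """
    bills = []
    for month in range(1, months + 1):
        stored_tb = tb_per_day * days_per_month * month
        bills.append(round(monthly_storage_cost(stored_tb), 2))
    return bills


if __name__ == "__main__":
    # 8 TB/day is the figure used in the comment above.
    print(cumulative_bills(8, 3))
```

Under these assumptions the first month comes to $5520, and each further month of full retention adds the same amount again.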

  • shayonjOP 3 months ago

    Here is another take on deletes: just update the row groups in the Parquet file through multipart upload and UploadPartCopy - https://www.shayon.dev/post/2025/285/mutable-atomic-deletes-...

  • xyzzy_plugh 3 months ago

    I think the idea is that the deletes would eventually be compacted, so it's ultimately half as much, but I digress.

    The cost isn't that bad all things considered. Hot, durable and available data ain't that cheap, especially in the cloud. Self-hosting is within an order of magnitude.

    • shayonjOP 3 months ago

      I think ideally you could map cold-data retention onto the file objects themselves: with a key-space naming strategy and lifecycle rules, you can expire the data that is no longer needed and save on storage costs (as much as possible, hopefully).
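A minimal sketch of the prefix-plus-lifecycle idea above, assuming objects are written under key prefixes that encode their retention class. The prefixes, rule IDs, and the helper name are hypothetical. The dict follows the shape S3 expects for a lifecycle configuration; with boto3 it would be passed as `LifecycleConfiguration` to `put_bucket_lifecycle_configuration`, which this sketch does not call.

```python
def expiration_rules(retention_days_by_prefix: dict[str, int]) -> dict:
    """Build an S3 lifecycle configuration with one expiration rule per key prefix."""
    rules = []
    for prefix, days in sorted(retention_days_by_prefix.items()):
        rules.append(
            {
                # Rule ID derived from the prefix; purely illustrative naming.
                "ID": f"expire-{prefix.strip('/').replace('/', '-')}-after-{days}d",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                # Objects under the prefix are expired this many days after creation.
                "Expiration": {"Days": days},
            }
        )
    return {"Rules": rules}


if __name__ == "__main__":
    # Hypothetical key layout: the retention class is part of the object key.
    config = expiration_rules({"events/retain-30d/": 30, "events/retain-365d/": 365})
    for rule in config["Rules"]:
        print(rule["ID"], rule["Filter"]["Prefix"], rule["Expiration"]["Days"])
```

Because the rules match on key prefix, the retention decision has to be made when the object is written, which is why the naming strategy matters.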

simlevesque 3 months ago

I just want to be able to append metadata to the end of a Parquet file without rewriting the whole file. Tombstones could be baked into the Parquet file this way.

It does work with "one more file" but it's not good for performance.
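As a rough illustration of the "one more file" approach, here is a sketch in which deletes live in a separate tombstone list that is applied at read time. The `row_id` field and the function name are hypothetical, and plain Python lists stand in for the data file and the sidecar file.

```python
def read_live_rows(rows: list[dict], tombstones: list[int]) -> list[dict]:
    """Return the rows whose row_id is not listed in the tombstone sidecar."""
    deleted = set(tombstones)  # set membership keeps the per-row check cheap
    return [row for row in rows if row["row_id"] not in deleted]


if __name__ == "__main__":
    # Stand-in for rows decoded from the data file.
    data_file_rows = [
        {"row_id": 1, "user": "a"},
        {"row_id": 2, "user": "b"},
        {"row_id": 3, "user": "c"},
    ]
    # Stand-in for the contents of the extra tombstone file.
    tombstone_file_ids = [2]
    print(read_live_rows(data_file_rows, tombstone_file_ids))
```

The performance concern is visible even here: every read has to fetch and apply a second object before it can return results.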

  • cpard 3 months ago

    That’s the whole reason Iceberg, Delta and Hudi exist, right?

    It's not as easy as just appending metadata to a Parquet file but, on the other hand, Parquet was never designed with that functionality in mind, and probably shouldn’t be.

  • shayonjOP 3 months ago

    Yeah. Or just sub out the data with null bytes. Something like that could be nice too.

    • simlevesque 3 months ago

      Are you familiar with Parquet? You can't do that at all; you need to rewrite the whole file.

      • shayonjOP 3 months ago

        Yeah, I phrased it poorly. I meant in an ideal situation: a format with Parquet's benefits, like the columnar file structure. I very much understand that it’s not possible with Parquet today, for the reasons you mentioned and others.
