Show HN: 6cy – Experimental streaming archive format with per-block codecs

github.com

24 points by yihac1 4 hours ago · 8 comments · 1 min read

Hi HN,

I’ve been experimenting with archive format design and built 6cy as a research project.

The goal is not to replace zip/7z, but to explore:

• block-level codec polymorphism (different compression per block)
• streaming-first layout (no global seek required)
• better crash recovery characteristics
• plugin-based architecture so proprietary codecs can exist without changing the format
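To make the first two goals concrete, here is a minimal sketch of per-block codecs read with forward-only I/O. The framing (a u8 codec id followed by a u32 little-endian payload length), the `CODECS` table, and the function names are assumptions for illustration only; they are not the layout defined in 6cy's spec.md.

```python
import io
import lzma
import struct
import zlib

# Hypothetical framing, NOT the 6cy layout:
# u8 codec id, u32 little-endian payload length, then the payload.
HEADER = struct.Struct("<BI")

# Hypothetical codec table: id -> (compress, decompress).
CODECS = {
    0: (lambda b: b, lambda b: b),        # stored, no compression
    1: (zlib.compress, zlib.decompress),
    2: (lzma.compress, lzma.decompress),
}


def write_block(out, codec_id, data):
    """Append one self-describing block to a writable binary stream."""
    compress, _ = CODECS[codec_id]
    payload = compress(data)
    out.write(HEADER.pack(codec_id, len(payload)))
    out.write(payload)


def read_blocks(src):
    """Yield decoded blocks in order, using only forward reads (no seek)."""
    while True:
        header = src.read(HEADER.size)
        if not header:
            return  # clean end of stream
        if len(header) < HEADER.size:
            raise EOFError("truncated block header")
        codec_id, length = HEADER.unpack(header)
        payload = src.read(length)
        if len(payload) < length:
            raise EOFError("truncated block payload")
        _, decompress = CODECS[codec_id]
        yield decompress(payload)


def roundtrip_demo():
    """Write three blocks with three different codecs, then read them back."""
    buf = io.BytesIO()
    write_block(buf, 0, b"already-compressed bytes")
    write_block(buf, 1, b"text " * 100)
    write_block(buf, 2, b"log line\n" * 100)
    buf.seek(0)  # rewind only because the demo reuses one in-memory buffer
    return list(read_blocks(buf))
```

Because every block carries its own header, a reader in this sketch hands back each complete block before it reaches a truncated tail, which is one simple way a streaming layout can help after a crash.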

Right now this is an experimental v0.x format. The specification may still change and compatibility is not guaranteed yet.

I’m mainly looking for feedback on the format design rather than performance comparisons.

Thanks for taking a look.

fwip 2 hours ago

Looks pretty cool. After checking out spec.md, one thing I might suggest is using UUIDs for external codec ids. There's a lot of codecs/formats out there, and limiting to a u8 might lead to collisions.

It might also be nice to provide a mechanism to advertise the required codecs toward the beginning of the stream, in case the consumer does not have the necessary codecs and wishes to abort the transfer.
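A rough sketch of what such an early advertisement could look like, so a consumer can abort before transferring any block data. The preamble layout (a u16 little-endian count followed by 16-byte codec UUIDs), the function names, and the exception type are all hypothetical; the 6cy spec does not define them.

```python
import io
import struct
import uuid


class UnsupportedCodecError(Exception):
    """Raised when the stream requires codecs the reader does not have."""

    def __init__(self, missing):
        super().__init__("missing codecs: " + ", ".join(str(u) for u in missing))
        self.missing = missing


def write_required_codecs(out, codec_uuids):
    """Write the hypothetical preamble: u16 LE count, then 16 bytes per UUID."""
    out.write(struct.pack("<H", len(codec_uuids)))
    for codec_uuid in codec_uuids:
        out.write(codec_uuid.bytes)


def read_required_codecs(src):
    """Read the preamble from the start of a forward-only stream."""
    raw = src.read(2)
    if len(raw) < 2:
        raise EOFError("truncated codec count")
    (count,) = struct.unpack("<H", raw)
    required = []
    for _ in range(count):
        raw = src.read(16)
        if len(raw) < 16:
            raise EOFError("truncated codec UUID")
        required.append(uuid.UUID(bytes=raw))
    return required


def ensure_supported(src, supported):
    """Return the required codecs, or raise before any block data is read."""
    required = read_required_codecs(src)
    missing = [u for u in required if u not in supported]
    if missing:
        raise UnsupportedCodecError(missing)
    return required
```

A consumer would call `ensure_supported(stream, installed_codecs)` first and close the connection on `UnsupportedCodecError`, having read only the few preamble bytes.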

  • yihac1OP 2 hours ago

    Good catch — you’re absolutely right to call that out.

    The current u8 codec ID is mainly there to keep the block header very small and fast to parse, but it’s not meant to be the global identifier. The idea is to map that ID to something globally unique (most likely a UUID) through the plugin/manifest layer, so we can avoid collisions without bloating the on-disk format.

    I also like the suggestion about advertising required codecs early in the stream. That would make it much nicer for a reader to fail fast if it doesn’t support something, especially for streaming use cases. We’re exploring adding a small capability section near the beginning for exactly that reason.

    Since the format is still experimental, this kind of feedback is really helpful before we lock things down.

    • 22c an hour ago

      FWIW I think most users here would prefer you reply to their hand-written comments using your own words, even if you had to use a translator.

      • yihac1OP an hour ago

        Thank you for your message. English is not my native language, so I sometimes use translation tools. I will try my best to reply to you in more direct and understandable language. Thank you for your patience.

    • wongarsu an hour ago

      If you want to use smaller ids within the stream, that capability section would seem to be the natural place to map each global codec UUID to a file-local u8 identifier.

      • yihac1OP an hour ago

        Yes, this mapping method makes a lot of sense. I'll give it a try.
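The mapping discussed in this sub-thread (a global codec UUID declared once, then referenced by a file-local u8 in each block header) could be sketched as below. The table encoding (a u8 entry count, then pairs of u8 local id and 16-byte UUID) and the function names are assumptions made for illustration, not something taken from the 6cy spec.

```python
import struct
import uuid

# Hypothetical table entry: u8 file-local id followed by a 16-byte UUID.
ENTRY = struct.Struct("<B16s")


def encode_codec_table(table: dict[int, uuid.UUID]) -> bytes:
    """Serialize {local_id: codec_uuid} as a u8 count plus fixed-size entries."""
    if len(table) > 255:
        raise ValueError("a u8 count can describe at most 255 entries")
    parts = [struct.pack("<B", len(table))]
    for local_id, codec_uuid in sorted(table.items()):
        parts.append(ENTRY.pack(local_id, codec_uuid.bytes))
    return b"".join(parts)


def decode_codec_table(data: bytes) -> dict[int, uuid.UUID]:
    """Parse the table back into {local_id: codec_uuid}, rejecting duplicates."""
    (count,) = struct.unpack_from("<B", data, 0)
    table = {}
    offset = 1
    for _ in range(count):
        local_id, raw = ENTRY.unpack_from(data, offset)
        if local_id in table:
            raise ValueError(f"duplicate local codec id {local_id}")
        table[local_id] = uuid.UUID(bytes=raw)
        offset += ENTRY.size
    return table
```

With a table like this, block headers stay at one byte per codec reference, while collisions are avoided because each file declares which UUID its local ids stand for.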

itsthecourier 3 hours ago

Sounds great. Could you please share some benchmarks/experiments?

  • yihac1OP 2 hours ago

    We did run internal benchmarks and experiments, but we focused on measuring our own performance rather than doing side-by-side comparisons with other tools. The goal was to validate stability, speed, and resource usage in real scenarios, not to create a public comparison that could be misleading or depend on many external factors. We’re planning to share our methodology and raw numbers so people can evaluate the results themselves and run their own comparisons if they want.
