The Challenge of Large File Checksums

ppppp.dev

8 points by I_like_tomato 3 months ago · 5 comments

BobbyTables2 3 months ago

I don’t understand the goal here.

Splitting a file into chunks, hashing them in parallel, and then hashing the resulting hashes is certainly a valid method but not the same as hashing a file the traditional way.

Unless the world changes how it publishes hashes of files available for download, I don’t see the point.
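
For concreteness, a minimal sketch of the split-and-combine scheme described above (this is not the linked project’s code; the 64 MiB chunk size and the final hash-over-concatenated-hashes step are assumptions):

    import hashlib
    import os
    from concurrent.futures import ProcessPoolExecutor

    CHUNK = 64 * 1024 * 1024  # 64 MiB per chunk -- an arbitrary choice

    def _hash_chunk(args):
        path, offset = args
        # Each worker reads only its own byte range of the file.
        with open(path, "rb") as f:
            f.seek(offset)
            return hashlib.sha256(f.read(CHUNK)).digest()

    def parallel_digest(path):
        offsets = range(0, os.path.getsize(path), CHUNK)
        with ProcessPoolExecutor() as pool:
            # map() yields results in input order, i.e. in file order.
            chunk_hashes = list(pool.map(_hash_chunk, [(path, o) for o in offsets]))
        # Combine by hashing the concatenated per-chunk digests.
        # NOTE: this is NOT the same value as sha256 over the whole file.
        return hashlib.sha256(b"".join(chunk_hashes)).hexdigest()

    if __name__ == "__main__":  # guard needed for ProcessPoolExecutor
        print(parallel_digest("big.iso"))  # "big.iso" is a placeholder path

Because each worker reads an independent byte range, the only sequential step left is one hash over the short list of chunk digests.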

  • I_like_tomato (OP) 3 months ago

    The reasoning here is to speed up hashing a large file (let’s say size > 100 GB). Reading the file content sequentially and hashing it takes a lot longer.

    • BobbyTables2 3 months ago

      I agree, but there is no way to compute the equivalent of the sequential hash using any parallel method (a quick demonstration is sketched below).

      This isn’t like gzip, which can be parallelized.

      Without standardization of a parallelized hash computation, it’s just a toy exercise in an embarrassingly parallel problem.
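
To make the mismatch concrete, a tiny self-contained demo (in-memory bytes stand in for a file; the 4 KiB chunk size is arbitrary):

    import hashlib

    data = b"0123456789" * 1000   # in-memory stand-in for a large file

    # The "traditional" sequential digest that download pages publish:
    sequential = hashlib.sha256(data).hexdigest()

    # The chunk-and-combine digest (4 KiB chunks, purely illustrative):
    chunk_hashes = [hashlib.sha256(data[i:i + 4096]).digest()
                    for i in range(0, len(data), 4096)]
    combined = hashlib.sha256(b"".join(chunk_hashes)).hexdigest()

    print(sequential == combined)  # False: the two schemes disagree

So a downloader with only the published sequential sha256 has nothing to check the chunked digest against.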

dabiged 3 months ago

Why not use a faster hashing algorithm like xxhash?

This code is using sha256 which, whilst cryptographically secure, is a massive computational burden.
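
For reference, the streaming pattern with the third-party Python bindings looks roughly like this (assumes `pip install xxhash`; note xxHash only guards against accidental corruption, not tampering):

    import xxhash  # third-party bindings: pip install xxhash

    h = xxhash.xxh3_64()
    with open("big.iso", "rb") as f:       # "big.iso" is a placeholder path
        while block := f.read(1 << 20):    # 1 MiB reads
            h.update(block)
    print(h.hexdigest())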
