Processing Strings 109x Faster Than Nvidia on H100

ashvardanian.com

34 points by samspenc 3 months ago · 3 comments

ozgrakkurt 3 months ago

Duplicate of https://news.ycombinator.com/item?id=45304807

trilogic 3 months ago

Impressive work: anti-diagonal DP on CUDA, clean MCUPS framing, and the multi-language shipping is legit. The “109× faster than NVIDIA on H100” line is accurate for your chosen case (cuDF/nvtext, long strings), but it’s not a blanket “faster than NVIDIA,” and readers will assume it is, so tighten the scope. Bio results are a good baseline, not SOTA; Hopper’s DPX and WFA-style tiling/bucketing would likely move you a tier up. Hashing and 52-bit MinHash are clever, but you need full SMHasher reports and retrieval-quality metrics, not just entropy/collisions. Publish exact versions, params, and end-to-end timings (I/O + marshaling), plus short-string vs long-string batches. If you add those and rename the headline to reflect the setup, the claims will be hard to poke holes in.
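For anyone unfamiliar with the terms in the comment above, here is a minimal CPU-side sketch of anti-diagonal DP for Levenshtein distance, plus the MCUPS (mega cell updates per second) metric. It is not the article's CUDA implementation; the function names `levenshtein_antidiagonal` and `mcups` are hypothetical and chosen only for illustration. The point is that every cell on one anti-diagonal depends only on the two previous anti-diagonals, so the inner loop could in principle run in parallel.

```python
def levenshtein_antidiagonal(a: str, b: str) -> int:
    """Levenshtein distance, filling the DP matrix one anti-diagonal at a time.

    Illustrative sketch only (hypothetical helper, not the article's kernel).
    Cell (i, j) lies on anti-diagonal d = i + j; buffers are indexed by row i.
    """
    n, m = len(a), len(b)
    prev2 = [0] * (n + 1)  # anti-diagonal d - 2
    prev1 = [0] * (n + 1)  # anti-diagonal d - 1
    cur = [0] * (n + 1)    # anti-diagonal d

    for d in range(n + m + 1):
        lo = max(0, d - m)  # smallest valid row on this anti-diagonal
        hi = min(n, d)      # largest valid row on this anti-diagonal
        # Cells in this loop are independent of each other, which is
        # what makes the anti-diagonal order attractive on a GPU.
        for i in range(lo, hi + 1):
            j = d - i
            if i == 0:
                cur[i] = j  # first row: j insertions
            elif j == 0:
                cur[i] = i  # first column: i deletions
            else:
                cost = 0 if a[i - 1] == b[j - 1] else 1
                cur[i] = min(
                    prev1[i - 1] + 1,     # cell (i-1, j)
                    prev1[i] + 1,         # cell (i, j-1)
                    prev2[i - 1] + cost,  # cell (i-1, j-1)
                )
        # Rotate buffers; the oldest one is reused for the next diagonal.
        prev2, prev1, cur = prev1, cur, prev2

    return prev1[n]  # cell (n, m) sits on the last anti-diagonal


def mcups(len_a: int, len_b: int, seconds: float) -> float:
    """Mega cell updates per second for one len_a x len_b DP matrix."""
    return (len_a * len_b) / seconds / 1e6


if __name__ == "__main__":
    print(levenshtein_antidiagonal("kitten", "sitting"))
    print(mcups(1000, 1000, 0.5))
```

MCUPS normalizes by matrix size, which is why the comment asks for short-string and long-string batches separately: the same MCUPS figure can hide very different per-pair latencies.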