Show HN: Fastest(?) SIMD CSV Parser in Rust

github.com

1 points by juliusgeo 2 months ago · 0 comments · 1 min read

There are already quite a few [0][1] CSV parsers that use SIMD, some in Rust, with a variety of approaches. I found simd-csv[1] to have a very interesting approach: it leverages memchr to essentially "seek" to the next delimiter, which removes much of the overhead a byte-by-byte CSV parser would have. However, as noted in the README, the creators of simd-csv explicitly chose not to use the classic pclmulqdq trick[2] that other libraries like simdjson use, due to portability concerns. I set out to beat simd-csv's implementation by building a parser more similar to Geoff Langdale's[0], using the pclmulqdq trick as well as optimized intrinsic usage for aarch64 platforms[3]. If anyone has feedback on the Rust code, or my usage of intrinsics, I would greatly appreciate it.
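For anyone unfamiliar with the pclmulqdq trick, here is a minimal sketch of the idea, not the code from the repository. A carry-less multiply of the quote bitmask by an all-ones constant yields a prefix XOR, which marks every byte that sits inside a quoted field. To keep the sketch portable, a shift-and-XOR ladder is swapped in for the pclmulqdq instruction, and the byte masks are built with a scalar loop where a real parser would use SIMD compares. All function names here (`prefix_xor`, `byte_mask`, `unquoted_commas`, `pad_chunk`) are hypothetical.

```rust
/// Prefix XOR: bit i of the result is the XOR of bits 0..=i of `x`.
/// A carry-less multiply by an all-ones constant produces the same value
/// in its low 64 bits; this ladder is a portable stand-in for it.
fn prefix_xor(mut x: u64) -> u64 {
    x ^= x << 1;
    x ^= x << 2;
    x ^= x << 4;
    x ^= x << 8;
    x ^= x << 16;
    x ^= x << 32;
    x
}

/// Bitmask with bit i set where chunk[i] == needle.
/// Scalar for clarity; a SIMD parser would use vector compares here.
fn byte_mask(chunk: &[u8; 64], needle: u8) -> u64 {
    chunk
        .iter()
        .enumerate()
        .fold(0u64, |mask, (i, &b)| mask | (((b == needle) as u64) << i))
}

/// Returns a mask of the commas in `chunk` that lie outside quoted fields.
/// `carry` is all ones if the previous chunk ended inside a quoted field,
/// zero otherwise, and is updated for the next chunk.
fn unquoted_commas(chunk: &[u8; 64], carry: &mut u64) -> u64 {
    let quotes = byte_mask(chunk, b'"');
    // Set from each opening quote (inclusive) to its closing quote (exclusive).
    let in_quotes = prefix_xor(quotes) ^ *carry;
    // Broadcast the top bit: all ones if this chunk ends inside quotes.
    *carry = ((in_quotes as i64) >> 63) as u64;
    byte_mask(chunk, b',') & !in_quotes
}

/// Helper for experimenting: copies `s` into a space-padded 64-byte chunk.
/// Panics if `s` is longer than 64 bytes.
fn pad_chunk(s: &str) -> [u8; 64] {
    let mut chunk = [b' '; 64];
    chunk[..s.len()].copy_from_slice(s.as_bytes());
    chunk
}
```

Calling `unquoted_commas(&pad_chunk("a,\"b,c\",d"), &mut carry)` with `carry` starting at 0 sets bits 1 and 7 only; the comma at index 4 is masked out because it is inside the quoted field. Doubled quotes (`""`) toggle the region twice, so they need no special handling for delimiter detection.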

[0] https://github.com/geofflangdale/simdcsv
[1] https://github.com/medialab/simd-csv
[2] https://branchfree.org/2019/03/06/code-fragment-finding-quot...
[3] https://developer.arm.com/community/arm-community-blogs/b/se...
