GitHub - juliusgeo/csimdv-rs: SIMD accelerated CSV parsing in Rust


An alternate approach to SIMD CSV parsing, heavily inspired by: https://github.com/medialab/simd-csv

Differences

simd-csv is a fantastic library. However, as noted in its README, it does not use the "pclmulqdq" trick that many other SIMD-based parsers do (most notably simdjson). To be fair, this trick does not work on all targets, which is the stated reason that simd-csv does not use it. I wanted to see whether I could make a version of simd-csv that did use this trick, and how much of a performance boost it would give.

Similarities

To make the comparison as fair as possible, I use an API that is very similar to simd-csv's ZeroCopyReader, which does not perform any validation or escaping of the raw CSV data. The speed comparison should therefore give a good sense of the speed of the parsing itself, irrespective of validation, string escaping, or iterator overhead.

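use std::fs::File;

// `Parser`, `default_dialect`, and `AlignedBuffer` come from this crate;
// `path` is the location of the CSV file to read.
let file = File::open(path).unwrap();
let mut p = Parser::new(default_dialect(), AlignedBuffer::new(file));
// Fields are yielded raw: no validation or escaping is applied.
while let Some(mut record) = p.read_line() {
    for field in record.iter() {
        // Touch each field without processing it further.
        let _ = field.len();
    }
}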

Performance

On AArch64, the table lookup approach used by simdjson is used because it saves one extra comparison between the data and the carriage-return character, and the comparisons are quite slow. On x86, directly comparing the input data against the 4 characters of interest is faster. I initially implemented this using portable_simd, but it results in suboptimal code generation, especially on AArch64, where there is no equivalent to the x86 movemask instruction. I worked around that by loading the data interleaved into NEON vectors, which allows the use of some more efficient bitmask generation techniques.
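
To make the two classification strategies concrete, here is a scalar sketch of both, written for illustration rather than taken from this repository. It assumes the 4 characters of interest are the comma, the double quote, the line feed, and the carriage return. The table variant mirrors the nibble-lookup idea from simdjson: one 16-entry table is indexed by the low nibble of the byte, another by the high nibble, and the AND of the two entries is non-zero only for the characters of interest. In a real NEON implementation each lookup would be a vector table instruction over 16 bytes at once; the names and constants below are hypothetical.

```rust
// One bit per character class.
const COMMA: u8 = 1;
const QUOTE: u8 = 2;
const LF: u8 = 4;
const CR: u8 = 8;

/// Direct comparison: one equality test per character of interest.
fn classify_compare(b: u8) -> u8 {
    let mut class = 0;
    if b == b',' { class |= COMMA; }
    if b == b'"' { class |= QUOTE; }
    if b == b'\n' { class |= LF; }
    if b == b'\r' { class |= CR; }
    class
}

// Indexed by the low nibble: '"' = 0x22, '\n' = 0x0A, ',' = 0x2C, '\r' = 0x0D.
const LO: [u8; 16] = [0, 0, QUOTE, 0, 0, 0, 0, 0, 0, 0, LF, 0, COMMA, CR, 0, 0];
// Indexed by the high nibble: 0x0_ holds LF and CR, 0x2_ holds ',' and '"'.
const HI: [u8; 16] = [LF | CR, 0, COMMA | QUOTE, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0];

/// Table lookup: two lookups and an AND replace the four comparisons.
fn classify_lookup(b: u8) -> u8 {
    LO[(b & 0x0F) as usize] & HI[(b >> 4) as usize]
}

/// Turns per-byte classes into four 64-bit masks, ordered
/// [comma, quote, line feed, carriage return]; bit i refers to chunk[i].
fn class_masks(chunk: &[u8; 64], classify: fn(u8) -> u8) -> [u64; 4] {
    let mut masks = [0u64; 4];
    for (i, &b) in chunk.iter().enumerate() {
        let class = classify(b);
        for (k, mask) in masks.iter_mut().enumerate() {
            *mask |= (((class >> k) & 1) as u64) << i;
        }
    }
    masks
}
```

Both classifiers agree on every byte value, so the choice between them is purely a question of which instruction sequence is cheaper on the target.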

The following benchmark results were all calculated using criterion-rs with a flat sampling mode with a sampling time of 100s.

aarch64 NEON

| File | csimdv | simd-csv | % Change |
| --- | --- | --- | --- |
| EDW.TEST_CAL_DT.csv | 2.0234 GiB/s | 2.1657 GiB/s | -6.5 |
| nfl.csv | 2.4118 GiB/s | 1.8498 GiB/s | 30.4 |
| customers-2000000.csv (not committable, too large) | 2.4165 GiB/s | 1.7753 GiB/s | 36.1 |

Ran on an Apple M1 Max with 64GB of RAM.

x86_64 AVX-512

| File | csimdv | simd-csv | % Change |
| --- | --- | --- | --- |
| EDW.TEST_CAL_DT.csv | 1.6645 GiB/s | 1.9766 GiB/s | -15.8 |
| nfl.csv | 2.5073 GiB/s | 2.0066 GiB/s | 24.9 |
| customers-2000000.csv (not committable, too large) | 3.6405 GiB/s | 1.6402 GiB/s | 121.9 |

x86_64 AVX-2

| File | csimdv | simd-csv | % Change |
| --- | --- | --- | --- |
| EDW.TEST_CAL_DT.csv | 1.7015 GiB/s | 2.0572 GiB/s | -17.3 |
| nfl.csv | 2.5413 GiB/s | 2.0658 GiB/s | 23.0 |
| customers-2000000.csv (not committable, too large) | 3.6090 GiB/s | 1.6854 GiB/s | 114.1 |

Both x86_64 benchmarks were run on an AMD Ryzen 7 9800X3D with 32GB of RAM. The AVX-2 results were obtained with RUSTFLAGS="-C target-cpu=native -C target-feature=-avx512f".
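
For reading the tables, the "% Change" column appears to be the relative throughput difference of csimdv against simd-csv, so a positive value means csimdv is faster. That interpretation is an assumption inferred from the published numbers, and `percent_change` is a hypothetical helper rather than part of the benchmark code:

```rust
/// Relative difference of `csimdv` throughput versus `simd_csv`, in percent.
/// Both arguments must use the same unit (GiB/s in the tables).
fn percent_change(csimdv: f64, simd_csv: f64) -> f64 {
    (csimdv - simd_csv) / simd_csv * 100.0
}
```

For example, the nfl.csv row on NEON gives `percent_change(2.4118, 1.8498)`, which is roughly 30.4.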