Hyparquet.js: World's Smallest and Most Conformant Parquet File Parser

github.com

3 points by platypii a year ago · 1 comment

platypiiOP a year ago

My goal is to build tools that enable working with large-scale ML datasets in the browser. The browser is critical for building compelling UIs, but previous parquet JS libraries had been abandoned.

Apache Parquet is a very complicated format: it has 22 data types, 9 encodings, and 8 compression codecs. Even so, I can confidently say that Hyparquet is now the most conformant parquet parser in existence. It can open all the parquet files, more than PyArrow or DuckDB can. I dare you to find a file that Hyparquet can’t open!

Hyparquet is MIT licensed, and there is a demo page on GitHub that can open parquet files in the browser with no backend server.
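
To give a feel for what reading parquet with no backend involves, here is a minimal sketch of the very first step any parser has to take: checking the `PAR1` magic bytes and reading the footer length from the end of the file. The function name `readParquetFooterLength` is hypothetical and is not part of Hyparquet's API. It only illustrates the file trailer layout and assumes an unencrypted file already loaded into an `ArrayBuffer`.

```javascript
// Hypothetical helper for illustration only (not Hyparquet's API).
// An unencrypted parquet file starts with "PAR1" and ends with:
//   [footer metadata][4-byte little-endian footer length]["PAR1"]
function readParquetFooterLength(arrayBuffer) {
  const MAGIC = 'PAR1'
  const end = arrayBuffer.byteLength
  // Smallest possible layout: leading magic + footer length + trailing magic.
  if (end < 12) {
    throw new Error('too short to be a parquet file')
  }
  const view = new DataView(arrayBuffer)
  // Decode 4 bytes at the given offset as an ASCII string.
  const magicAt = (offset) =>
    String.fromCharCode(
      view.getUint8(offset),
      view.getUint8(offset + 1),
      view.getUint8(offset + 2),
      view.getUint8(offset + 3)
    )
  if (magicAt(0) !== MAGIC || magicAt(end - 4) !== MAGIC) {
    throw new Error('missing PAR1 magic bytes')
  }
  // The footer length sits just before the trailing magic, little-endian.
  const footerLength = view.getUint32(end - 8, true)
  if (footerLength + 12 > end) {
    throw new Error('footer length exceeds file size')
  }
  return footerLength
}
```

In a browser, the `ArrayBuffer` could come from `await (await fetch(url)).arrayBuffer()` or from a file the user drops onto the page. The footer itself is Thrift-encoded metadata, and decoding it, plus all the data types, encodings, and codecs, is where the real complexity starts.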