Ask HN: Pandas to Polars migration, from 200s timeouts to under 4s. Anyone else?

4 points by Franco-m 16 days ago · 1 comment · 1 min read


I built a small project that cleans CSV files for ML and data analysis. With pandas in the backend, cleaning a 65 MB file took over 200 seconds and frequently timed out; parsing the upload alone took over 120 seconds. After migrating to Polars and caching the parsed DataFrame in memory between the upload and cleaning endpoints, the same file now cleans in under 4 seconds, and smaller files feel nearly instant. Has anyone else made this migration in a production app? I'm curious about the edge cases you hit, especially with type inference, null handling, and lazy vs. eager evaluation.
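To make the caching part concrete, here is a minimal sketch of the "parse once at upload, reuse at clean" pattern. It is not the project's actual code: the class name `ParsedUploadCache`, the helper functions, and the eviction limit are all hypothetical, and the parser is a standard-library stand-in so the example runs without extra dependencies. In a Polars backend the parser could presumably be `pl.read_csv(io.BytesIO(raw))`, with the cached value being the resulting DataFrame.

```python
import csv
import io
import uuid
from collections import OrderedDict
from typing import Any, Callable


class ParsedUploadCache:
    """Tiny in-memory LRU cache mapping an upload id to an already-parsed table."""

    def __init__(self, parse: Callable[[bytes], Any], max_items: int = 8) -> None:
        self._parse = parse
        self._max_items = max_items
        self._items: "OrderedDict[str, Any]" = OrderedDict()
        self.parse_calls = 0  # how many times the expensive parse actually ran

    def put(self, raw: bytes) -> str:
        """For a (hypothetical) upload endpoint: parse once and return an id."""
        upload_id = uuid.uuid4().hex
        self._items[upload_id] = self._parse(raw)
        self.parse_calls += 1
        while len(self._items) > self._max_items:
            self._items.popitem(last=False)  # evict the least recently used entry
        return upload_id

    def get(self, upload_id: str) -> Any:
        """For a (hypothetical) cleaning endpoint: reuse the parsed table."""
        table = self._items[upload_id]  # KeyError means unknown or evicted id
        self._items.move_to_end(upload_id)  # mark as most recently used
        return table


def parse_csv(raw: bytes) -> list:
    # Stand-in parser (assumption): standard library only. With Polars this
    # could be pl.read_csv(io.BytesIO(raw)) instead.
    return list(csv.DictReader(io.StringIO(raw.decode("utf-8"))))


def drop_rows_with_blanks(rows: list) -> list:
    # Example cleaning step: keep only rows where every field is non-empty text.
    return [r for r in rows if all(isinstance(v, str) and v.strip() for v in r.values())]


if __name__ == "__main__":
    cache = ParsedUploadCache(parse_csv)
    uid = cache.put(b"a,b\n1,2\n3,\n4,5\n")      # upload: the only parse
    cleaned = drop_rows_with_blanks(cache.get(uid))  # clean: no re-parse
    print(cleaned, cache.parse_calls)
```

Two caveats with this design: the cache lives in one process, so it does not help if upload and cleaning requests land on different workers, and the size limit matters because each entry holds a full parsed file in memory.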

ShawnCCS 14 days ago

Next time, you can try chDB (the in-process version of ClickHouse). It is 100% pandas compatible, with the OLAP power of ClickHouse behind it. Official link: https://clickhouse.com/chdb

Disclosure: I work for ClickHouse.
