Bringing GPU acceleration to Polars DataFrames in the near future
pola.rs
Very cool! I hope the Rust crate gets to take advantage of this. It might just be me, but I feel like the Python package gets all the love from the Polars devs.
I believe they'll likely implement it in Rust as an alternative engine alongside the two existing ones (default and streaming).
This is pretty exciting. I haven't used Polars much, but GPU support could be what pulls me over to it. I do hope Polars doesn't build with only Nvidia in mind; hopefully they'll expand to other architectures over time as well.
I would have liked to see a more generic implementation that isn't necessarily tied to NVIDIA. While I agree that's a much bigger ask than a 7-person team can probably take on, there's a whole cohort of ML and data science folks working on, say, macOS who are missing out on this optimization.
Interesting, but RAPIDS only works on:
- Ubuntu 20.04/22.04 or CentOS 7 / Rocky Linux 8 with gcc/++ 9.0+
- Windows 11 using a WSL2 specific install
- RHEL 7/8 support is provided through CentOS 7 / Rocky Linux 8 builds/installs
I'm using it on openSUSE and it seems to work fine (I doubt it would be hard to make it work on other unmentioned distros as well). macOS is obviously irrelevant. Nvidia does release drivers for FreeBSD and Solaris, but how many people actually do DS/ML on those?
[x] Windows 11 with WSL2, excellent.
..or merely tested and supported on?
Or that it just can't work with anything else?
That's amazing. Hard to believe that Polars is only a 7-person company.
It would be amazing if the code for working with Arrow on GPUs could be made open source -- I think that would drive a significant amount of adoption.
That's definitely one way to leverage the GPU VRAM inflation driven by LLM training.
I'm fairly confident that most of the hardware you see available today (for consumers) wasn't specifically designed with LLMs in mind.
Sure, the 8GB VRAM gaming GPUs aren't designed for LLMs (and would see effectively zero benefit from the data throughput of GPU-accelerated data frames compared to typical approaches), but the 80GB A100 server GPUs definitely are.
> but the 80GB A100 server GPUs definitely are
I'm sure LLMs were considered, like many other ML use cases, but that the A100 was intended for LLMs? I'm not so sure about that.
The A100 was released the same year as GPT-3, and it wasn't until GPT-3 went live that people really started paying attention. And I'm sure designing and producing a GPU takes longer than a couple of months.
Very interesting development :)
Nice. RAPIDS with pandas is pretty fast, so I'm looking forward to what's ahead.
#notAllPlural's
In this case it's because they are Dutch people writing in English.
English education in the Netherlands may not place a strong emphasis on the correct use of apostrophes and plural forms, as these are often considered minor details compared to other aspects of the language, such as vocabulary and grammar. In Dutch, apostrophes are not used to indicate possession or contractions, and plurals are formed by adding -en or -s without an apostrophe. This difference can lead Dutch speakers to incorrectly apply their native-language rules when writing in English.
In most cases, Dutch plurals are formed by adding -en or -s to the end of the noun without an apostrophe, e.g., "auto's" (cars), "huizen" (houses), or "boeken" (books).
However, apostrophes are used in Dutch plural forms in specific cases:
- Abbreviations: "cd's" (CDs), "tv's" (TVs), "pk's" (horsepower)
- Letters: "a's en b's" (a's and b's), "x'en" (x's)
- Some loanwords: "pizza's" (pizzas), "taxi's" (taxis), "baby's" (babies)
In some cases, the apostrophe is used to avoid confusion or improve readability, especially when the plural form might otherwise be misread, e.g., "foto's" (photos) instead of "fotos", which could be read as a different word.
GPUses