Bringing GPU acceleration to Polars DataFrames in the near future
pola.rs
Very cool! I hope the Rust crate gets to take advantage of this. It might just be me, but I feel like the Python package gets all the love from the Polars devs.
I believe they'll likely implement it in Rust as an alternative engine alongside the two existing ones (default and streaming).
This is pretty exciting. I haven't used Polars much, but GPU support could be what pulls me over to it. I do hope Polars doesn't build with only Nvidia in mind; hopefully they'll expand to other architectures over time as well.
I would have liked to see a more generic implementation that isn't necessarily tied to NVIDIA. While I agree that's a much bigger ask than a 7-person team can probably take on, there's a whole cohort of ML and data science folks working on, say, macOS who are missing out on this optimization.
Interesting, but RAPIDS only works on:
- Ubuntu 20.04/22.04 or CentOS 7 / Rocky Linux 8 with gcc/++ 9.0+
- Windows 11 using a WSL2 specific install
- RHEL 7/8 support is provided through CentOS 7 / Rocky Linux 8 builds/installs
I'm using it on openSUSE and it seems to work fine (I doubt it would be hard to make it work on other unmentioned distros as well). macOS is obviously irrelevant. Nvidia does release drivers for FreeBSD and Solaris, but how many people actually do DS/ML on those?
[x] Windows 11 with WSL2, excellent.
..or merely tested and supported on?
Or that it just can't work with anything else?
That's amazing. Hard to believe that Polars is only a 7-person company.
It would be amazing if the code for working with Arrow on GPUs could be made open source -- I think that would drive a significant amount of adoption.
That's definitely one way to leverage the GPU VRAM inflation driven by LLM training.
I'm fairly confident that most of the hardware you see available today (for consumers) wasn't specifically designed with LLMs in mind.
Sure, the 8GB VRAM gaming GPUs aren't designed for LLMs (and would see effectively zero benefit from the data throughput of GPU-accelerated data frames compared to typical approaches), but the 80GB A100 server GPUs definitely are.
> but the 80GB A100 server GPUs definitely are
I'm sure LLMs were considered, like many other ML use cases, but that the A100 was intended for LLMs? I'm not so sure about that.
The A100 was released the same year as GPT-3, and it wasn't until GPT-3 went live that people really started paying attention. And I'm sure designing and producing a GPU takes longer than a couple of months.
Very interesting development :)
Nice. RAPIDS with pandas is pretty fast, so I'm looking forward to what's ahead.
#notAllPlural's
In this case it's because they are Dutch people writing in English.
English education in the Netherlands may not place a strong emphasis on the correct use of apostrophes and plural forms, as these are often considered minor details compared to other aspects of the language, such as vocabulary and grammar. In Dutch, apostrophes are not used to indicate possession or contractions, and plurals are formed by adding -en or -s without an apostrophe. This difference can lead Dutch speakers to incorrectly apply their native-language rules when writing in English.
In most cases, Dutch plurals are formed by adding -en or -s to the end of the noun without an apostrophe, e.g., "auto's" (cars), "huizen" (houses), or "boeken" (books).
However, apostrophes are used in Dutch plural forms in specific cases:
- Abbreviations: "cd's" (CDs), "tv's" (TVs), "pk's" (horsepower)
- Letters: "a's en b's" (a's and b's), "x'en" (x's)
- Some loanwords: "pizza's" (pizzas), "taxi's" (taxis), "baby's" (babies)
In some cases, the apostrophe is used to avoid confusion or improve readability, especially when the plural form might otherwise be misread, e.g., "foto's" (photos) instead of "fotos", which could be read as a different word.
GPUses