Don't have a $5k MacBook to run LLAMA65B? MiniLLM runs LLMs on GPUs in <500 LOC

github.com

3 points by volodia 3 years ago · 2 comments

tempaccount420 3 years ago

Doesn't this use as much VRAM as llama.cpp (with int4 models) uses RAM? RAM is a lot cheaper than VRAM.

  • volodiaOP 3 years ago

    It won't run as fast on your CPU as it will on a GPU. Also, it might eat up most of your RAM; it's better to offload to a cheap GPU.
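
A rough back-of-the-envelope sketch of the weight memory behind this trade-off (Python; the 65B parameter count is taken from the title, and activations and the KV cache are ignored):

    def weight_memory_gib(n_params: float, bits_per_weight: int) -> float:
        """Approximate weight storage in GiB for a model with n_params parameters."""
        return n_params * bits_per_weight / 8 / 1024**3

    # 65B weights at common precisions: ~121 GiB (fp16), ~61 GiB (int8), ~30 GiB (int4)
    for label, bits in (("fp16", 16), ("int8", 8), ("int4", 4)):
        print(f"65B params @ {label}: ~{weight_memory_gib(65e9, bits):.0f} GiB")

At int4 the weights fit in plentiful system RAM far more readily than in a single consumer GPU's VRAM, which is the cost-versus-speed trade-off the two comments are weighing.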
