Show HN: KV-psi, using Linux PSI to trim an LLM KV cache

github.com

8 points by infiniteregrets 2 days ago · 0 comments · 1 min read

I thought it'd be interesting to have an LLM runtime use Linux PSI (Pressure Stall Information) to decide when to trim its KV cache. This is mainly useful imo on edge devices with unified memory, like the Jetson Orin Nano Super developer kit, where the model and everything else on the system compete for the same RAM. I haven't benchmarked it much yet, but I plan to do more over time and see if I can make real use of it as I run local LLMs. Let me know if it makes sense :P (I of course vibed this idea)
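
To make the idea concrete, here is a minimal sketch of the general approach, not the code from the repo. It parses the kernel's `/proc/pressure/memory` output and shrinks a stand-in cache when the `some avg10` stall percentage crosses a threshold. The function names, the 10% threshold, the halving policy, and the 256-token floor are all assumptions made up for illustration; a real runtime would evict KV blocks through its own API.

```python
from pathlib import Path

# Kernel interface for memory pressure (needs a kernel built with CONFIG_PSI).
PSI_MEMORY_PATH = Path("/proc/pressure/memory")


def parse_psi(text):
    """Parse PSI text into {"some": {"avg10": ..., ...}, "full": {...}}."""
    result = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        result[kind] = {
            key: float(value)
            for key, value in (field.split("=", 1) for field in fields)
        }
    return result


def read_memory_pressure():
    """Return the 10-second 'some' stall percentage for memory."""
    return parse_psi(PSI_MEMORY_PATH.read_text())["some"]["avg10"]


def target_cache_tokens(current_tokens, some_avg10, threshold=10.0, min_tokens=256):
    """Hypothetical policy: halve the cache while pressure is above threshold."""
    if some_avg10 < threshold:
        return current_tokens
    return max(min_tokens, current_tokens // 2)


def trim_kv_cache(cache, keep):
    """Keep the most recent `keep` entries.

    `cache` is a plain list standing in for per-token KV entries.
    """
    if keep >= len(cache):
        return cache
    return cache[len(cache) - keep:]
```

In a decode loop this could run every few hundred tokens: read the pressure, compute the target size, and trim only if the target is smaller than the current cache. Dropping the oldest tokens is just the simplest policy to show; it costs context, so the threshold and eviction strategy are the parts worth benchmarking.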
