bozdemir
Karma: 96
Created: 4 years ago

Recent Submissions
1. Q8 KV cache lets a 30B model fit 100K context on a 24 GB RTX 5090 (buraak.com)
   3 points · 1 month ago · 0 comments