Ask HN: What kind of local on-device AI do you find useful?
Something that fits in 12GB of VRAM or less.

I've been making a point'n'click game recently and generating the art with FLUX.1 Dev and FLUX.1 Kontext locally on a Mac Mini M1 with 8GB of RAM. It isn't quick (20+ minutes per image), but once I had the settings dialled in for the style I want, it works really well. (A minimal text-to-image sketch is included at the end of the thread.)

Very neat use! Do you have anything public currently? Curious to see how the images look. Or, if you can't share at the moment, what's the art style you're going for?

I have an RTX 3060 with 12GB of VRAM. For simpler questions like "how do I change the modified date of a file in Linux", I use Qwen 14B Q4_K_M; it fits entirely in VRAM. If 14B doesn't answer correctly, I switch to Qwen 32B Q3_K_S, which is slower because it has to spill into system RAM. I haven't yet tried the 30B-A3B variant, which I hear is faster and close to 32B in quality. BTW, I run these models with llama.cpp (a sketch of querying a local llama.cpp server follows the thread). For image generation, Flux and Qwen Image work with ComfyUI. I also use Nunchaku, which improves speed considerably.

I can't speak to the 12GB limit, because I have 36GB of memory on Apple Silicon and mostly use models that need around 32GB, but I will say that people underestimate the abilities of ~7B models for many tasks.

What ~7B models would you recommend investigating?

Start with the 7B and 8B models popular on Ollama's listings:

https://ollama.com/search?q=7b
https://ollama.com/search?q=8b

It's not showing up high on those lists for some reason, but I quite enjoy the deepseek-r1 8B variant: https://ollama.com/library/deepseek-r1 (a short usage sketch follows the thread). YMMV.

Auto-summarization and OCR. It would probably be mildly good at spelling/grammar correction too, but I'd need a tight integration, not just a chat program.

OCRing and labeling my photo library (a sketch of that idea is at the end of the thread).

Could you elaborate a bit on your tools/workflow? Thanks.
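
For the FLUX.1 Dev comment above, here is a minimal sketch of a text-to-image call using Hugging Face diffusers' FluxPipeline. The prompt, resolution, and step count are illustrative, not the commenter's actual settings, and a machine with only 8GB of RAM would realistically need a quantized checkpoint or CPU offloading rather than the plain bf16 load shown here:

    import torch
    from diffusers import FluxPipeline

    # Plain bf16 load; on a low-memory machine you'd want a quantized
    # variant or pipe.enable_model_cpu_offload() instead.
    pipe = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
    )
    pipe.to("mps")  # Apple Silicon GPU; use "cuda" on an NVIDIA card

    image = pipe(
        "hand-painted point-and-click adventure background, moody lighting",
        num_inference_steps=28,  # illustrative settings
        guidance_scale=3.5,
        width=1024,
        height=768,
    ).images[0]
    image.save("scene.png")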
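
A sketch of the llama.cpp workflow mentioned above: llama-server exposes an OpenAI-compatible HTTP endpoint (port 8080 by default), so a quantized Qwen GGUF can be queried from a few lines of Python. The model filename below is illustrative:

    import requests

    # Start the server first, e.g.:
    #   llama-server -m qwen-14b-q4_k_m.gguf -ngl 99
    # (-ngl 99 offloads all layers to the GPU; filename is illustrative)
    resp = requests.post(
        "http://localhost:8080/v1/chat/completions",
        json={
            "messages": [{
                "role": "user",
                "content": "How do I change the modified date of a file in Linux?",
            }],
            "temperature": 0.2,
        },
        timeout=300,
    )
    print(resp.json()["choices"][0]["message"]["content"])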
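
For the Ollama suggestions, the official ollama Python package wraps the local daemon. This sketch assumes Ollama is running and the model has already been pulled with `ollama pull deepseek-r1:8b`; the prompt is illustrative:

    import ollama  # pip install ollama

    reply = ollama.chat(
        model="deepseek-r1:8b",
        messages=[{
            "role": "user",
            "content": "Explain tail-call optimization in two sentences.",
        }],
    )
    # Recent library versions also allow attribute access:
    # reply.message.content
    print(reply["message"]["content"])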
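
And a sketch of the photo-library OCR/labeling idea, using a local vision model through Ollama (llava here as one commonly available option; the directory path, prompt, and sidecar-file scheme are all assumptions, not the commenter's actual workflow):

    from pathlib import Path
    import ollama  # assumes `ollama pull llava` has been run

    photo_dir = Path("~/Pictures/library").expanduser()  # illustrative path
    for photo in sorted(photo_dir.glob("*.jpg")):
        reply = ollama.chat(
            model="llava",
            messages=[{
                "role": "user",
                "content": "Transcribe any visible text, then give 3-5 short descriptive tags.",
                "images": [str(photo)],
            }],
        )
        # Write the transcription/tags as a sidecar file next to the image
        photo.with_suffix(".txt").write_text(reply["message"]["content"])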