The Accuracy of On-Device LLMs

medium.com

2 points by aazo11 7 months ago · 2 comments

aazo11OP 7 months ago

I tested on-device LLMs (Gemma, DeepSeek) across prompt cleanup, PII redaction, math, and general knowledge on my M2 Max laptop using LM Studio + DSPy.

Some observations:

- Gemma-3 is the best model for on-device inference.
- 1B models look fine at first but break under benchmarking.
- 4B can handle simple rewriting and PII redaction. It also did math reasoning surprisingly well.
- General-knowledge Q&A does not work with a local model. This might work with a RAG pipeline or additional tools.
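For readers who want to reproduce a task like the PII-redaction test, a minimal sketch of the setup looks like the following. It assumes LM Studio's local server is running with its OpenAI-compatible API at the default address; the model name, system prompt, and helper names are my own illustrative choices, not the author's exact harness (the post used DSPy on top of this same endpoint).

```python
# Minimal sketch: call a local model served by LM Studio, which exposes an
# OpenAI-compatible chat API (default base URL: http://localhost:1234/v1).
# Model name and prompts below are illustrative assumptions.
import json
import urllib.request

LM_STUDIO_URL = "http://localhost:1234/v1/chat/completions"

def build_request(task_prompt: str, text: str, model: str = "gemma-3-4b") -> dict:
    """Build an OpenAI-style chat payload for one benchmark item."""
    return {
        "model": model,
        "temperature": 0.0,  # deterministic output makes scoring repeatable
        "messages": [
            {"role": "system", "content": task_prompt},
            {"role": "user", "content": text},
        ],
    }

def redact_pii(text: str) -> str:
    """Send one PII-redaction request (requires LM Studio running locally)."""
    payload = build_request(
        "Redact all names, emails, and phone numbers; "
        "replace each with [REDACTED]. Return only the rewritten text.",
        text,
    )
    req = urllib.request.Request(
        LM_STUDIO_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Example (with the server running):
#   redact_pii("Contact Jane Doe at jane@example.com.")
```

Setting temperature to 0 is the usual choice when benchmarking, since it makes the model's output (mostly) deterministic across runs.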

In the future I plan to train and fine-tune 1B models to see if I can build high-accuracy, task-specific models under 1 GB.

  • billconan 7 months ago

    Hope to see your results with the new Gemma 3n, and benchmarks on multi-modal inputs.
