Show HN: I fine-tuned Qwen 3.5 (0.8B–4B) on a Mac for text-to-SQL – 2B beats 12B

github.com

4 points by sciences44 2 days ago · 2 comments

sciences44 (OP) 2 days ago

I wanted to test the new Qwen 3.5 Small models (released March 2) on a structured-output task. I fine-tuned the 0.8B, 2B, and 4B variants on text-to-SQL using LoRA on a Mac (64 GB, MLX), with Mistral-Nemo 12B as a baseline.
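For reference, a run like this can be launched with mlx-lm's built-in LoRA trainer. The model ID, data path, and hyperparameters below are illustrative placeholders, not the exact settings from the repo:

```shell
# LoRA fine-tune on Apple Silicon via mlx-lm (pip install mlx-lm).
# --data expects a directory containing train.jsonl / valid.jsonl.
python -m mlx_lm.lora \
  --model <qwen-model-id-or-local-path> \
  --train \
  --data ./data/text_to_sql \
  --batch-size 4 \
  --iters 1000 \
  --adapter-path ./adapters/qwen-sql
```

The trained adapter can then be loaded alongside the base model for evaluation.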

The 2B beat the 12B by 19 percentage points (50% vs 31% semantic accuracy). Why? The larger model seems "too smart": it computes the answer in its head and outputs "42" instead of writing SQL. 81% of the 12B's errors were plain numbers rather than queries.
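To make "semantic accuracy" and the plain-number failure mode concrete, here is a minimal sketch (my own illustration, not the repo's eval code): two queries count as semantically equal if they return the same rows when executed, and a bare-number regex flags the 12B's dominant error class.

```python
import re
import sqlite3


def is_plain_number(output: str) -> bool:
    # The failure mode described above: the model "answers" the question
    # with a bare number instead of producing SQL.
    return re.fullmatch(r"-?\d+(\.\d+)?", output.strip()) is not None


def semantically_equal(con: sqlite3.Connection, gold_sql: str, pred_sql: str) -> bool:
    # Semantic accuracy: the predicted query is correct if it returns the
    # same multiset of rows as the gold query, however the SQL is phrased.
    gold = sorted(con.execute(gold_sql).fetchall())
    try:
        pred = sorted(con.execute(pred_sql).fetchall())
    except sqlite3.Error:  # model emitted invalid SQL
        return False
    return gold == pred


# Demo on a toy schema (table and column names are made up):
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE users (id INTEGER, age INTEGER)")
con.executemany("INSERT INTO users VALUES (?, ?)", [(1, 30), (2, 35)])

print(semantically_equal(con,
                         "SELECT COUNT(*) FROM users WHERE age > 25",
                         "SELECT COUNT(id) FROM users WHERE age >= 26"))  # True
print(is_plain_number("42"))                           # True
print(is_plain_number("SELECT COUNT(*) FROM users"))   # False
```

Execution-based comparison is more forgiving than string matching, since many differently written queries are equivalent.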

Everything runs locally with zero cloud compute. The repo has the scripts, data, and full results needed to reproduce it.
