Show HN: Distilled 0.6B text-to-SQL model

github.com

5 points by maciejgryka 16 days ago · 0 comments · 1 min read

We used our platform to fine-tune a tiny text-to-SQL model using distillation from DeepSeek V3. Repo has instructions for how to replicate this.

This is definitely not the best-performing model of its kind out there! But I was surprised by how much we got out of it: a stone's throw away from a teacher 1000x its size!

We also ran the same thing with the 4B Qwen and matched the teacher's accuracy, though there the size difference is merely 100x :)

I find this pretty cool - obviously our distilled models can only do this one task and don't generalize, but that's often exactly what you want when you're building agentic systems.

Happy to answer any questions!
