SynthVision: Building a 110K Synthetic Medical VQA Dataset

3 points by maziyar 3 months ago · 1 comment

maziyar (OP) 3 months ago

We annotated 119K medical images with two frontier VLMs (Qwen 3.5 and Kimi K2.5), cross-validated their outputs at 93% agreement, and produced 110K training records, all for under $500. Fine-tuning three small models (2-3B parameters) improved results on every benchmark, with the best model gaining +15.0% average exact match. Everything is open-sourced: datasets, adapters, and code.
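
For readers wondering what a two-model cross-validation step can look like, here is a minimal sketch. It is not the project's actual code: the `Annotation` record, the `normalize` rule, and the choice of normalized exact match as the agreement criterion are all assumptions made for illustration, since the post does not say how agreement was measured.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    # Hypothetical record layout; the real dataset schema may differ.
    image_id: str
    question: str
    answer: str


def normalize(answer: str) -> str:
    """Lowercase, collapse whitespace, and drop trailing periods."""
    return " ".join(answer.lower().split()).rstrip(".")


def cross_validate(
    model_a: list[Annotation], model_b: list[Annotation]
) -> tuple[list[Annotation], float]:
    """Keep model_a records whose answer matches model_b for the same
    (image_id, question) pair, and report the agreement rate over the
    pairs both models answered."""
    b_by_key = {(r.image_id, r.question): r for r in model_b}
    agreed: list[Annotation] = []
    compared = 0
    for rec in model_a:
        other = b_by_key.get((rec.image_id, rec.question))
        if other is None:
            continue  # only one model answered; excluded from the rate
        compared += 1
        if normalize(rec.answer) == normalize(other.answer):
            agreed.append(rec)
    rate = len(agreed) / compared if compared else 0.0
    return agreed, rate
```

Exact match after normalization is a strict criterion; free-text answers would likely need a softer comparison (for example, a judge model or token overlap), which this sketch does not attempt.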
