Show HN: Parkiet – Fine-tune a large TTS model for any language under $100
github.com
A lot of the open-source text-to-speech (TTS) models are released for English or Chinese and lack support for other languages. I was curious to see if I could train a state-of-the-art TTS model for Dutch using Google's free TPU Research credits. The results are fantastic and on par with ElevenLabs, with just 10,000 hours of data.
I open-sourced the weights and documented the whole journey: Torch model conversion, data preparation, JAX training code, and the inference pipeline. I spent about $300 in egress costs, but training this model can cost as little as $100 (I ran both the data collection pipeline and the Whisper fine-tuning on my 5090 desktop PC).
Hopefully it can serve as a guide for others who are curious about training these models for other languages (without burning through all the credits trying to fix the pipeline).