Training a new voice for Piper TTS with only a single phrase

calbryant.uk

3 points by naggie 5 months ago · 1 comment

magicalhippo 5 months ago

The author uses Chatterbox TTS's zero-shot voice cloning to generate synthetic training data from a single phrase, runs Whisper STT over each generated sample to catch generation errors, and then fine-tunes Piper TTS on the synthetic dataset in the standard way.
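The verification step could look roughly like the sketch below. It is not the article's code: the helper names, the word-level similarity measure, and the 0.95 threshold are all assumptions made for illustration, and `transcribe` is a hypothetical callable standing in for whatever Whisper wrapper is used (it takes an audio path and returns the recognized text).

```python
import re
from difflib import SequenceMatcher
from typing import Callable, Iterable, List, Tuple


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    text = re.sub(r"[^a-z0-9' ]+", " ", text.lower())
    return " ".join(text.split())


def transcript_matches(expected: str, transcript: str, threshold: float = 0.95) -> bool:
    """Compare the prompt text with the STT transcript at word level.

    The threshold is an assumed value, not one taken from the article.
    """
    a = normalize(expected).split()
    b = normalize(transcript).split()
    return SequenceMatcher(None, a, b).ratio() >= threshold


def filter_samples(
    samples: Iterable[Tuple[str, str]],
    transcribe: Callable[[str], str],  # hypothetical STT wrapper: path -> text
    threshold: float = 0.95,
) -> List[Tuple[str, str]]:
    """Keep only (audio_path, text) pairs whose transcript matches the text."""
    return [
        (path, text)
        for path, text in samples
        if transcript_matches(text, transcribe(path), threshold)
    ]
```

Passing the transcriber in as a plain function keeps the filter independent of any particular STT library, so it can be exercised with a stub before running the real model over the whole synthetic set.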
