Scaling Rich Style-Prompted Text-to-Speech Datasets
Paper • 2503.04713
The ParaSpeechCaps dataset and models trained on it
ParaSpeechCaps (Paralinguistic Speech Captions) is a large-scale dataset that annotates speech utterances with rich style captions.
Parler-TTS-Mini-v1, a style-prompted TTS model, fine-tuned on ParaSpeechCaps.
Parler-TTS-Mini-v1, a style-prompted TTS model, fine-tuned on ParaSpeechCaps-Base.
Gradio Demo for ParaSpeechCaps