Latvian model for F5-TTS

For usage instructions, see the F5-TTS repo and use F5-TTS Base as the base model.

Sample audio for the voice reference should be about 6-12 seconds long. Trim longer clips yourself; otherwise they are clipped automatically, possibly in the middle of a word. If the generated audio contains errors, check that the automatic transcript of the reference audio is correct and adjust it if needed.
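The length guideline above can be checked before a clip is passed to F5-TTS. The following is a minimal sketch, assuming the reference audio is an uncompressed PCM WAV file. The helper names (`clip_duration_seconds`, `check_reference_clip`, `trim_clip`) are hypothetical and not part of F5-TTS, and the 6 and 12 second limits are simply the approximate range recommended above.

```python
import wave

# Approximate range recommended above; not enforced by F5-TTS itself.
MIN_SECONDS = 6.0
MAX_SECONDS = 12.0


def clip_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def check_reference_clip(path):
    """Classify a clip as 'too short', 'too long', or 'ok'."""
    duration = clip_duration_seconds(path)
    if duration < MIN_SECONDS:
        return "too short"
    if duration > MAX_SECONDS:
        return "too long"
    return "ok"


def trim_clip(src, dst, end_seconds):
    """Copy the first end_seconds of src into dst.

    The cut point is chosen by the caller, so it can be placed
    in a pause between words rather than mid-word.
    """
    with wave.open(src, "rb") as wav:
        params = wav.getparams()
        frames = wav.readframes(int(end_seconds * wav.getframerate()))
    with wave.open(dst, "wb") as out:
        out.setparams(params)
        out.writeframes(frames)
```

For example, if `check_reference_clip("reference.wav")` returns `"too long"`, listen to the clip, pick a pause between words, and call `trim_clip("reference.wav", "reference_trimmed.wav", 10.5)` with that timestamp (the file names and the 10.5 second cut point are placeholders).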

Known issues:

  • Handling of the "o" character. "O" is sometimes pronounced as ⟨o⟩, as in "omīte", and sometimes as ⟨uo⟩, as in "ola".

The model is trained on the Mozilla Common Voice 24.0 dataset.
