Languages other than English / French in premade voices
#36
by markwitt1 - opened
I can only find English or French prebuilt voices.
If I use these voices with German text, the output has a heavy accent.
Sure, I can clone my own voice in German and it works, but the quality will never match a first-party prebuilt voice.
Any recommendations here? Any chance we will get prebuilt voices in other languages?
We're working on a new set of voices that will include German and other European languages. We are also updating our documentation on how to select a voice prompt for emulation. Here are a few guidelines to help you in the meantime:
- Voice prompt (needs to be chosen carefully):
  - Ideally longer than 3 seconds; up to 30 seconds should be good enough.
  - Must contain only one speaker.
  - No background noise; use a clean recording for best results.
  - Neutral prosody: no excessive pausing or disfluencies such as "umm" or "ahh".
  - Should have expressive pitch: flat voice samples lead to boring generations.
  - Ideally in the same language as the text prompt, but the model should work on cross-lingual prompts too. For example, you could try a voice prompt in language A (French) with translated text in language B (English), and the generation would sound like French-accented English.
- Text prompt:
  - Ideally convert to a verbalizable form for controllability (e.g. instead of “1234”, use “one thousand two hundred thirty four” or “twelve hundred thirty four” to disambiguate).
  - No rich formatting such as markdown or emojis.
  - For abbreviations, the model might work out of the box, but better results are likely from changing “FBI” to “F-B-I” or “F.B.I.”.
  - Prompts should be under 300 words.
  - One text prompt per request. Multiple lines are not expected to work well.
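
The text prompt guidelines above could be applied automatically before sending a request. The following is a minimal sketch, not part of any official tooling: the function names `number_to_words` and `prepare_text_prompt`, the `word_limit` parameter, and all of the regex heuristics are assumptions made for illustration. It only spells out numbers in English, so text in other languages would need its own number verbalizer.

```python
import re
import unicodedata

_ONES = ("zero one two three four five six seven eight nine ten eleven twelve "
         "thirteen fourteen fifteen sixteen seventeen eighteen nineteen").split()
_TENS = "_ _ twenty thirty forty fifty sixty seventy eighty ninety".split()


def number_to_words(n: int) -> str:
    """Spell out an integer in the range 0..999999 in English (hypothetical helper)."""
    if not 0 <= n < 1_000_000:
        raise ValueError("only 0..999999 is supported in this sketch")
    if n < 20:
        return _ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return _TENS[tens] + (" " + _ONES[ones] if ones else "")
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        return _ONES[hundreds] + " hundred" + (" " + number_to_words(rest) if rest else "")
    thousands, rest = divmod(n, 1000)
    return number_to_words(thousands) + " thousand" + (" " + number_to_words(rest) if rest else "")


def prepare_text_prompt(text: str, word_limit: int = 300) -> str:
    """Normalize a text prompt following the guidelines (hypothetical helper)."""
    # Drop common markdown markers (emphasis, code ticks, headings, strikethrough).
    text = re.sub(r"[*_`#~]", "", text)
    # Drop emoji-like symbols (Unicode category "So") plus the joiner and
    # variation selector that often accompany them; accented letters are kept.
    text = "".join(ch for ch in text
                   if unicodedata.category(ch) != "So" and ch not in "\u200d\ufe0f")
    # Heuristic: spell out all-caps abbreviations letter by letter ("FBI" -> "F-B-I").
    # Acronyms that are read as words would need an exception list.
    text = re.sub(r"\b[A-Z]{2,}\b", lambda m: "-".join(m.group()), text)
    # Verbalize plain integers of up to six digits; decimals, dates, and
    # currency amounts would need their own rules.
    text = re.sub(r"\b\d{1,6}\b", lambda m: number_to_words(int(m.group())), text)
    # Collapse line breaks and repeated spaces into a single-line prompt.
    words = text.split()
    if len(words) >= word_limit:
        raise ValueError(f"prompt has {len(words)} words; keep it under {word_limit}")
    return " ".join(words)
```

For example, `prepare_text_prompt("**Call** the FBI at 1234")` returns `"Call the F-B-I at one thousand two hundred thirty four"`. Raising an error on long prompts, rather than truncating silently, leaves the decision of where to split the text into separate requests to the caller.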