---
title: XTTS Voice Studio
emoji: 🎙️
colorFrom: yellow
colorTo: yellow
sdk: docker
app_port: 7860
pinned: false
---
XTTS v2 Voice Studio
A multilingual text-to-speech studio powered by Coqui XTTS v2, served via FastAPI (no Gradio).
Features
- 16 supported languages (Arabic, English, French, …)
- Voice cloning from uploaded audio samples
- Persistent voice library
- Generation history with playback & download
- Fully custom React UI
Usage
- Upload a reference audio clip (WAV / MP3 / FLAC, ≥ 6 s recommended).
- Type your text and pick a language.
- Adjust advanced parameters if needed.
- Click ⚡ توليد الصوت ("Generate Audio") and wait; CPU inference takes ~30–90 s per request.
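The recommended clip length from the first step can be checked before a sample is uploaded. The following is a minimal sketch, not the Space's own validation code: it handles WAV files only (via the standard-library `wave` module), and the names `MIN_REFERENCE_SECONDS`, `wav_duration_seconds`, and `is_long_enough` are hypothetical helpers introduced for illustration.

```python
import wave

# Recommended minimum reference length taken from the usage steps (>= 6 s).
MIN_REFERENCE_SECONDS = 6.0


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        # Duration = number of audio frames / frames per second.
        return wav_file.getnframes() / wav_file.getframerate()


def is_long_enough(path: str, min_seconds: float = MIN_REFERENCE_SECONDS) -> bool:
    """Return True when the clip meets the recommended minimum length."""
    return wav_duration_seconds(path) >= min_seconds
```

MP3 and FLAC clips would need a third-party audio library to measure, which this sketch deliberately leaves out.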
Notes
- The Space runs on CPU; generation is slower than on a GPU but fully functional.
- The XTTS v2 model (~1.8 GB) is downloaded on first startup and cached.
- COQUI_TOS_AGREED=1 is set automatically; by using this Space you agree to the Coqui TTS terms.
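For readers who want to reproduce the behaviour described in these notes outside the Space, the sketch below shows one way to set the terms flag and load the model once per process using the Coqui TTS Python API. It is an illustration under assumptions, not the Space's actual server code: `get_tts` and `synthesize` are hypothetical helper names, and setting the flag in Python rather than in the Dockerfile is a choice made here for brevity.

```python
import os
from functools import lru_cache

# Assumption: the flag is set in-process before the model is first loaded,
# so Coqui TTS does not stop to ask for interactive agreement to its terms.
os.environ["COQUI_TOS_AGREED"] = "1"

# Model identifier used by Coqui TTS for XTTS v2.
MODEL_NAME = "tts_models/multilingual/multi-dataset/xtts_v2"


@lru_cache(maxsize=1)
def get_tts():
    """Load XTTS v2 once; the first call may download and cache the model."""
    from TTS.api import TTS  # imported lazily so importing this module stays cheap

    return TTS(MODEL_NAME)


def synthesize(text: str, speaker_wav: str, language: str, out_path: str) -> str:
    """Clone the voice in speaker_wav and write the generated speech to out_path."""
    get_tts().tts_to_file(
        text=text,
        speaker_wav=speaker_wav,  # path to the reference clip
        language=language,        # language code such as "en", "fr", or "ar"
        file_path=out_path,
    )
    return out_path
```

A call such as `synthesize("Hello there.", "reference.wav", "en", "out.wav")` would then trigger the one-time model download on first use and reuse the cached model afterwards.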