How to use Prince-1/svara-tts-v1 with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-to-speech", model="Prince-1/svara-tts-v1")

# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("Prince-1/svara-tts-v1", dtype="auto")
```
---
base_model: kenpath/svara-tts-v1
license: apache-2.0
language:
- hi # Hindi
- bn # Bengali
- mr # Marathi
- te # Telugu
- kn # Kannada
- bho # Bhojpuri
- mag # Magahi
- hne # Chhattisgarhi
- mai # Maithili
- as # Assamese
- brx # Bodo
- doi # Dogri
- gu # Gujarati
- ml # Malayalam
- pa # Punjabi
- ta # Tamil
- ne # Nepali
- sa # Sanskrit
- en # English (Indian)
tags:
- text-to-speech
- speech-synthesis
- transformers
- multilingual
- indic
- orpheus
- quantized
- low-latency
- zero-shot
- emotions
- discrete-audio-tokens
- onnx
- onnxruntime-genai
task_categories:
- text-to-speech
pipeline_tag: text-to-speech
pretty_name: Svara-TTS v1
datasets:
- SYSPIN
- RASA
- IndicTTS
- SPICOR
---
# svara-TTS v1: Open Multilingual TTS for India’s Voices

[Model](https://huggingface.co/kenpath/svara-tts-v1) · [Demo Space](https://huggingface.co/spaces/kenpath/svara-tts) · [Colab](https://colab.research.google.com/drive/15YxFo1DzdQNbFUIZ1HJA4AN4oHqKxGtg) · [Inference repo](https://github.com/Kenpath/svara-tts-inference)

**svara-TTS** is a developer-first multilingual TTS model for **19 languages** (18 Indic languages plus Indian English).
Built on an Orpheus-style discrete audio token approach, it targets **clarity, expressiveness, and low latency** on commodity GPUs and CPUs.
It supports lightweight **emotion/style control** (e.g., `<happy>`, `<sad>`, `<anger>`, `<fear>`) and simple **speaker identities** (`Language (Gender)`), with **zero-shot** adaptation paths.

---
## At a Glance

- **Languages (19):** Hindi, Bengali, Marathi, Telugu, Kannada, Bhojpuri, Magahi, Chhattisgarhi, Maithili, Assamese, Bodo, Dogri, Gujarati, Malayalam, Punjabi, Tamil, Nepali, Sanskrit, Indian English.
- **Expressivity:** End-of-utterance style tags; natural prosody; code-switch aware.
- **Latency & Deployment:** Works well with **GGUF** exports; suitable for edge and CPU scenarios.
- **Adaptability:** **LoRA-friendly** for quick speaker or domain specialization.

Try it live on the [**Demo Space**](https://huggingface.co/spaces/kenpath/svara-tts) or on [**Colab**](https://colab.research.google.com/drive/15YxFo1DzdQNbFUIZ1HJA4AN4oHqKxGtg).

Deployment scripts and the **inference repo** will be available soon. Watch our [GitHub](https://github.com/Kenpath/svara-tts-inference) for updates.

---
## Prompting (Orpheus-style)

- Place style/emotion tags **at the end** of the sentence:
  `आज... सच में अच्छी खबर है — शाम को मिलते हैं! <happy>` (roughly: "Today... there really is good news, see you in the evening!")
- Use punctuation to hint at prosody (ellipses, commas, exclamation marks).
- For technical or dense text, end with `<clear>` to prioritize intelligibility.

> Speaker IDs follow a simple convention: **`Language (Gender)`** (e.g., `Marathi (Male)`).
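
The prompting conventions above can be captured in a small helper. This is a minimal sketch, not part of any svara-TTS API: `SvaraRequest`, `build_request`, and `KNOWN_STYLE_TAGS` are hypothetical names, the tag list contains only the tags mentioned on this card (the model may accept others), and the `Female` spelling is assumed by analogy with `Marathi (Male)`. How the speaker ID and text are finally passed to the model depends on the inference code, so the sketch keeps them as separate fields.

```python
from dataclasses import dataclass
from typing import Optional

# Only the tags named on this card; assumed to be incomplete.
KNOWN_STYLE_TAGS = ("happy", "sad", "anger", "fear", "clear")
# "Male" appears on the card; "Female" is assumed by analogy.
GENDERS = ("Male", "Female")


@dataclass(frozen=True)
class SvaraRequest:
    """Hypothetical container for one synthesis request."""
    speaker_id: str
    text: str


def build_request(text: str, language: str, gender: str,
                  style: Optional[str] = None) -> SvaraRequest:
    """Build a `Language (Gender)` speaker ID and append an optional style tag."""
    text = text.strip()
    if not text:
        raise ValueError("text must not be empty")
    gender = gender.strip().capitalize()
    if gender not in GENDERS:
        raise ValueError(f"gender must be one of {GENDERS}")
    speaker_id = f"{language.strip()} ({gender})"
    if style is not None:
        if style not in KNOWN_STYLE_TAGS:
            raise ValueError(f"unknown style tag: {style!r}")
        # Style tags go at the end of the utterance.
        text = f"{text} <{style}>"
    return SvaraRequest(speaker_id=speaker_id, text=text)


req = build_request("शाम को मिलते हैं!", "Hindi", "male", style="happy")
print(req.speaker_id)  # Hindi (Male)
print(req.text)        # शाम को मिलते हैं! <happy>
```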

---

## Training Data Summary

The model was trained on **2,000+ hours** of open, high-quality speech from **SYSPIN**, **RASA**, **IndicTTS**, and **SPICOR**, covering **~50 speakers** (balanced male/female) across **19 languages**.
Data was curated to encourage natural prosody, broad coverage, and stable multilingual transfer. See **Acknowledgments** for provenance.

---
## Intended Uses

- Multilingual assistants, IVR, learning apps, reading aids, accessibility tools
- Content localization (education, public information, civic services)
- Research on Indic prosody, emotion control, cross-lingual transfer

## Out-of-Scope / Not Intended

- Impersonation of private individuals or public figures without consent
- Deceptive content (fraud, harassment, misinformation)
- Safety-critical deployments without human oversight

---
## Limitations

- **Proper nouns & rare entities:** may require spelling hints or `<clear>`.
- **Very long sentences:** chunk the text or add punctuation for natural prosody.
- **Emotion strength:** varies by language, depending on how much training data each language has.
- **Code-mixing:** common patterns work, but handling is learned rather than rule-based, so results are not guaranteed for every mix.

Many of these improve with targeted LoRA fine-tuning and better preprocessing.

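
One way to handle the long-sentence limitation is to split input at sentence boundaries before synthesis. The following is a minimal preprocessing sketch, not the authors' pipeline: `chunk_text` and the `max_chars` default of 200 are illustrative assumptions, and the splitter only knows Latin sentence enders plus the Devanagari danda, so other scripts may need extra punctuation added.

```python
import re
from typing import List

# Split on whitespace that follows . ! ? or a danda (। ॥),
# but not after an ellipsis ("..."), which the card uses as a prosody hint.
_SENTENCE_END = re.compile(r"(?<=[.!?।॥])(?<!\.\.)\s+")


def chunk_text(text: str, max_chars: int = 200) -> List[str]:
    """Greedily pack whole sentences into chunks of at most max_chars.

    A single sentence longer than max_chars is kept whole rather than
    being cut mid-word.
    """
    sentences = [s.strip() for s in _SENTENCE_END.split(text) if s.strip()]
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


print(chunk_text("One. Two! Three?", max_chars=9))
# ['One. Two!', 'Three?']
```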

---

## Responsible Use

By using this model, you agree to follow applicable laws and ethical guidelines.
Avoid impersonation, harassment, targeted deception, or other harmful uses.
Where appropriate, disclose synthetic speech to end users.

---
## Sources & Links

- **Model:** https://huggingface.co/kenpath/svara-tts-v1
- **Demo Space:** https://huggingface.co/spaces/kenpath/svara-tts
- **Inference repo:** https://github.com/Kenpath/svara-tts-inference
- **Colab:** https://colab.research.google.com/drive/15YxFo1DzdQNbFUIZ1HJA4AN4oHqKxGtg

---
## 🙏 Acknowledgments

This work was developed by **[Kenpath Technologies](https://kenpath.ai/)** for the open-source community. We thank:

- **Canopy Labs (Orpheus):** foundational ideas and open release.
  Release: https://canopylabs.ai/releases/orpheus_can_speak_any_language
- **SPIRE Lab, IISc Bangalore:** **SYSPIN** (multilingual studio) and **SPICOR** (Indian English)
- **AI4Bharat:** **RASA** expressive speech
- **IIT Madras:** **IndicTTS**
- **Unsloth:** helpful notes and tooling
- **RunPod:** startup GPU credits that supported our compute and accelerated experiments

---
## License

**Apache-2.0**

---

## Versioning & Changelog

- **v1.0.0:** Initial public release (19 languages)