IndicVoice
A decoder-only neural TTS model for Indian languages, built on the Kokoro-82M architecture with a native Indic G2P frontend.
Author: Kushal Kant Bind, Chandigarh University
Project: GSoC 2026, Sugar Labs
GitHub: Bindkushal/indic-voice
Quick Start
pip install git+https://github.com/Bindkushal/indic-g2p.git
pip install git+https://github.com/Bindkushal/indic-voice.git
apt-get install espeak-ng
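import soundfile as sf
from indicvoice import IndicPipeline

pipeline = IndicPipeline(lang_code="hi", repo_id="Bindkushal/IndicVoice-82M")
# The pipeline yields one (gs, ps, audio) tuple per generated segment
for gs, ps, audio in pipeline("नमस्ते दुनिया", voice="af_heart"):
    # Save the segment at the model's 24 kHz sample rate
    sf.write("output.wav", audio, 24000)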
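The loop above writes to the same `output.wav` on every iteration, so longer inputs that produce several segments would keep only the last one. The following is a minimal sketch of one way to join all segments into a single mono file using only the standard library. It assumes each yielded `audio` is a one-dimensional sequence of float samples in [-1.0, 1.0]; that is an assumption about the pipeline's output, not something this README confirms. The helper names `floats_to_pcm16` and `write_wav_segments` are hypothetical and not part of `indicvoice`.

```python
import struct
import wave

SAMPLE_RATE = 24000  # sample rate stated in this README


def floats_to_pcm16(samples):
    """Convert float samples (assumed in [-1.0, 1.0]) to 16-bit little-endian PCM bytes."""
    clipped = [max(-1.0, min(1.0, float(s))) for s in samples]
    ints = [int(round(s * 32767)) for s in clipped]
    return struct.pack("<%dh" % len(ints), *ints)


def write_wav_segments(path, segments, sample_rate=SAMPLE_RATE):
    """Append every (gs, ps, audio) segment to one mono WAV file.

    Hypothetical helper: returns the total number of frames written.
    """
    frames = 0
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 16-bit samples
        wav.setframerate(sample_rate)
        for _gs, _ps, audio in segments:
            data = floats_to_pcm16(audio)
            wav.writeframes(data)
            frames += len(data) // 2
    return frames
```

Under those assumptions, `write_wav_segments("output.wav", pipeline(text, voice="af_heart"))` would replace the loop; if the pipeline returns tensors or NumPy arrays, the per-sample `float()` conversion should still apply, though calling `.tolist()` first is likely faster.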
Files in This Repo
| File | Description |
|---|---|
| config.json | Model architecture config |
| indicvoice-v1_0.pth | Model weights (82M params) |
| voices/af_heart.pt | Default voice style tensor |
| voices/af_bella.pt | Voice style tensor |
| voices/am_adam.pt | Voice style tensor |
Supported Languages
| Language | Code | Script | Status |
|---|---|---|---|
| Hindi | hi | Devanagari | Ready |
| Punjabi | pa | Gurmukhi | Ready |
| Bengali | bn | Bengali | Beta |
| English | en | Roman | Ready |
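Because each supported Indic language in the table uses a distinct script, the `lang_code` can often be guessed from the text itself. The sketch below counts characters by Unicode block (Devanagari U+0900 to U+097F, Bengali U+0980 to U+09FF, Gurmukhi U+0A00 to U+0A7F) and falls back to `en`. The function `guess_lang_code` is a hypothetical helper, not part of `indicvoice`, and the script-to-language mapping only holds for the four languages listed here (Devanagari, for example, is also used by languages this model does not cover).

```python
# Unicode block ranges for the scripts in the table above.
# Mapping a script to a single language code is an assumption
# that only holds for the languages this README lists.
SCRIPT_RANGES = {
    "hi": (0x0900, 0x097F),  # Devanagari
    "bn": (0x0980, 0x09FF),  # Bengali
    "pa": (0x0A00, 0x0A7F),  # Gurmukhi
}


def guess_lang_code(text, default="en"):
    """Return the code whose script covers the most characters in `text`."""
    counts = {code: 0 for code in SCRIPT_RANGES}
    for ch in text:
        cp = ord(ch)
        for code, (lo, hi) in SCRIPT_RANGES.items():
            if lo <= cp <= hi:
                counts[code] += 1
                break
    best = max(counts, key=counts.get)
    # No Indic characters found: assume Roman-script English
    return best if counts[best] > 0 else default
```

A caller could then write `IndicPipeline(lang_code=guess_lang_code(text), repo_id=...)`, keeping an explicit override for mixed-script input.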
Architecture
- Base: Kokoro-82M (StyleTTS2 + ISTFTNet), Apache 2.0
- G2P: indic-g2p — native Indic phonemizer
- Fallback: espeak-ng for OOV words
- Parameters: 82M
- Sample rate: 24000 Hz
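The G2P and fallback bullets above describe a two-stage lookup: native phonemization first, espeak-ng for out-of-vocabulary (OOV) words. The sketch below illustrates that control flow only. It assumes a word-level lexicon lookup, which may not match how `indic-g2p` works internally; the `phonemize` and `espeak_fallback` names and the one-entry lexicon in the usage note are hypothetical. The espeak-ng flags used (`-q` for no audio, `--ipa` for IPA output, `-v` for the voice) are standard CLI options.

```python
import subprocess


def espeak_fallback(word, voice="hi"):
    """Phonemize one word by calling the espeak-ng CLI (must be installed)."""
    result = subprocess.run(
        ["espeak-ng", "-q", "--ipa", "-v", voice, word],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def phonemize(text, lexicon, fallback=espeak_fallback):
    """Look each word up in `lexicon`; send OOV words to `fallback`.

    Hypothetical sketch: whitespace tokenization and a dict lexicon
    are simplifying assumptions.
    """
    phonemes = []
    for word in text.split():
        if word in lexicon:
            phonemes.append(lexicon[word])
        else:
            phonemes.append(fallback(word))
    return " ".join(phonemes)
```

For example, `phonemize("नमस्ते दुनिया", {"नमस्ते": "nəməsteː"})` would take the first word from the illustrative lexicon and pass the second to espeak-ng. Making the fallback injectable keeps the lookup logic testable on machines where espeak-ng is not installed.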
Citation
@misc{bind2026indicvoice,
title={IndicVoice: Decoder-Only Neural TTS with Native G2P for Indian Languages},
author={Kushal Kant Bind},
year={2026},
institution={Chandigarh University},
note={GSoC 2026, Sugar Labs}
}
Acknowledgements
- hexgrad/kokoro — base TTS architecture (Apache 2.0)
- AI4Bharat — IndicVoices-R dataset
- IIT Madras — IndicTTS dataset
License
Apache 2.0