IndicVoice

A decoder-only neural text-to-speech (TTS) model for Indian languages, built on the Kokoro-82M architecture with a native Indic grapheme-to-phoneme (G2P) frontend.

Author: Kushal Kant Bind, Chandigarh University
Project: GSoC 2026, Sugar Labs
GitHub: Bindkushal/indic-voice


Quick Start

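Install the dependencies from a shell:

pip install git+https://github.com/Bindkushal/indic-g2p.git
pip install git+https://github.com/Bindkushal/indic-voice.git
pip install soundfile
apt-get install espeak-ng

Then synthesize speech from Python:

import soundfile as sf
from indicvoice import IndicPipeline

pipeline = IndicPipeline(lang_code="hi", repo_id="Bindkushal/IndicVoice-82M")

# Each iteration yields one segment of the input text.
for i, (gs, ps, audio) in enumerate(pipeline("नमस्ते दुनिया", voice="af_heart")):
    # One file per segment, so later segments do not overwrite earlier ones.
    sf.write(f"output_{i}.wav", audio, 24000)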
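The Quick Start loop yields audio one segment at a time. When a single output file is wanted, the segments can be joined before writing. The following is a minimal sketch, assuming each segment is a 1-D array-like of float samples at 24000 Hz. `join_segments` and `pause_s` are hypothetical names for illustration, not part of indicvoice, and the demo uses silent stand-in data rather than real model output.

```python
import numpy as np

SAMPLE_RATE = 24000  # Hz, the sample rate stated for this model


def join_segments(segments, sample_rate=SAMPLE_RATE, pause_s=0.0):
    """Concatenate 1-D audio segments, optionally with silence between them."""
    arrays = [np.asarray(seg, dtype=np.float32).reshape(-1) for seg in segments]
    if not arrays:
        return np.zeros(0, dtype=np.float32)
    # Silence inserted between consecutive segments (empty when pause_s is 0).
    gap = np.zeros(int(round(pause_s * sample_rate)), dtype=np.float32)
    pieces = []
    for i, arr in enumerate(arrays):
        if i > 0 and gap.size:
            pieces.append(gap)
        pieces.append(arr)
    return np.concatenate(pieces)


# Stand-in data: two 0.5 s segments of silence instead of real model output.
demo = [np.zeros(12000, dtype=np.float32), np.zeros(12000, dtype=np.float32)]
joined = join_segments(demo, pause_s=0.25)
print(len(joined) / SAMPLE_RATE)  # duration in seconds
```

The joined array can then be written once with `sf.write("output.wav", joined, 24000)`, as in the Quick Start.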

Files in This Repo

| File | Description |
| --- | --- |
| config.json | Model architecture config |
| indicvoice-v1_0.pth | Model weights (82M params) |
| voices/af_heart.pt | Default voice style tensor |
| voices/af_bella.pt | Voice style tensor |
| voices/am_adam.pt | Voice style tensor |
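The files listed above can also be fetched individually rather than through the pipeline. The sketch below assumes they live in the Hub repo named in the Quick Start (`Bindkushal/IndicVoice-82M`) and uses `hf_hub_download` from the `huggingface_hub` package. The helpers `voice_path`, `repo_files`, and `fetch` are hypothetical names introduced for illustration.

```python
from typing import List

REPO_ID = "Bindkushal/IndicVoice-82M"  # repo_id taken from the Quick Start


def voice_path(voice: str) -> str:
    """Map a voice name such as "af_heart" to its path inside the repo."""
    return f"voices/{voice}.pt"


def repo_files(voices: List[str]) -> List[str]:
    """List the config, the weights, and the requested voice tensors."""
    return ["config.json", "indicvoice-v1_0.pth"] + [voice_path(v) for v in voices]


def fetch(filename: str, repo_id: str = REPO_ID) -> str:
    """Download one file from the Hub and return its local cache path."""
    # Imported lazily so the helpers above work without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename=filename)


for name in repo_files(["af_heart"]):
    print(name)
```

Calling `fetch(name)` for each listed name would download the file into the local Hugging Face cache. It requires network access and `pip install huggingface_hub`.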

Supported Languages

| Language | Code | Script | Status |
| --- | --- | --- | --- |
| Hindi | hi | Devanagari | Ready |
| Punjabi | pa | Gurmukhi | Ready |
| Bengali | bn | Bengali | Beta |
| English | en | Roman | Ready |
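Because each supported language uses a distinct script, a `lang_code` can be guessed from the input text by checking Unicode block ranges. This is a rough sketch, not a feature of indicvoice: `guess_lang_code` is a hypothetical helper, and it maps a script to the one language listed for it in the table, so other languages written in the same script (for example Marathi in Devanagari) would be mislabelled.

```python
# Unicode block ranges for the scripts in the table above.
SCRIPT_RANGES = {
    "hi": (0x0900, 0x097F),  # Devanagari
    "bn": (0x0980, 0x09FF),  # Bengali
    "pa": (0x0A00, 0x0A7F),  # Gurmukhi
}


def guess_lang_code(text: str, default: str = "en") -> str:
    """Return the code whose script covers the most characters in text."""
    counts = {code: 0 for code in SCRIPT_RANGES}
    for ch in text:
        cp = ord(ch)
        for code, (lo, hi) in SCRIPT_RANGES.items():
            if lo <= cp <= hi:
                counts[code] += 1
                break
    best = max(counts, key=counts.get)
    # Fall back to the default when no Indic characters were found.
    return best if counts[best] > 0 else default


print(guess_lang_code("नमस्ते दुनिया"))  # hi
print(guess_lang_code("hello world"))   # en
```

The result could be passed as `lang_code` when constructing the pipeline. Mixed-script input resolves to whichever script contributes the most characters.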

Architecture

  • Base: Kokoro-82M (StyleTTS2 + ISTFTNet), Apache 2.0
  • G2P: indic-g2p — native Indic phonemizer
  • Fallback: espeak-ng for out-of-vocabulary (OOV) words
  • Parameters: 82M
  • Sample rate: 24000 Hz
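The G2P-with-fallback arrangement in the list above can be pictured as a per-word lookup that hands unknown words to espeak-ng. The sketch below is an assumption about the general pattern, not the actual indic-g2p implementation: `TOY_LEXICON`, `espeak_fallback`, and `phonemize` are hypothetical names, the lexicon entry is made-up toy data, and the real phonemizer may be rule-based rather than lexicon-based.

```python
import subprocess
from typing import Callable, Dict, List

# Toy lexicon for illustration only; not taken from indic-g2p.
TOY_LEXICON: Dict[str, str] = {"नमस्ते": "nəməsteː"}


def espeak_fallback(word: str, voice: str = "hi") -> str:
    """Ask the espeak-ng CLI for an IPA transcription of one word."""
    result = subprocess.run(
        ["espeak-ng", "-q", "--ipa", "-v", voice, word],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def phonemize(
    text: str,
    lexicon: Dict[str, str],
    fallback: Callable[[str], str] = espeak_fallback,
) -> List[str]:
    """Look each word up in the lexicon; send OOV words to the fallback."""
    return [lexicon[w] if w in lexicon else fallback(w) for w in text.split()]


# Demo with a stub fallback so the example runs without espeak-ng installed.
phonemes = phonemize("नमस्ते दुनिया", TOY_LEXICON, fallback=lambda w: f"<oov:{w}>")
print(phonemes)
```

Passing the fallback as a parameter keeps the lookup logic testable on machines without espeak-ng. Omitting the argument would route OOV words to the espeak-ng command line instead.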

Citation

@misc{bind2026indicvoice,
  title={IndicVoice: Decoder-Only Neural TTS with Native G2P for Indian Languages},
  author={Kushal Kant Bind},
  year={2026},
  institution={Chandigarh University},
  note={GSoC 2026, Sugar Labs}
}

Acknowledgements

  • hexgrad/kokoro — base TTS architecture (Apache 2.0)
  • AI4Bharat — IndicVoices-R dataset
  • IIT Madras — IndicTTS dataset

License

Apache 2.0
