Kokoro-82M TTS Model
High-quality text-to-speech model with 82M parameters and 24 voices (American/British English).
Model Variants
| Variant | File | Size | Quality | Recommended For |
|---|---|---|---|---|
| FP32 | kokoro-v1.0.onnx | 310MB | Baseline | Development/reference |
| FP16 | kokoro-v1.0.fp16.onnx | 169MB | Near-identical | Default for deployment |
| INT8 | kokoro-v1.0.int8.onnx | 88MB | Slight degradation | Mobile/edge devices |
FP16 is the default: it offers the best balance of quality and size (169MB, about 45% smaller than FP32).
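The size reductions quoted in this README follow directly from the table. As a quick check, here is a minimal Rust sketch; the function name `percent_smaller` is made up for illustration and is not part of xybrid.

```rust
/// Percentage by which `variant_mb` is smaller than `baseline_mb`,
/// rounded to the nearest whole percent.
fn percent_smaller(baseline_mb: f64, variant_mb: f64) -> u32 {
    ((1.0 - variant_mb / baseline_mb) * 100.0).round() as u32
}
```

With the sizes from the table, `percent_smaller(310.0, 169.0)` gives 45 for FP16 and `percent_smaller(310.0, 88.0)` gives 72 for INT8.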
Download Model Files
# FP16 model (recommended default - 169MB)
wget https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files-v1.0/kokoro-v1.0.fp16.onnx
# Voice embeddings (required)
wget https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files-v1.0/voices-v1.0.bin
mv voices-v1.0.bin voices.bin
# Misaki dictionaries (for MisakiDictionary backend - default)
mkdir -p misaki
wget -O misaki/us_gold.json https://raw.githubusercontent.com/hexgrad/misaki/refs/heads/main/misaki/resources/en/us_gold.json
wget -O misaki/us_silver.json https://raw.githubusercontent.com/hexgrad/misaki/refs/heads/main/misaki/resources/en/us_silver.json
# Optional: INT8 for mobile (88MB, 72% smaller)
wget https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files-v1.0/kokoro-v1.0.int8.onnx
# Optional: FP32 for reference (310MB)
wget https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files-v1.0/kokoro-v1.0.onnx
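After downloading, it can help to confirm that nothing is missing before running the model. The sketch below is one way to do that in Rust. It assumes all files sit in a single model directory and that the FP16 variant is in use; `REQUIRED_FILES` and `missing_files` are illustrative names, not part of xybrid.

```rust
use std::path::Path;

/// Files fetched by the download commands above, relative to the model
/// directory (an assumed layout). `voices.bin` is the renamed `voices-v1.0.bin`.
const REQUIRED_FILES: [&str; 4] = [
    "kokoro-v1.0.fp16.onnx",
    "voices.bin",
    "misaki/us_gold.json",
    "misaki/us_silver.json",
];

/// Returns the required files that are not present under `model_dir`,
/// in the same order as `REQUIRED_FILES`.
fn missing_files(model_dir: &Path) -> Vec<&'static str> {
    REQUIRED_FILES
        .iter()
        .copied()
        .filter(|name| !model_dir.join(name).is_file())
        .collect()
}
```

An empty result means every file in the list was found. If you deploy the INT8 or FP32 variant instead, swap the first entry for the matching file name.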
Switching Variants
To use a different variant, set model_file in model_metadata.json to kokoro-v1.0.fp16.onnx, kokoro-v1.0.int8.onnx, or kokoro-v1.0.onnx:
{
  "execution_template": {
    "type": "SimpleMode",
    "model_file": "kokoro-v1.0.fp16.onnx"
  }
}
Phonemization Backends
Kokoro requires text-to-phoneme conversion before audio synthesis. Xybrid supports multiple backends:
MisakiDictionary (Default)
Zero external dependencies - uses dictionary lookup for phonemization.
{
  "preprocessing": [{
    "type": "Phonemize",
    "backend": "MisakiDictionary",
    "tokens_file": "tokens.txt"
  }]
}
Pros:
- No system dependencies - works on mobile/embedded
- Fast dictionary lookup
- Self-contained deployment
Cons:
- May not handle unusual words/names perfectly
- Falls back to basic phonemization for out-of-vocabulary words
Required files (included):
- misaki/us_gold.json (2.9MB) - High-confidence dictionary
- misaki/us_silver.json (3.0MB) - Extended vocabulary
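To illustrate how a two-tier dictionary lookup with a fallback can work, here is a minimal Rust sketch. It is not xybrid's implementation. It assumes the gold dictionary is consulted before the silver one, that words are lowercased before lookup, and that each dictionary is a flat word-to-phoneme map (the real Misaki JSON files may carry richer entries). `lookup_phonemes` is a hypothetical name.

```rust
use std::collections::HashMap;

/// Looks up `word` in the gold dictionary first, then in the silver one.
/// Returns `None` for out-of-vocabulary words so the caller can apply
/// whatever fallback phonemization it has.
fn lookup_phonemes<'a>(
    word: &str,
    gold: &'a HashMap<String, String>,
    silver: &'a HashMap<String, String>,
) -> Option<&'a str> {
    let key = word.to_lowercase();
    gold.get(&key)
        .or_else(|| silver.get(&key))
        .map(String::as_str)
}
```

Returning `Option` keeps the out-of-vocabulary case explicit: the caller decides what to do when neither dictionary has the word.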
EspeakNG (Alternative)
Uses the espeak-ng system command for phonemization.
{
  "preprocessing": [{
    "type": "Phonemize",
    "backend": "EspeakNG",
    "language": "en-us",
    "tokens_file": "tokens.txt"
  }]
}
Pros:
- Higher quality phonemization
- Better handling of unusual words, numbers, abbreviations
Cons:
- Requires espeak-ng installed on the system
- Not suitable for mobile/embedded deployment
Installation:
# macOS
brew install espeak-ng
# Ubuntu/Debian
apt-get install espeak-ng
# Windows
# Download from https://github.com/espeak-ng/espeak-ng/releases
Switching Backends
Update model_metadata.json to change the phonemization backend. Set backend to "MisakiDictionary" or "EspeakNG" (the EspeakNG configuration above also sets language):
{
  "preprocessing": [{
    "type": "Phonemize",
    "backend": "MisakiDictionary",
    "tokens_file": "tokens.txt"
  }]
}
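If the same deployment must run on machines with and without espeak-ng, the backend name can be chosen at setup time. The Rust sketch below probes for the `espeak-ng` executable by running `espeak-ng --version` and returns the string to write into the backend field. Both function names are hypothetical, and xybrid may offer its own detection mechanism.

```rust
use std::process::{Command, Stdio};

/// Returns true if an `espeak-ng` executable can be started from PATH
/// and exits successfully when asked for its version.
fn espeak_available() -> bool {
    Command::new("espeak-ng")
        .arg("--version")
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status()
        .map(|status| status.success())
        .unwrap_or(false)
}

/// Picks the value for the "backend" field in model_metadata.json.
fn choose_backend(espeak_found: bool) -> &'static str {
    if espeak_found {
        "EspeakNG"
    } else {
        "MisakiDictionary"
    }
}
```

Calling `choose_backend(espeak_available())` defaults to MisakiDictionary whenever the probe fails, which matches the zero-dependency default described above.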
Available Voices
| Voice | Description |
|---|---|
| af_bella | American Female - Bella |
| af_nicole | American Female - Nicole |
| af_sarah | American Female - Sarah |
| af_sky | American Female - Sky |
| am_adam | American Male - Adam |
| am_michael | American Male - Michael |
| bf_emma | British Female - Emma |
| bf_isabella | British Female - Isabella |
| bm_george | British Male - George |
| bm_lewis | British Male - Lewis |
Voice naming convention: {region}{gender}_{name}
- Region: a = American, b = British
- Gender: f = Female, m = Male
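The naming convention is simple enough to parse mechanically. The following Rust sketch splits a voice ID into its parts. The `Region` and `Gender` enums and the `parse_voice` function are illustrative names, not part of xybrid.

```rust
#[derive(Debug, PartialEq)]
enum Region {
    American,
    British,
}

#[derive(Debug, PartialEq)]
enum Gender {
    Female,
    Male,
}

/// Splits a voice ID such as "af_bella" into region, gender, and name.
/// Returns `None` if the ID does not follow `{region}{gender}_{name}`.
fn parse_voice(id: &str) -> Option<(Region, Gender, &str)> {
    let (prefix, name) = id.split_once('_')?;
    let mut chars = prefix.chars();
    let region = match chars.next()? {
        'a' => Region::American,
        'b' => Region::British,
        _ => return None,
    };
    let gender = match chars.next()? {
        'f' => Gender::Female,
        'm' => Gender::Male,
        _ => return None,
    };
    // The prefix must be exactly two characters and the name non-empty.
    if chars.next().is_some() || name.is_empty() {
        return None;
    }
    Some((region, gender, name))
}
```

For example, `parse_voice("bm_george")` yields `Some((Region::British, Gender::Male, "george"))`, while an ID with an unknown prefix or no underscore yields `None`.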
Usage
Run the model through the xybrid execution system:
use xybrid_core::template_executor::TemplateExecutor;
let executor = TemplateExecutor::from_metadata_file("test_models/kokoro-82m/model_metadata.json")?;
let audio = executor.run_text("Hello, world!")?;
License
Apache-2.0
Source
- Model: Kokoro-82M-v1.0-ONNX
- ONNX conversion: thewh1teagle/kokoro-onnx
- Misaki G2P: hexgrad/misaki
- Rust reference: lucasjinreal/Kokoros