
Kokoro-82M TTS Model

High-quality text-to-speech model with 82M parameters and 24 voices (American/British English).

Model Variants

| Variant | File | Size | Quality | Recommended For |
|---------|------|------|---------|-----------------|
| FP32 | kokoro-v1.0.onnx | 310MB | Baseline | Development/reference |
| FP16 | kokoro-v1.0.fp16.onnx | 169MB | Near-identical | Default for deployment |
| INT8 | kokoro-v1.0.int8.onnx | 88MB | Slight degradation | Mobile/edge devices |

FP16 is the default: it offers the best balance of quality and size, with near-identical output at a 45% smaller file than FP32.

Download Model Files

```bash
# FP16 model (recommended default - 169MB)
wget https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files-v1.0/kokoro-v1.0.fp16.onnx

# Voice embeddings (required)
wget https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files-v1.0/voices-v1.0.bin
mv voices-v1.0.bin voices.bin

# Misaki dictionaries (for MisakiDictionary backend - default)
mkdir -p misaki
wget -O misaki/us_gold.json https://raw.githubusercontent.com/hexgrad/misaki/refs/heads/main/misaki/resources/en/us_gold.json
wget -O misaki/us_silver.json https://raw.githubusercontent.com/hexgrad/misaki/refs/heads/main/misaki/resources/en/us_silver.json

# Optional: INT8 for mobile (88MB, 72% smaller)
wget https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files-v1.0/kokoro-v1.0.int8.onnx

# Optional: FP32 for reference (310MB)
wget https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files-v1.0/kokoro-v1.0.onnx
```

Switching Variants

To use a different variant, update model_metadata.json:

```json
{
  "execution_template": {
    "type": "SimpleMode",
    "model_file": "kokoro-v1.0.fp16.onnx"  // or .int8.onnx or .onnx
  }
}
```
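If you prefer to script the switch, the field can be patched programmatically. A minimal Python sketch, assuming the model_metadata.json layout shown above (`set_model_variant` is a hypothetical helper, not part of xybrid):

```python
import json

def set_model_variant(metadata_path: str, model_file: str) -> None:
    """Point execution_template.model_file at a different ONNX variant."""
    with open(metadata_path) as f:
        meta = json.load(f)
    meta["execution_template"]["model_file"] = model_file
    with open(metadata_path, "w") as f:
        json.dump(meta, f, indent=2)

# e.g. set_model_variant("model_metadata.json", "kokoro-v1.0.int8.onnx")
```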

Phonemization Backends

Kokoro requires text-to-phoneme conversion before audio synthesis. Xybrid supports multiple backends:

MisakiDictionary (Default)

Zero external dependencies - uses dictionary lookup for phonemization.

```json
{
  "preprocessing": [{
    "type": "Phonemize",
    "backend": "MisakiDictionary",
    "tokens_file": "tokens.txt"
  }]
}
```

Pros:

  • No system dependencies - works on mobile/embedded
  • Fast dictionary lookup
  • Self-contained deployment

Cons:

  • May not handle unusual words/names perfectly
  • Falls back to basic phonemization for out-of-vocabulary words

Required files (included):

  • misaki/us_gold.json (2.9MB) - High-confidence dictionary
  • misaki/us_silver.json (3.0MB) - Extended vocabulary
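The gold-then-silver lookup order can be sketched as follows. This is an illustration of the fallback strategy only, with toy entries standing in for the real dictionary files (the actual misaki JSON schema and phoneme alphabet may differ):

```python
# Toy stand-ins for misaki/us_gold.json and misaki/us_silver.json;
# entries are illustrative, not taken from the real dictionaries.
GOLD = {"hello": "həlˈoʊ", "world": "wˈɜɹld"}
SILVER = {"xylophone": "zˈaɪləfˌoʊn"}

def phonemize_word(word, gold=GOLD, silver=SILVER):
    """Check the high-confidence dictionary first, then the extended one.
    Return None for out-of-vocabulary words so the caller can apply a
    basic letter-to-sound fallback."""
    w = word.lower()
    if w in gold:
        return gold[w]
    if w in silver:
        return silver[w]
    return None
```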

EspeakNG (Alternative)

Uses the espeak-ng system command for phonemization.

```json
{
  "preprocessing": [{
    "type": "Phonemize",
    "backend": "EspeakNG",
    "language": "en-us",
    "tokens_file": "tokens.txt"
  }]
}
```

Pros:

  • Higher quality phonemization
  • Better handling of unusual words, numbers, abbreviations

Cons:

  • Requires espeak-ng installed on the system
  • Not suitable for mobile/embedded deployment

Installation:

```bash
# macOS
brew install espeak-ng

# Ubuntu/Debian
apt-get install espeak-ng

# Windows
# Download from https://github.com/espeak-ng/espeak-ng/releases
```
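Once installed, espeak-ng can be invoked as a subprocess to produce phonemes. A minimal sketch, not xybrid's actual internal call (`-q` suppresses audio output; `--ipa` prints IPA phonemes to stdout):

```python
import shutil
import subprocess

def build_espeak_cmd(text: str, lang: str = "en-us") -> list[str]:
    # -q: quiet (no audio); --ipa: emit IPA phonemes; -v: voice/language
    return ["espeak-ng", "-q", "--ipa", "-v", lang, text]

def espeak_phonemes(text: str, lang: str = "en-us") -> str:
    if shutil.which("espeak-ng") is None:
        raise RuntimeError("espeak-ng is not installed on this system")
    result = subprocess.run(build_espeak_cmd(text, lang),
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()
```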

Switching Backends

Update model_metadata.json to change the phonemization backend:

```json
{
  "preprocessing": [{
    "type": "Phonemize",
    "backend": "MisakiDictionary",  // or "EspeakNG"
    "tokens_file": "tokens.txt"
  }]
}
```
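This edit can also be scripted. A minimal Python sketch, assuming the preprocessing layout shown above (`set_phonemize_backend` is a hypothetical helper, not part of xybrid):

```python
import json

def set_phonemize_backend(metadata_path: str, backend: str) -> None:
    """Switch the backend of the Phonemize preprocessing step."""
    with open(metadata_path) as f:
        meta = json.load(f)
    for step in meta["preprocessing"]:
        if step["type"] == "Phonemize":
            step["backend"] = backend
            if backend == "EspeakNG":
                # EspeakNG additionally takes a language field
                step.setdefault("language", "en-us")
    with open(metadata_path, "w") as f:
        json.dump(meta, f, indent=2)
```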

Available Voices

| Voice | Description |
|-------|-------------|
| af_bella | American Female - Bella |
| af_nicole | American Female - Nicole |
| af_sarah | American Female - Sarah |
| af_sky | American Female - Sky |
| am_adam | American Male - Adam |
| am_michael | American Male - Michael |
| bf_emma | British Female - Emma |
| bf_isabella | British Female - Isabella |
| bm_george | British Male - George |
| bm_lewis | British Male - Lewis |

Voice naming convention: {region}{gender}_{name}

  • Region: a = American, b = British
  • Gender: f = Female, m = Male
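The convention is simple enough to decode mechanically; a small illustrative parser (not part of xybrid):

```python
REGIONS = {"a": "American", "b": "British"}
GENDERS = {"f": "Female", "m": "Male"}

def parse_voice_id(voice_id: str) -> tuple[str, str, str]:
    """Decode a voice ID like 'af_bella' into (region, gender, name)."""
    prefix, name = voice_id.split("_", 1)
    return REGIONS[prefix[0]], GENDERS[prefix[1]], name.capitalize()
```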

Usage

Run the model through the xybrid execution system:

```rust
use xybrid_core::template_executor::TemplateExecutor;

let executor = TemplateExecutor::from_metadata_file("test_models/kokoro-82m/model_metadata.json")?;
let audio = executor.run_text("Hello, world!")?;
```

License

Apache-2.0

Source

Model files are distributed via the kokoro-onnx releases: https://github.com/thewh1teagle/kokoro-onnx