Kokoro-82M CoreML

3-stage CoreML pipeline for Kokoro-82M text-to-speech, optimized for Apple Neural Engine. Requires iOS 18+ / macOS 15+.

Pipeline

| Stage | Model | Input | Output | Size |
|---|---|---|---|---|
| 1. Duration | duration.mlmodelc | Phoneme tokens + voice + speed | Durations, prosody features, text encoding | 39 MB |
| 2. Prosody | prosody.mlmodelc | Aligned prosody features + style | F0 (pitch) + noise predictions | 17 MB |
| 3. Decoder | decoder_*.mlmodelc | Aligned text + F0 + noise + style | 24 kHz audio waveform | 107 MB |

Between stages 1 and 2, the Swift code builds an alignment matrix from the predicted durations.
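The alignment step can be sketched as follows. This is a minimal illustration, not the repository's actual code: the function name `buildAlignment`, the tokens-by-frames layout, and the use of plain Swift arrays instead of `MLMultiArray` are all assumptions made for the example. Each token's predicted duration is treated as an integer frame count, and the matrix marks which frames belong to which token.

```swift
/// Builds a hard (0/1) alignment matrix from per-token durations.
/// Hypothetical helper: the layout (rows = tokens, columns = frames)
/// is an assumption, not taken from the repository.
func buildAlignment(durations: [Int]) -> [[Float]] {
    // Negative durations are clamped to zero frames.
    let totalFrames = durations.reduce(0) { $0 + max($1, 0) }
    var matrix = [[Float]](
        repeating: [Float](repeating: 0, count: totalFrames),
        count: durations.count
    )
    var frame = 0
    for (token, duration) in durations.enumerated() {
        // Assign the next `duration` consecutive frames to this token.
        for _ in 0..<max(duration, 0) {
            matrix[token][frame] = 1
            frame += 1
        }
    }
    return matrix
}

// Three tokens lasting 2, 1, and 3 frames give a 3 x 6 matrix.
let alignment = buildAlignment(durations: [2, 1, 3])
```

Multiplying per-token features by a matrix of this kind repeats each token's features across its frames, which is one way to obtain the "aligned" inputs that stages 2 and 3 consume.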

Decoder Buckets

| Bucket | Max Frames | Max Audio |
|---|---|---|
| decoder_5s | 200 | 5.0s |
| decoder_10s | 400 | 10.0s |
| decoder_15s | 600 | 15.0s |
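One plausible way to use the table is to pick the smallest bucket that fits the total predicted frame count. The sketch below assumes that strategy; the `DecoderBucket` type and `selectBucket` function are hypothetical names, and how the actual Swift code handles inputs longer than 600 frames is not stated here. The frame rate of 40 frames per second follows from the table (200 frames for 5.0 s), which at 24 kHz corresponds to 600 samples per frame.

```swift
/// Hypothetical description of one decoder bucket from the table above.
struct DecoderBucket {
    let name: String
    let maxFrames: Int
}

// Bucket limits copied from the table, ordered smallest first.
let decoderBuckets = [
    DecoderBucket(name: "decoder_5s", maxFrames: 200),
    DecoderBucket(name: "decoder_10s", maxFrames: 400),
    DecoderBucket(name: "decoder_15s", maxFrames: 600),
]

// Derived from the table: 200 frames / 5.0 s = 40 frames per second.
let framesPerSecond = 40.0

/// Returns the smallest bucket that can hold `frames`, or nil if the
/// input exceeds the largest bucket (assumed behaviour for this sketch).
func selectBucket(forFrames frames: Int) -> DecoderBucket? {
    return decoderBuckets.first { frames <= $0.maxFrames }
}

/// Converts a frame count to seconds of audio.
func audioSeconds(forFrames frames: Int) -> Double {
    return Double(frames) / framesPerSecond
}

let chosen = selectBucket(forFrames: 250)
let seconds = audioSeconds(forFrames: 250)
```

With 250 frames the sketch selects `decoder_10s`, since 250 exceeds the 200-frame limit of `decoder_5s`. Text that would need more than 600 frames returns `nil` here, so a caller would have to split it into shorter chunks first.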

Voices

54 preset voices across 10 languages (counting US and UK English separately): English (US/UK), Spanish, French, Hindi, Italian, Japanese, Korean, Portuguese, Chinese.

Usage

Welcome to Swift!

Subcommands:

swift build      Build Swift packages
swift package    Create and work on packages
swift run        Run a program from a package
swift test       Run package tests
swift repl       Experiment with Swift code interactively

Use swift --version for Swift version information.

Use swift --help for descriptions of available options and flags.

Use swift help <subcommand> for more information about a subcommand.

Conversion

WARNING: Defaulting repo_id to hexgrad/Kokoro-82M. Pass repo_id='hexgrad/Kokoro-82M' to suppress this warning.
Loaded Kokoro-82M (81.8M params)

3-Stage Verification:
Reference audio: 42000 samples
3-stage audio: 117600 samples
Diff: max=0.9962, mean=0.0493
Duration diff: max=12.0
PASS: 3-stage pipeline matches reference

=== Converting Duration Model ===
Phoneme buckets: [16, 32, 64, 128]
Tracing...

License

  • Model weights: Apache-2.0 (hexgrad/Kokoro-82M)
  • CoreML conversion + Swift inference: Apache-2.0
  • Dictionaries and G2P: Apache-2.0
