VocoLoco — OmniVoice ONNX Models

ONNX exports of k2-fsa/OmniVoice for browser-based text-to-speech inference via ONNX Runtime Web.

Models

| File | Size | Description |
| --- | --- | --- |
| `omnivoice-main-split.onnx` + `_data_00`-`_04` | 2.3 GB | Main TTS model (FP32, sharded) |
| `omnivoice-main-int8.onnx` | 586 MB | Main TTS model (INT8 quantized, for mobile/low-memory) |
| `omnivoice-decoder.onnx` | 83 MB | Audio token decoder (tokens to waveform) |
| `omnivoice-encoder-fixed.onnx` | 624 MB | Audio encoder for voice cloning |
| `tokenizer.json` | 11 MB | Qwen2 BPE text tokenizer |
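
As a rough guide to choosing between the FP32 and INT8 main models, the sketch below estimates the total download for a given setup from the sizes listed above. It is illustrative only: `pickVariant`, `downloadSizeMb`, and the 8 GB threshold are hypothetical and not part of VocoLoco, 2.3 GB is rounded to 2300 MB, and it assumes the decoder and tokenizer are always needed while the encoder is needed only for voice cloning.

```typescript
// Approximate file sizes in MB, taken from the model table (2.3 GB rounded to 2300 MB).
const SIZES_MB = {
  mainFp32: 2300,
  mainInt8: 586,
  decoder: 83,
  encoder: 624,
  tokenizer: 11,
} as const;

type Variant = "fp32" | "int8";

// Hypothetical helper: pick a main-model variant from a memory estimate in GB.
// The 8 GB default threshold is an assumption, not a documented requirement.
// When no estimate is available, fall back to the smaller INT8 model.
function pickVariant(deviceMemoryGb: number | undefined, thresholdGb = 8): Variant {
  if (deviceMemoryGb === undefined) return "int8";
  return deviceMemoryGb >= thresholdGb ? "fp32" : "int8";
}

// Hypothetical helper: total download in MB for the chosen variant.
// Assumes the encoder is fetched only when voice cloning is enabled.
function downloadSizeMb(variant: Variant, voiceCloning: boolean): number {
  const main = variant === "fp32" ? SIZES_MB.mainFp32 : SIZES_MB.mainInt8;
  const base = main + SIZES_MB.decoder + SIZES_MB.tokenizer;
  return voiceCloning ? base + SIZES_MB.encoder : base;
}
```

Under these assumptions, the INT8 setup without voice cloning comes to about 680 MB, and adding the encoder brings it to about 1.3 GB. In a browser, the memory estimate could come from `navigator.deviceMemory` where that property is available.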

These models are designed to run in the browser via VocoLoco, a fully client-side TTS application. No server required.

License: Apache 2.0, the same as the original OmniVoice model.

Based on OmniVoice by Xiaomi Corp (k2-fsa).
