MLX 8-bit quantized conversion of Qwen/Qwen3-TTS-12Hz-0.6B-Base for Apple Silicon inference.
Used by the speech-swift `Qwen3TTS` module:

    let model = try await Qwen3TTSModel.fromPretrained(
        modelId: "aufklarer/Qwen3-TTS-12Hz-0.6B-Base-MLX-8bit"
    )
    let audio = try model.synthesize("Hello, world!")

From the command line:

    audio speak "Hello, world!" --model base-8bit -o output.wav

| Variant | Quantization | Size | Model ID |
|---|---|---|---|
| 0.6B 4-bit | 4-bit | ~981 MB | aufklarer/Qwen3-TTS-12Hz-0.6B-Base-MLX-4bit |
| 0.6B 8-bit | 8-bit | ~1.3 GB | aufklarer/Qwen3-TTS-12Hz-0.6B-Base-MLX-8bit |
| 1.7B 4-bit | 4-bit | ~1.7 GB | aufklarer/Qwen3-TTS-12Hz-1.7B-Base-MLX-4bit |
| 1.7B 8-bit | 8-bit | ~2.8 GB | aufklarer/Qwen3-TTS-12Hz-1.7B-Base-MLX-8bit |
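
To choose among the variants above programmatically, one option is to pick the largest model whose listed size fits a given budget. The sketch below is only an illustration: `TTSVariant` and `largestVariant(fittingMB:)` are hypothetical names that are not part of speech-swift, the sizes are the approximate figures from the table, and treating the Size column as a budget is an assumption, since memory use at inference time may differ from the listed size.

```swift
// Hypothetical helper, not part of speech-swift.
// Names, sizes, and model IDs are copied from the table above.
struct TTSVariant {
    let name: String
    let approxSizeMB: Int   // approximate size from the table (1 GB taken as 1000 MB)
    let modelId: String
}

let variants = [
    TTSVariant(name: "0.6B 4-bit", approxSizeMB: 981,
               modelId: "aufklarer/Qwen3-TTS-12Hz-0.6B-Base-MLX-4bit"),
    TTSVariant(name: "0.6B 8-bit", approxSizeMB: 1300,
               modelId: "aufklarer/Qwen3-TTS-12Hz-0.6B-Base-MLX-8bit"),
    TTSVariant(name: "1.7B 4-bit", approxSizeMB: 1700,
               modelId: "aufklarer/Qwen3-TTS-12Hz-1.7B-Base-MLX-4bit"),
    TTSVariant(name: "1.7B 8-bit", approxSizeMB: 2800,
               modelId: "aufklarer/Qwen3-TTS-12Hz-1.7B-Base-MLX-8bit"),
]

/// Returns the largest variant whose approximate size fits the budget,
/// or nil if even the smallest variant is too large.
func largestVariant(fittingMB budgetMB: Int) -> TTSVariant? {
    variants
        .filter { $0.approxSizeMB <= budgetMB }
        .max { $0.approxSizeMB < $1.approxSizeMB }
}

// Example: with a budget of about 1.5 GB, the 0.6B 8-bit variant is selected.
if let choice = largestVariant(fittingMB: 1500) {
    print(choice.modelId)
}
```

The resulting `modelId` string can then be passed to `Qwen3TTSModel.fromPretrained(modelId:)` as in the usage example.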

Base model: Qwen/Qwen3-TTS-12Hz-0.6B-Base