GigaAM v3 e2e CTC — MLX

An MLX port of GigaAM-v3 for fast Russian speech recognition on Apple Silicon. Runs at roughly 180x realtime on an M2 Max.

Usage

Install from PyPI:

pip install gigaam-mlx

Then in Python:

from gigaam_mlx import load_model, transcribe

model, tokenizer = load_model()  # downloads weights automatically
text = transcribe(model, tokenizer, "recording.wav")
print(text)

Or via CLI:

gigaam-mlx recording.wav
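For recordings longer than the 20-second chunk used in the benchmarks below, one option is to split the waveform and transcribe each piece. This is a hypothetical sketch, not part of the gigaam-mlx API: `chunk_audio` and its parameters (16 kHz mono samples, 20-second chunks) are assumptions for illustration.

```python
# Hypothetical helper: split a mono waveform into fixed-length chunks,
# each of which could then be passed to transcribe(). The sample rate
# and chunk length here are assumptions, not part of gigaam-mlx.
def chunk_audio(samples, sample_rate=16000, chunk_seconds=20.0):
    """Yield successive chunks of at most chunk_seconds of audio."""
    step = int(sample_rate * chunk_seconds)
    for start in range(0, len(samples), step):
        yield samples[start:start + step]

# 50 seconds of silence at 16 kHz splits into 20 s + 20 s + 10 s.
fake = [0.0] * (16000 * 50)
chunks = list(chunk_audio(fake))
```

Joining the per-chunk transcripts with spaces is the simplest strategy; smarter approaches would split on silence to avoid cutting words.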

Performance

MacBook Pro M2 Max, 20-second chunk:

Backend            Time    Realtime
MLX CTC (this)     0.11 s  180x
PyTorch MPS RNNT   0.76 s  26x
ONNX CPU CTC       1.66 s  12x
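The realtime column is just audio duration divided by wall-clock time. Reproducing the table's figures for the 20-second chunk:

```python
# Realtime factor = audio duration / wall-clock time, using the
# timings from the table above for a 20-second chunk.
audio_seconds = 20.0
times = {"MLX CTC": 0.11, "PyTorch MPS RNNT": 0.76, "ONNX CPU CTC": 1.66}
rtf = {name: audio_seconds / t for name, t in times.items()}
# 20 / 0.11 ~ 182x, 20 / 0.76 ~ 26x, 20 / 1.66 ~ 12x
```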

Model

  • Architecture: Conformer (16 layers, 768d, 16 heads, RoPE) + CTC
  • Parameters: 220M
  • Vocabulary: 257 tokens (SentencePiece)
  • Features: Punctuation, text normalization, Russian + English code-switching
