GigaAM: Efficient Self-Supervised Learner for Speech Recognition
Paper • 2506.01192 • Published
MLX port of GigaAM-v3 for fast Russian speech recognition on Apple Silicon. Runs at about 180x realtime on an M2 Max.
pip install gigaam-mlx
from gigaam_mlx import load_model, transcribe
model, tokenizer = load_model() # downloads weights automatically
text = transcribe(model, tokenizer, "recording.wav")
print(text)
Or via CLI:
gigaam-mlx recording.wav
Benchmark on a MacBook Pro M2 Max, transcribing a 20-second audio chunk:
| Backend | Time | Realtime |
|---|---|---|
| MLX CTC (this) | 0.11s | 180x |
| PyTorch MPS RNNT | 0.76s | 26x |
| ONNX CPU CTC | 1.66s | 12x |
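The Realtime column appears to be the chunk duration divided by the processing time, rounded. Below is a minimal sketch of that arithmetic using the timings from the table. The helper names `realtime_factor` and `timed` are illustrative only and are not part of `gigaam_mlx`.

```python
import time
from typing import Callable, Tuple

CHUNK_SECONDS = 20.0  # chunk length from the benchmark above

# Processing times in seconds, copied from the table.
TIMINGS = {
    "MLX CTC": 0.11,
    "PyTorch MPS RNNT": 0.76,
    "ONNX CPU CTC": 1.66,
}


def realtime_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Seconds of audio processed per second of wall-clock time."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


def timed(fn: Callable[[], str]) -> Tuple[str, float]:
    """Run fn once; return its result and the elapsed wall-clock seconds."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


factors = {
    name: realtime_factor(CHUNK_SECONDS, seconds)
    for name, seconds in TIMINGS.items()
}

for name, factor in factors.items():
    print(f"{name}: {factor:.1f}x realtime")
```

To measure your own machine, wrap the call in the helper, for example `text, elapsed = timed(lambda: transcribe(model, tokenizer, "recording.wav"))`, then pass the recording's duration and `elapsed` to `realtime_factor`. A single run includes any warm-up cost, so averaging several runs should give a steadier number.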
Quantized variant of the base model ai-sage/GigaAM-v3