---
library_name: mlx
license: mit
language:
- ru
- en
tags:
- automatic-speech-recognition
- mlx
- apple-silicon
- russian
- gigaam
- conformer
- rnnt
base_model: ai-sage/GigaAM-v3
pipeline_tag: automatic-speech-recognition
---

# GigaAM v3 e2e RNNT — MLX

MLX port of the RNNT variant of [GigaAM-v3](https://github.com/salute-developers/GigaAM) for Apple Silicon. It gives higher transcription quality than the CTC variant and runs at about 77x real time on an M2 Max.

## Usage

```bash
pip install git+https://github.com/aystream/gigaam-mlx.git
```

```python
from gigaam_mlx import load_model, transcribe

model, tokenizer = load_model("rnnt")  # downloads the model automatically
text = transcribe(model, tokenizer, "recording.wav")
```

Or via CLI:

```bash
gigaam-mlx recording.wav --model-type rnnt
```
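
To transcribe several recordings, it is usually worth loading the model once and reusing it for every file. The sketch below is one possible way to do that, not part of the package: `transcribe_many` and `wav_files` are hypothetical helpers, and the folder name `recordings` is made up. It assumes only the two calls shown above, `load_model("rnnt")` and `transcribe(model, tokenizer, path)`.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable, List, Union

PathLike = Union[str, Path]


def wav_files(folder: PathLike) -> List[Path]:
    """Return the .wav files directly inside `folder`, sorted by name."""
    return sorted(Path(folder).glob("*.wav"))


def transcribe_many(
    paths: Iterable[PathLike],
    transcribe_one: Callable[[str], str],
) -> Dict[str, str]:
    """Call `transcribe_one` on each path and collect {path: text}."""
    results: Dict[str, str] = {}
    for path in paths:
        key = str(path)
        results[key] = transcribe_one(key)
    return results


# Intended use with this package (needs gigaam-mlx installed on Apple Silicon):
#
#   from gigaam_mlx import load_model, transcribe
#
#   model, tokenizer = load_model("rnnt")  # load once, reuse for every file
#   texts = transcribe_many(
#       wav_files("recordings"),  # hypothetical folder name
#       lambda path: transcribe(model, tokenizer, path),
#   )
#   for path, text in texts.items():
#       print(path, text)
```

Passing the transcription step in as a callable keeps the loop independent of the model, so the same helper can be tried with a stub function on a machine without MLX.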

## CTC vs RNNT

| Variant | Time per 20 s chunk (speed vs. real time) | Quality | Time for full 18-min video |
|---|---|---|---|
| [CTC](https://huggingface.co/aystream/GigaAM-v3-e2e-ctc-mlx) | 0.06 s (~330x) | Good | 21.5 s |
| **RNNT (this model)** | **0.26 s (~77x)** | **Better** | **25.0 s** |
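
The speed multipliers in the table are the audio duration divided by the processing time. The short sketch below redoes that arithmetic from the table's timings. The helper name `realtime_factor` is hypothetical, and the end-to-end figures assume the "18-min" video is exactly 1080 seconds long, which the table does not state.

```python
def realtime_factor(audio_seconds: float, processing_seconds: float) -> float:
    """How many seconds of audio are processed per second of compute."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


CHUNK_SECONDS = 20.0
VIDEO_SECONDS = 18 * 60  # assumption: the video is exactly 18 minutes long

# (time per 20 s chunk, time for the full video), in seconds, from the table
timings = {"CTC": (0.06, 21.5), "RNNT": (0.26, 25.0)}

for name, (chunk_time, video_time) in timings.items():
    per_chunk = realtime_factor(CHUNK_SECONDS, chunk_time)
    end_to_end = realtime_factor(VIDEO_SECONDS, video_time)
    print(f"{name}: {per_chunk:.0f}x per chunk, {end_to_end:.0f}x end to end")

# Prints:
# CTC: 333x per chunk, 50x end to end
# RNNT: 77x per chunk, 43x end to end
```

Under that assumption, the per-chunk figures match the table (~330x and ~77x), while the gap between the two variants is much smaller over the full video (about 50x vs. 43x).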

## Links

- **Code:** [github.com/aystream/gigaam-mlx](https://github.com/aystream/gigaam-mlx)
- **CTC variant:** [aystream/GigaAM-v3-e2e-ctc-mlx](https://huggingface.co/aystream/GigaAM-v3-e2e-ctc-mlx)
- **Original:** [salute-developers/GigaAM](https://github.com/salute-developers/GigaAM) ([paper](https://arxiv.org/abs/2506.01192))
- **License:** MIT