Parakeet-TDT-ExecuTorch-MLX
Pre-exported ExecuTorch .pte file for Parakeet TDT 0.6B with the MLX backend on Apple Silicon.
This variant uses:
- MLX delegate
- bf16 activations
- 4-bit weight-only quantization for encoder and decoder linear layers
- group size 128
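The 4-bit weight-only scheme above can be sketched as symmetric group-wise quantization. This is an illustrative NumPy model of the arithmetic only, assuming symmetric int4 with one scale per 128-element group; the actual ExecuTorch/MLX kernels differ.

```python
# Illustrative model of 4-bit weight-only, group-wise quantization
# (group size 128) for linear-layer weights. Arithmetic sketch only,
# not the real ExecuTorch/MLX implementation.
import numpy as np

def quantize_4bit(w, group_size=128):
    """Symmetric int4 quantization with one float scale per group."""
    groups = w.reshape(-1, group_size)
    scale = np.abs(groups).max(axis=1, keepdims=True) / 7.0  # int4 range [-8, 7]
    scale = np.maximum(scale, 1e-12)                         # guard all-zero groups
    q = np.clip(np.round(groups / scale), -8, 7).astype(np.int8)
    return q, scale.astype(np.float32)

def dequantize(q, scale):
    return (q.astype(np.float32) * scale).reshape(-1)

rng = np.random.default_rng(0)
w = rng.standard_normal(4 * 128).astype(np.float32)
q, scale = quantize_4bit(w)
w_hat = dequantize(q, scale)
# Per-element reconstruction error is bounded by half a quantization
# step, i.e. scale / 2 for each group.
```

Storing one scale per 128 weights keeps quantization metadata small while letting each group adapt to its local weight range.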
For the Metal (Apple GPU) variant, see Parakeet-TDT-ExecuTorch-Metal.
Installation
git clone https://github.com/pytorch/executorch/ ~/executorch
cd ~/executorch
make parakeet-mlx
Download
hf download younghan-meta/Parakeet-TDT-ExecuTorch-MLX --local-dir ~/parakeet_mlx
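The runner in the next step takes a WAV file. If you do not have one handy, a minimal test clip can be generated with the Python standard library; the 16 kHz mono 16-bit format is an assumption based on the sample rate Parakeet ASR models are trained on.

```python
# Generate a 1-second 440 Hz sine tone as a 16 kHz mono 16-bit WAV.
# A tone won't transcribe to meaningful text, but it exercises the
# runner end to end. Assumes Parakeet expects 16 kHz mono input.
import math
import struct
import wave

def write_test_wav(path, seconds=1.0, rate=16000, freq=440.0):
    n = int(seconds * rate)
    frames = b"".join(
        struct.pack("<h", int(10000 * math.sin(2 * math.pi * freq * i / rate)))
        for i in range(n)
    )
    with wave.open(path, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 16-bit PCM
        f.setframerate(rate)
        f.writeframes(frames)

write_test_wav("test.wav")
```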
Run
cmake-out/examples/models/parakeet/parakeet_runner \
--model_path ~/parakeet_mlx/model.pte \
--tokenizer_path ~/parakeet_mlx/tokenizer.model \
--audio_path /path/to/audio.wav \
--timestamps none
Optional flags:
- --timestamps segment for segment timestamps
- --timestamps word for word timestamps
- --timestamps all for token, word, and segment timestamps
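For scripting, the runner invocation above can be wrapped from Python. The flag names are copied from the commands in this card; the binary path and file paths are examples and depend on where you built ExecuTorch and downloaded the model.

```python
# Build (and optionally run) a parakeet_runner command from Python.
# Flag names match the CLI shown above; all paths are placeholders.
import subprocess

def runner_command(model, tokenizer, audio, timestamps="none",
                   runner="cmake-out/examples/models/parakeet/parakeet_runner"):
    return [
        runner,
        "--model_path", model,
        "--tokenizer_path", tokenizer,
        "--audio_path", audio,
        "--timestamps", timestamps,  # none | segment | word | all
    ]

def transcribe(*args, **kwargs):
    # Raises CalledProcessError if the runner exits non-zero.
    proc = subprocess.run(runner_command(*args, **kwargs),
                          capture_output=True, text=True, check=True)
    return proc.stdout

cmd = runner_command("model.pte", "tokenizer.model", "audio.wav",
                     timestamps="word")
```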
Export Command
pip install "nemo_toolkit[asr]"
python examples/models/parakeet/export_parakeet_tdt.py \
--backend mlx \
--dtype bf16 \
--qlinear_encoder 4w \
--qlinear_encoder_group_size 128 \
--qlinear 4w \
--qlinear_group_size 128 \
--output-dir ./parakeet_mlx
This export produces:
- model.pte
- tokenizer.model
No separate delegate data blob is required for MLX.
Base model: nvidia/parakeet-tdt-0.6b-v3