Parakeet TDT 0.6B v3 - MLX
MLX safetensors conversion of nvidia/parakeet-tdt-0.6b-v3 for Apple Silicon.
Architecture
- Encoder: Conformer (1024 hidden, pre-encoding convolutions + transformer layers)
- Decoder: TDT Transducer (predictor LSTM + joint network, 5 duration classes: 0-4)
- Vocabulary: 1025 tokens (SentencePiece)
- Parameters: ~0.6B
- Audio input: 16 kHz mono, 128 mel bins
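To make the role of the duration classes concrete, here is a minimal sketch of greedy TDT decoding in plain Python. It is not the code shipped with this repository: `joint_step` is a hypothetical callback standing in for the encoder frame lookup, predictor LSTM, and joint network, and `blank_id` and `max_symbols_per_frame` are assumed parameters. The only detail taken from this card is the set of five duration classes (0-4).

```python
from typing import Callable, List, Sequence, Tuple

# The five duration classes listed above: how many encoder frames to skip.
DURATIONS = (0, 1, 2, 3, 4)

# Hypothetical callback: (frame index, tokens so far) -> (token logits, duration logits).
JointStep = Callable[[int, List[int]], Tuple[Sequence[float], Sequence[float]]]


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def tdt_greedy_decode(
    num_frames: int,
    joint_step: JointStep,
    blank_id: int,
    max_symbols_per_frame: int = 10,
) -> List[int]:
    """Greedy decoding: pick the best token and the best duration at each step."""
    tokens: List[int] = []
    t = 0
    emitted_here = 0  # non-blank tokens emitted without advancing in time
    while t < num_frames:
        token_logits, duration_logits = joint_step(t, tokens)
        token = argmax(token_logits)
        skip = DURATIONS[argmax(duration_logits)]

        if token != blank_id:
            tokens.append(token)
            emitted_here += 1

        # Guarantee progress: a blank with duration 0, or too many tokens
        # on the same frame, forces a one-frame advance.
        if skip == 0 and (token == blank_id or emitted_here >= max_symbols_per_frame):
            skip = 1
        if skip > 0:
            emitted_here = 0
        t += skip
    return tokens
```

Unlike a standard transducer, which advances one frame per blank, the predicted duration lets the decoder jump up to four frames at once, so fewer joint network calls are needed per utterance.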
Contents
| File | Description |
|---|---|
| model.safetensors | All weights (encoder + predictor + joint), float32, ~2.3 GB |
| config.json | Full NeMo model configuration |
| tokenizer.model | SentencePiece tokenizer |
| tokenizer.vocab | Tokenizer vocabulary |
| vocab.txt | Text vocabulary |
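One way to check what the weights file contains without loading ~2.3 GB into memory is to read only the safetensors header, which is an 8-byte little-endian length followed by a JSON index. The sketch below uses only the standard library. Grouping by the first dotted component of each tensor name is an assumption about how this file names its tensors, and `summarize` is an illustrative helper, not part of any library.

```python
import json
import struct
from collections import defaultdict
from math import prod
from typing import Dict


def read_safetensors_header(path: str) -> Dict[str, dict]:
    """Return the tensor index of a safetensors file without reading tensor data."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # little-endian uint64
        header = json.loads(f.read(header_len))
    header.pop("__metadata__", None)  # optional free-form metadata entry
    return header


def summarize(header: Dict[str, dict]) -> Dict[str, Dict[str, int]]:
    """Count tensors, parameters, and bytes per top-level name prefix."""
    summary = defaultdict(lambda: {"tensors": 0, "params": 0, "bytes": 0})
    for name, info in header.items():
        group = name.split(".", 1)[0]  # assumed: names look like "encoder.xxx"
        start, end = info["data_offsets"]
        summary[group]["tensors"] += 1
        summary[group]["params"] += prod(info["shape"])
        summary[group]["bytes"] += end - start
    return dict(summary)


# Example usage (requires the file from this repository):
# for group, stats in summarize(read_safetensors_header("model.safetensors")).items():
#     print(group, stats)
```

Summing `params` across groups should land near the ~0.6B figure above, and at 4 bytes per float32 parameter the byte total should be close to the listed file size.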
Notes
- Weights converted from NeMo PyTorch format to MLX safetensors
- Convolution weights use MLX layout (OHWI for 2D, OKI for 1D) and are not directly compatible with PyTorch
- CTC head, preprocessor, spec augmentation, and loss weights are excluded (inference only)
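The layout difference for convolution weights comes down to an axis permutation. PyTorch stores them as (out, in, height, width) for 2D and (out, in, kernel) for 1D, while MLX puts the input-channel axis last. The sketch below shows the permutation with NumPy. The function names are illustrative, this is not the conversion script used for this repository, and it assumes you already know which tensors are convolution weights.

```python
import numpy as np


def torch_to_mlx_conv(weight: np.ndarray) -> np.ndarray:
    """Move the input-channel axis last: OIHW -> OHWI (2D) or OIK -> OKI (1D)."""
    if weight.ndim == 4:
        return np.transpose(weight, (0, 2, 3, 1))
    if weight.ndim == 3:
        return np.transpose(weight, (0, 2, 1))
    raise ValueError(f"expected a 3D or 4D conv weight, got {weight.ndim}D")


def mlx_to_torch_conv(weight: np.ndarray) -> np.ndarray:
    """Inverse permutation: OHWI -> OIHW (2D) or OKI -> OIK (1D)."""
    if weight.ndim == 4:
        return np.transpose(weight, (0, 3, 1, 2))
    if weight.ndim == 3:
        return np.transpose(weight, (0, 2, 1))
    raise ValueError(f"expected a 3D or 4D conv weight, got {weight.ndim}D")
```

Applying `mlx_to_torch_conv` to the convolution tensors in this file would be the first step if you wanted to compare them against the upstream PyTorch checkpoint. Linear and LSTM weights are not covered here.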
License
CC-BY-4.0, following the upstream nvidia/parakeet-tdt-0.6b-v3 license. Attribution to NVIDIA is required.
Source
Converted from nvidia/parakeet-tdt-0.6b-v3.