Parakeet TDT 0.6B v3 – MLX

MLX safetensors conversion of nvidia/parakeet-tdt-0.6b-v3 for Apple Silicon.

Architecture

  • Encoder: Conformer (1024 hidden, pre-encoding convolutions + transformer layers)
  • Decoder: TDT Transducer (predictor LSTM + joint network, 5 duration classes: 0-4)
  • Vocabulary: 1025 tokens (SentencePiece)
  • Parameters: ~0.6B
  • Audio input: 16 kHz mono, 128 mel bins
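The TDT (Token-and-Duration Transducer) decoder differs from a standard RNN-T in that the joint network predicts both a token and a duration class; the duration tells the decoder how many encoder frames to skip before the next step. A minimal greedy-decode sketch in NumPy, where `joint_fn` is a hypothetical stand-in for the real predictor + joint network (the actual model's decode loop lives in the MLX inference code, not shown here):

```python
import numpy as np

def tdt_greedy_decode(joint_fn, num_frames, blank_id=1024, max_symbols=10):
    """Greedy TDT decode sketch. `joint_fn(t, prev)` is a hypothetical
    stand-in returning (token_logits, duration_logits) for encoder
    frame t and previous non-blank token `prev`."""
    tokens = []
    t, prev, emitted = 0, blank_id, 0
    while t < num_frames:
        tok_logits, dur_logits = joint_fn(t, prev)
        token = int(np.argmax(tok_logits))
        dur = int(np.argmax(dur_logits))  # one of the 5 duration classes: 0-4
        if token != blank_id:
            tokens.append(token)
            prev = token
            emitted += 1
        # duration 0 emits another token at the same frame; force a
        # one-frame advance on blank (or after too many emissions) to avoid stalling
        if token == blank_id or emitted >= max_symbols:
            dur = max(dur, 1)
        t += dur
        if dur > 0:
            emitted = 0
    return tokens
```

The frame-skipping is what makes TDT decoding faster than classic RNN-T greedy search, which advances one frame at a time.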

Contents

| File | Description |
| --- | --- |
| model.safetensors | All weights (encoder + predictor + joint), float32, ~2.3 GB |
| config.json | Full NeMo model configuration |
| tokenizer.model | SentencePiece tokenizer |
| tokenizer.vocab | Tokenizer vocabulary |
| vocab.txt | Text vocabulary |
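You can inspect the tensor names, dtypes, and shapes in model.safetensors without loading the ~2.3 GB of weights, using only the standard library: the safetensors format begins with an 8-byte little-endian header length followed by a JSON index. A small sketch:

```python
import json
import struct

def read_safetensors_header(path):
    """List tensor names, dtypes, and shapes from a .safetensors file
    without reading the weight data itself."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    header.pop("__metadata__", None)  # optional file-level metadata, if present
    return {name: (t["dtype"], t["shape"]) for name, t in header.items()}
```

For example, `read_safetensors_header("model.safetensors")` should enumerate the encoder, predictor, and joint-network parameters described above.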

Notes

  • Weights converted from NeMo PyTorch format to MLX safetensors
  • Convolution weights use MLX layout (OHWI for 2D, OKI for 1D) – not directly compatible with PyTorch
  • CTC head, preprocessor, spec augmentation, and loss weights are excluded (inference only)
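The layout difference comes from MLX's channels-last kernel convention versus PyTorch's channels-first one, so converting a convolution weight is a single axis permutation. A sketch in NumPy (the shapes below are illustrative placeholders, not this model's actual layer sizes):

```python
import numpy as np

# Hypothetical PyTorch-layout kernels (zeros as placeholders, not real weights).
w2d_torch = np.zeros((512, 1, 3, 3), dtype=np.float32)   # Conv2d: (O, I, H, W)
w1d_torch = np.zeros((1024, 1024, 9), dtype=np.float32)  # Conv1d: (O, I, K)

# MLX expects channels-last kernels: OHWI for 2D, OKI for 1D.
w2d_mlx = w2d_torch.transpose(0, 2, 3, 1)  # (O, I, H, W) -> (O, H, W, I)
w1d_mlx = w1d_torch.transpose(0, 2, 1)     # (O, I, K)    -> (O, K, I)
```

Applying the inverse permutations would recover the PyTorch layout, but the weights in this repository are stored MLX-side only.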

License

CC-BY-4.0, following the upstream nvidia/parakeet-tdt-0.6b-v3 license. Attribution to NVIDIA is required.

Source

Converted from nvidia/parakeet-tdt-0.6b-v3.
