# Parakeet TDT 110M – Web/SafeTensors Export
A browser-optimized SafeTensors export of `nvidia/parakeet-tdt_ctc-110m`.
## Model Details
| Property | Value |
|---|---|
| Base model | nvidia/parakeet-tdt_ctc-110m |
| Architecture | FastConformer-TDT+CTC hybrid (17 layers, d_model=512) |
| Parameters | ~110M |
| Decoder | Token-and-Duration Transducer (TDT), 2–5× faster than RNNT |
| Language | English |
| Weights format | SafeTensors, float16 (~220 MB) |
| Vocab size | 1025 tokens (SentencePiece BPE) |
| Mel bands | 80 |
| TDT durations | [0, 1, 2, 3, 4] |
| Context | Full attention [-1, -1] (offline/batch mode) |
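The SafeTensors container listed above is straightforward to inspect directly in the browser or Node: per the format specification, the file starts with an 8-byte little-endian header length, followed by that many bytes of JSON mapping tensor names to dtype, shape, and byte offsets. A minimal parser sketch (no library assumed):

```javascript
// Parse a SafeTensors header from an ArrayBuffer.
// Layout: 8-byte little-endian uint64 header length, then that many bytes
// of JSON mapping tensor name -> { dtype, shape, data_offsets }.
function parseSafetensorsHeader(buffer) {
  const view = new DataView(buffer);
  const headerLen = Number(view.getBigUint64(0, true)); // little-endian
  const headerBytes = new Uint8Array(buffer, 8, headerLen);
  return JSON.parse(new TextDecoder().decode(headerBytes));
}

// Example: list tensor names, dtypes, and shapes after fetching the file.
// const buf = await (await fetch(`${base}/model.safetensors`)).arrayBuffer();
// for (const [name, meta] of Object.entries(parseSafetensorsHeader(buf))) {
//   if (name !== '__metadata__') console.log(name, meta.dtype, meta.shape);
// }
```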
## Files
- `model.safetensors` – all weights in float16
- `model_config.json` – architecture hyperparameters
- `vocab.json` – token ID → text mapping
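To illustrate the token ID → text mapping, here is a hypothetical decode helper. It assumes `vocab.json` is a flat JSON object from token-ID strings to SentencePiece pieces, with `▁` marking word boundaries; both points are assumptions about this export, not confirmed details:

```javascript
// Hypothetical helper: map a sequence of token IDs back to text.
// ASSUMES vocab is { "0": "▁hel", "1": "lo", ... } with SentencePiece "▁"
// word-boundary markers; verify against the actual vocab.json layout.
function decodeTokens(ids, vocab) {
  return ids
    .map((id) => vocab[String(id)] ?? '')
    .join('')
    .replaceAll('\u2581', ' ') // "▁" -> space
    .trim();
}
```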
## Usage with audio-ml
```js
const base = 'https://huggingface.co/AbijahKaj/parakeet-tdt-110m-web/resolve/main';

const config = await fetch(`${base}/model_config.json`).then(r => r.text());
const vocab = await fetch(`${base}/vocab.json`).then(r => r.text());
const weights = await fetch(`${base}/model.safetensors`).then(r => r.arrayBuffer());

await recognizer.loadFromBuffers(weights, config, vocab);
```
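Since the weights are ~220 MB, it may be worth streaming the download and reporting progress rather than waiting on a single `arrayBuffer()` call. A sketch using only standard Fetch/Streams APIs (nothing audio-ml-specific is assumed):

```javascript
// Read a Response body chunk by chunk, reporting progress as bytes arrive.
// `total` is 0 when the server omits Content-Length.
async function readWithProgress(response, onProgress) {
  const total = Number(response.headers.get('content-length')) || 0;
  const reader = response.body.getReader();
  const chunks = [];
  let received = 0;
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    chunks.push(value);
    received += value.length;
    onProgress(received, total);
  }
  // Concatenate chunks into one contiguous buffer for loadFromBuffers().
  const out = new Uint8Array(received);
  let offset = 0;
  for (const c of chunks) { out.set(c, offset); offset += c.length; }
  return out.buffer;
}

// Usage sketch:
// const res = await fetch(`${base}/model.safetensors`);
// const weights = await readWithProgress(res, (got, total) =>
//   console.log(`downloaded ${got}${total ? ` / ${total}` : ''} bytes`));
```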
## Export Process
Converted from the original NeMo checkpoint using:
```bash
python tools/export_nemo_to_safetensors.py \
    --model nvidia/parakeet-tdt_ctc-110m \
    --output-dir exported/parakeet-tdt-110m
```
## Attribution
This is a format conversion (NeMo → SafeTensors fp16) of NVIDIA's original model. No fine-tuning or weight modification was performed. All credit for the model architecture and training goes to NVIDIA. See the original model card for full details, benchmarks, and license terms.
**License:** CC-BY-4.0 (inherited from the original model)