Voice Scribe mirror parakeet_nvidia from goodsmileduck/parakeet-tdt-0.6b-v3-onnx@cd3de0d7a01b
- .gitattributes +1 -0
- README.md +54 -0
- UPSTREAM_SOURCE.md +50 -0
- config.json +5 -0
- decoder_joint-model.onnx +3 -0
- encoder-model.onnx +3 -0
- encoder-model.onnx.data +3 -0
- nemo128.onnx +3 -0
- vocab.txt +0 -0
- voicescribe-model-layout.json +33 -0
.gitattributes
CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+encoder-model.onnx.data filter=lfs diff=lfs merge=lfs -text
README.md
ADDED
@@ -0,0 +1,54 @@
---
tags:
- onnx
- openvino
- speech-recognition
- npu
- parakeet
- nvidia
- nemo
language: en
license: apache-2.0
base_model: nvidia/parakeet-tdt-0.6b-v3
---

# Parakeet TDT 0.6B v3 — ONNX (NPU-ready)

ONNX export of [nvidia/parakeet-tdt-0.6b-v3](https://huggingface.co/nvidia/parakeet-tdt-0.6b-v3) for use with OpenVINO on Intel NPU.

Includes the bundled NeMo mel spectrogram preprocessor (`nemo128.onnx`) for a self-contained pipeline.

## Files

| File | Size | Description |
|------|------|-------------|
| `encoder-model.onnx` + `encoder-model.onnx.data` | ~2.5 GB | Conformer encoder (runs on NPU) |
| `decoder_joint-model.onnx` | 73 MB | TDT joint decoder (runs on CPU) |
| `nemo128.onnx` | 141 KB | Mel spectrogram preprocessor (onnxruntime CPU) |
| `vocab.txt` | 94 KB | 8193-token vocabulary |
| `config.json` | 97 B | Model metadata |

## Pipeline

## Performance (Intel Core Ultra / Meteor Lake NPU)

| Metric | Value |
|--------|-------|
| Load time (cached) | 3.6s |
| Transcribe 3s audio | 0.29s (RTF 0.095) |
| WER (LibriSpeech test-clean) | 3.7% |
| Max audio length | ~16s (MEL_FRAMES=1600) |

## Usage

Used by [npu-whisper](https://github.com/goodsmileduck/npu-whisper) dictation engine:

## Credits

- Original model: [nvidia/parakeet-tdt-0.6b-v3](https://huggingface.co/nvidia/parakeet-tdt-0.6b-v3)
- ONNX export by: [istupakov/parakeet-tdt-0.6b-v3-onnx](https://huggingface.co/istupakov/parakeet-tdt-0.6b-v3-onnx)
- Preprocessor from: [onnx-asr](https://pypi.org/project/onnx-asr/) package
UPSTREAM_SOURCE.md
ADDED
@@ -0,0 +1,50 @@
# Voice Scribe Model Mirror

This repository is a Voice Scribe distribution mirror. The model artifacts are
copied from the upstream repository and the source revision below is pinned.

| Field | Value |
| --- | --- |
| Layout key | `parakeet_nvidia` |
| Target directory in installer | `parakeet-v3-onnx` |
| Upstream repo | `goodsmileduck/parakeet-tdt-0.6b-v3-onnx` |
| Upstream revision | `cd3de0d7a01b8981c51ce17a4667a2177f6e09d6` |
| Upstream resolved SHA | `cd3de0d7a01b8981c51ce17a4667a2177f6e09d6` |
| Mirror created | `2026-04-23T22:30:27Z` |
| Description | Parakeet-TDT 0.6B v3 ONNX NVIDIA layout. |
| License metadata | `{"license": "apache-2.0", "license_files": [], "license_tags": ["license:apache-2.0"]}` |

## Installer Contract

This mirror corresponds to `parakeet/installer/wrapper/model_catalog.py`.
Required files for installer validation:

```json
[
  "config.json",
  "vocab.txt",
  "nemo128.onnx",
  "encoder-model.onnx",
  "encoder-model.onnx.data",
  "decoder_joint-model.onnx"
]
```

Allowed installer subset patterns:

```json
[
  "config.json",
  "vocab.txt",
  "nemo128.onnx",
  "encoder-model.onnx",
  "encoder-model.onnx.data",
  "decoder_joint-model.onnx"
]
```

## Redistribution Note

Do not make this repository public unless the upstream license and model card
allow redistribution for the intended use. Private mirrors are for operational
distribution convenience and reproducible installs.
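The "Required files for installer validation" list lends itself to a simple presence check. The following is a minimal sketch of such a check, not the contents of `model_catalog.py` (which this commit does not show); the function name `missing_required_files` is hypothetical.

```python
from pathlib import Path

# Required files, copied from the "Installer Contract" list in UPSTREAM_SOURCE.md.
REQUIRED_FILES = [
    "config.json",
    "vocab.txt",
    "nemo128.onnx",
    "encoder-model.onnx",
    "encoder-model.onnx.data",
    "decoder_joint-model.onnx",
]


def missing_required_files(model_dir: Path, required=REQUIRED_FILES) -> list[str]:
    """Return the required file names that are not regular files in model_dir.

    Hypothetical helper: the real installer may apply additional checks.
    """
    return [name for name in required if not (model_dir / name).is_file()]
```

An installer could treat a non-empty result as a failed validation and report the missing names to the user.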
config.json
ADDED
@@ -0,0 +1,5 @@
{
  "model_type": "nemo-conformer-tdt",
  "features_size": 128,
  "subsampling_factor": 8
}
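The `subsampling_factor` in `config.json`, combined with the `MEL_FRAMES=1600` limit quoted in the README performance table, determines roughly how many encoder frames a full-length input produces. A minimal sketch of that arithmetic follows. It assumes a 10 ms hop between mel frames, which is not stated anywhere in this commit; the function names are illustrative only.

```python
# Values taken from this commit: config.json and the README performance table.
SUBSAMPLING_FACTOR = 8     # "subsampling_factor" in config.json
MEL_FRAMES = 1600          # fixed input length quoted in the README

# Assumption: 10 ms hop between mel frames (not stated in the commit).
HOP_MS = 10


def max_audio_seconds(mel_frames: int = MEL_FRAMES, hop_ms: int = HOP_MS) -> float:
    """Longest audio clip that fits in a fixed-size mel input."""
    return mel_frames * hop_ms / 1000


def encoder_frames(mel_frames: int = MEL_FRAMES, factor: int = SUBSAMPLING_FACTOR) -> int:
    """Approximate encoder output length after time subsampling."""
    return mel_frames // factor


print(max_audio_seconds())  # 16.0
print(encoder_frames())     # 200
```

Under the 10 ms assumption the result agrees with the "~16s" figure in the README; the exact encoder frame count may differ slightly depending on padding in the exported graph.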
decoder_joint-model.onnx
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e978ddf6688527182c10fde2eb4b83068421648985ef23f7a86be732be8706c1
size 72520893
encoder-model.onnx
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:98a74b21b4cc0017c1e7030319a4a96f4a9506e50f0708f3a516d02a77c96bb1
size 41770866
encoder-model.onnx.data
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9a22d372c51455c34f13405da2520baefb7125bd16981397561423ed32d24f36
size 2435420160
nemo128.onnx
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:701e0b083b96ad0880b051b95ec5a34d08f62032e7a613112b79410d20e29e0f
size 141206
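Each ONNX artifact in this commit is stored as a Git LFS pointer carrying a SHA-256 digest and a byte size, so a downloaded file can be checked against its pointer. The sketch below shows one way to do that with the standard library; the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of any installer shown here.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its version, SHA-256 digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, _, digest = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo}")
    return {"version": fields["version"], "oid": digest, "size": int(fields["size"])}


def matches_pointer(path: Path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's size and SHA-256 digest match the pointer."""
    if path.stat().st_size != pointer["size"]:
        return False  # cheap check first; avoids hashing a truncated download
    sha = hashlib.sha256()
    with path.open("rb") as fh:
        # Hash in chunks so the ~2.4 GB encoder weights never sit fully in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            sha.update(chunk)
    return sha.hexdigest() == pointer["oid"]


# Pointer for nemo128.onnx, copied from this commit's diff.
NEMO128_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:701e0b083b96ad0880b051b95ec5a34d08f62032e7a613112b79410d20e29e0f
size 141206
"""
```

Checking the size before hashing keeps the common failure case (an interrupted download) fast, which matters most for `encoder-model.onnx.data`.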
vocab.txt
ADDED
The diff for this file is too large to render.
voicescribe-model-layout.json
ADDED
@@ -0,0 +1,33 @@
{
  "schema_version": 1,
  "generated_at": "2026-04-23T22:30:27Z",
  "layout_key": "parakeet_nvidia",
  "target_dir": "parakeet-v3-onnx",
  "upstream_repo": "goodsmileduck/parakeet-tdt-0.6b-v3-onnx",
  "upstream_revision": "cd3de0d7a01b8981c51ce17a4667a2177f6e09d6",
  "upstream_sha": "cd3de0d7a01b8981c51ce17a4667a2177f6e09d6",
  "description": "Parakeet-TDT 0.6B v3 ONNX NVIDIA layout.",
  "required_files": [
    "config.json",
    "vocab.txt",
    "nemo128.onnx",
    "encoder-model.onnx",
    "encoder-model.onnx.data",
    "decoder_joint-model.onnx"
  ],
  "allow_patterns": [
    "config.json",
    "vocab.txt",
    "nemo128.onnx",
    "encoder-model.onnx",
    "encoder-model.onnx.data",
    "decoder_joint-model.onnx"
  ],
  "license_metadata": {
    "license": "apache-2.0",
    "license_tags": [
      "license:apache-2.0"
    ],
    "license_files": []
  }
}
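A consumer of `voicescribe-model-layout.json` would likely want to confirm that the manifest is internally consistent before downloading anything. The sketch below is one possible check, assuming `schema_version` 1 is the only supported version and that `allow_patterns` entries are glob-style patterns; both helper names are hypothetical. The keyword names in `snapshot_kwargs` mirror the `repo_id`, `allow_patterns`, and `local_dir` parameters of `huggingface_hub.snapshot_download`, but whether the Voice Scribe installer uses that function is an assumption.

```python
import fnmatch
import re
from pathlib import Path

FULL_SHA = re.compile(r"[0-9a-f]{40}")


def layout_problems(layout: dict) -> list[str]:
    """Return human-readable consistency problems found in a layout manifest."""
    problems = []
    if layout.get("schema_version") != 1:
        problems.append("unexpected schema_version")
    if not FULL_SHA.fullmatch(layout.get("upstream_revision", "")):
        problems.append("upstream_revision is not a full commit SHA")
    patterns = layout.get("allow_patterns", [])
    for name in layout.get("required_files", []):
        # Every required file must be downloadable under allow_patterns.
        if not any(fnmatch.fnmatch(name, pattern) for pattern in patterns):
            problems.append(f"{name} is not covered by allow_patterns")
    return problems


def snapshot_kwargs(layout: dict, repo_id: str, models_root: Path) -> dict:
    """Build keyword arguments for a filtered download into the target directory."""
    return {
        "repo_id": repo_id,
        "allow_patterns": list(layout["allow_patterns"]),
        "local_dir": str(models_root / layout["target_dir"]),
    }
```

For the manifest added in this commit, `layout_problems` returns an empty list: the revision is a full 40-character SHA and every required file appears verbatim in `allow_patterns`.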