StefanDedalus/parakeet-tdt-0.6b-v3-coreml

This repository is a Core ML derivative distribution of NVIDIA's nvidia/parakeet-tdt-0.6b-v3. It repackages the original multilingual Parakeet TDT model into a split Core ML layout designed for Apple-platform runtimes that load separate encoder, predictor, and joint networks.

This is not an official NVIDIA release. It is a converted and repackaged derivative prepared for Core ML deployment.

Published by Damian Tawrel (StefanDedalus).

What This Model Is

The upstream model is a 600M-parameter multilingual automatic speech recognition (ASR) model for speech-to-text transcription. According to the original NVIDIA model card, it supports automatic language detection, punctuation and capitalization, and both word-level and segment-level timestamps across 25 European languages.

Supported languages:

Bulgarian (bg), Croatian (hr), Czech (cs), Danish (da), Dutch (nl), English (en), Estonian (et), Finnish (fi), French (fr), German (de), Greek (el), Hungarian (hu), Italian (it), Latvian (lv), Lithuanian (lt), Maltese (mt), Polish (pl), Portuguese (pt), Romanian (ro), Slovak (sk), Slovenian (sl), Spanish (es), Swedish (sv), Russian (ru), Ukrainian (uk)

What This Repository Contains

This repository contains the following artifacts:

  • encoder.mlpackage
  • predictor.mlpackage
  • joint.mlpackage
  • parakeet-coreml-manifest.json
  • manifest.json (optional)

The split architecture is intended for runtimes that decode the model in stages:

  • The encoder consumes raw audio samples and produces acoustic features.
  • The predictor consumes previous token state and produces predictor-step features.
  • The joint network merges encoder and predictor features into token logits.
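The staged flow above can be illustrated with a minimal greedy transducer loop. This is a sketch under stated assumptions, not the decoder any particular runtime uses: it is written in Python for brevity, `predictor_step` and `joint_step` are hypothetical callables standing in for the predictor and joint Core ML models, `BLANK_ID` is a placeholder blank index, and the loop advances one encoder frame at a time. TDT models also predict token durations that let a decoder skip frames; that part is deliberately left out here.

```python
from typing import Callable, List, Sequence, Tuple

BLANK_ID = 0  # hypothetical blank index; check the real vocabulary


def greedy_decode(
    encoder_frames: Sequence[Sequence[float]],
    predictor_step: Callable[[int, object], Tuple[Sequence[float], object]],
    joint_step: Callable[[Sequence[float], Sequence[float]], Sequence[float]],
    max_symbols_per_frame: int = 10,
) -> List[int]:
    """Plain greedy transducer decoding over precomputed encoder frames.

    predictor_step(token, state) -> (predictor_features, new_state)
    joint_step(encoder_frame, predictor_features) -> token logits
    Both callables are stand-ins (assumptions), not the package's API.
    """
    tokens: List[int] = []
    state = None
    # Prime the predictor with the blank token as a start symbol (assumption).
    pred_features, next_state = predictor_step(BLANK_ID, state)

    for frame in encoder_frames:
        # Emit at most max_symbols_per_frame tokens before moving on.
        for _ in range(max_symbols_per_frame):
            logits = joint_step(frame, pred_features)
            best = max(range(len(logits)), key=logits.__getitem__)
            if best == BLANK_ID:
                break  # blank: advance to the next encoder frame
            tokens.append(best)
            # Only a non-blank emission updates the predictor state.
            state = next_state
            pred_features, next_state = predictor_step(best, state)
    return tokens
```

The predictor is re-run only after a non-blank emission, which is why the encoder, predictor, and joint networks can be loaded and called as separate models.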

The current package metadata describes:

  • sample rate: 16000 Hz
  • encoder input name: audio_samples
  • encoder length input name: not used by this static-shape export
  • encoder fixed sample count: 32000 (2 seconds of audio at 16000 Hz)
  • encoder output name: encoder_features
  • predictor token input name: token_id
  • joint output name: joint_logits
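Because the encoder accepts exactly 32000 samples per call, longer or shorter audio has to be cut or padded to that length before inference. The sketch below shows one simple way to do that in Python. The non-overlapping windows and zero padding are assumptions for illustration; the package metadata does not prescribe a chunking strategy, and a real runtime may prefer overlapping windows.

```python
SAMPLE_RATE_HZ = 16000       # from the package metadata
FIXED_SAMPLE_COUNT = 32000   # from the package metadata


def split_into_windows(samples, window=FIXED_SAMPLE_COUNT):
    """Split mono samples into fixed-length windows.

    The last window is zero-padded on the right so that every window
    has exactly `window` samples. Empty input yields no windows.
    """
    windows = []
    for start in range(0, len(samples), window):
        chunk = list(samples[start:start + window])
        chunk.extend([0.0] * (window - len(chunk)))
        windows.append(chunk)
    return windows


def window_duration_seconds(window=FIXED_SAMPLE_COUNT, rate=SAMPLE_RATE_HZ):
    """Duration of one encoder window in seconds."""
    return window / rate
```

Each returned window could then be passed as the audio_samples input of the encoder, one call per window.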

Provenance

  • Upstream base model: nvidia/parakeet-tdt-0.6b-v3
  • Original author and publisher: NVIDIA
  • Original model family: Parakeet TDT / NeMo ASR
  • This repository: Core ML conversion and packaging of the upstream release

The original NVIDIA model card remains the authoritative source for training data, evaluation methodology, benchmark numbers, safety notes, and intended use constraints.

License

The upstream model is released by NVIDIA under CC BY 4.0. This Core ML derivative is therefore distributed under CC BY 4.0 as well, unless the publisher applies a stricter downstream policy.

When redistributing or using this package, keep attribution to:

  • NVIDIA as the original model provider
  • the upstream model repository nvidia/parakeet-tdt-0.6b-v3
  • this repository as the Core ML conversion/package maintainer

Intended Use

This repository is intended for:

  • on-device transcription on Apple platforms
  • embedded ASR runtimes that use Core ML
  • offline or hybrid speech-to-text systems that need a local split-TDT package
  • research, prototyping, and product integration where a Core ML package is preferable to a NeMo checkpoint

It is not meant to replace the upstream repository for:

  • training or fine-tuning
  • NeMo-native inference
  • authoritative benchmark interpretation

Limitations

  • This repository is a conversion/package layer, not a retrained model.
  • Accuracy, latency, and memory usage depend on the host runtime, decoder implementation, chunking strategy, quantization settings, and device class.
  • Core ML conversion can introduce runtime-specific constraints that do not exist in the original NeMo release.
  • The package should be validated on target devices before production use.

Recommended Runtime Layout

A host runtime should place the files together in a single directory, for example:

ParakeetCoreML/
  encoder.mlpackage
  predictor.mlpackage
  joint.mlpackage
  parakeet-coreml-manifest.json
  manifest.json

For Apple apps, a common location is:

Application Support/ParakeetCoreML/

The application should create that directory if it does not already exist.
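The layout and directory-creation step can be sketched as follows. This is a minimal Python illustration rather than app code: the helper names are hypothetical, and treating parakeet-coreml-manifest.json as required (with manifest.json optional) is an assumption based on the artifact list in this card.

```python
from pathlib import Path

# Artifact names come from this model card; the required/optional split
# is an assumption for this sketch.
REQUIRED_ARTIFACTS = (
    "encoder.mlpackage",
    "predictor.mlpackage",
    "joint.mlpackage",
    "parakeet-coreml-manifest.json",
)
OPTIONAL_ARTIFACTS = ("manifest.json",)


def ensure_model_dir(base: Path, name: str = "ParakeetCoreML") -> Path:
    """Create base/name if it does not exist yet and return its path."""
    model_dir = base / name
    model_dir.mkdir(parents=True, exist_ok=True)
    return model_dir


def missing_artifacts(model_dir: Path) -> list:
    """Return the required artifact names that are absent from model_dir."""
    return [n for n in REQUIRED_ARTIFACTS if not (model_dir / n).exists()]
```

A host could call `ensure_model_dir` with its Application Support path and refuse to load the model while `missing_artifacts` returns a non-empty list. In a Swift app the same check would typically go through FileManager.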

Acknowledgements

This package builds on the original NVIDIA Parakeet TDT release and the NeMo ecosystem. Please cite and attribute the upstream model and associated research when appropriate.
