about timestamps
Hi, I tested the model and it runs really fast compared with Whisper, on CPU only! Thank you for your contribution!
CoreML gave me an error, but even on CPU it is already faster than Whisper Metal on my Mac:
Error: Ort(Error { code: GenericFailure, msg: "Non-zero status code returned while running
12615810092392341640_CoreML_12615810092392341640_3 node.
Name:'CoreMLExecutionProvider_12615810092392341640_CoreML_12615810092392341640_3_3' Status Message: Error executing model:
Unable to compute the prediction using a neural network model. It can be an invalid input data or broken/unsupported model
(error code: -1)." })
My question is: can we get word/segment timestamps as in the original model? Or does this ONNX export not currently support that feature?
Oops, sorry, I thought you had converted this model:
https://huggingface.co/nvidia/parakeet-tdt-0.6b-v3
But it seems you converted that one instead, which I believe does not support timestamps.
You should actually be able to get pretty good token-level timestamps with CTC models. The model outputs token probabilities for each frame in the sequence (of shape [batch_size, sequence_length, vocab_size]), so based on the position of each token, you can compute its relative timestamp from the number of frames and the length of the audio.
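A minimal sketch of that idea, assuming hypothetical shapes and a greedy argmax decode (the function name, blank id, and toy input are illustrative, not the parakeet-rs API):

```python
import numpy as np

def ctc_timestamps(probs, audio_duration_s, blank_id=0):
    """probs: [sequence_length, vocab_size] frame-level token probabilities.
    Returns (token_id, start_time_s) pairs, collapsing blanks and repeats
    as in standard greedy CTC decoding."""
    seq_len = probs.shape[0]
    frame_dur = audio_duration_s / seq_len  # seconds covered by one output frame
    ids = probs.argmax(axis=-1)             # best token per frame
    out, prev = [], blank_id
    for t, tok in enumerate(ids):
        # emit a token only when it first appears (not blank, not a repeat)
        if tok != blank_id and tok != prev:
            out.append((int(tok), t * frame_dur))
        prev = tok
    return out

# toy example: 6 frames, vocab of 3 where id 0 is the CTC blank
probs = np.eye(3)[[0, 1, 1, 0, 2, 0]]
print(ctc_timestamps(probs, audio_duration_s=3.0))  # [(1, 0.5), (2, 2.0)]
```

Each frame spans `audio_duration / sequence_length` seconds, so the frame index of a token's first occurrence directly gives its start time; word-level times can then be built by grouping subword tokens.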
Thank you for the guidance! It worked!
https://github.com/altunenes/parakeet-rs