# sherpa-onnx-whisper-distil-large-v3-it
Italian-distilled Whisper large-v3 model exported to ONNX format for sherpa-onnx.
## Description
This is an int8-quantized ONNX export of bofenghuang/whisper-large-v3-distil-it-v0.2, an Italian-specific distillation of OpenAI's Whisper large-v3 model.
The distillation reduces the decoder from 32 layers to 2 layers, making it significantly smaller and faster while maintaining strong Italian transcription quality.
## Model details
| Property | Value |
|---|---|
| Base model | bofenghuang/whisper-large-v3-distil-it-v0.2 |
| Language | Italian |
| Decoder layers | 2 (distilled from 32) |
| n_mels | 128 |
| n_audio_ctx | 1500 |
| n_text_ctx | 448 |
| n_text_state | 1280 |
| Vocab size | 51866 |
| Quantization | int8 (QInt8 on MatMul) |
| License | Apache 2.0 |
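The int8 quantization noted in the table (QInt8 applied to MatMul ops) is the kind of dynamic weight-only quantization that onnxruntime provides. As a sketch of how such a model is typically produced, not necessarily the exact command used for this export, the paths and function below are illustrative:

```python
# Sketch of int8 dynamic quantization with onnxruntime, matching the
# "int8 (QInt8 on MatMul)" row above. Paths are illustrative; the actual
# export script may configure quantization differently.
def quantize_to_int8(fp32_path: str, int8_path: str) -> None:
    from onnxruntime.quantization import QuantType, quantize_dynamic

    # Quantize only MatMul weights to signed 8-bit integers; activations
    # stay in float, so no calibration data is needed.
    quantize_dynamic(
        model_input=fp32_path,
        model_output=int8_path,
        op_types_to_quantize=["MatMul"],
        weight_type=QuantType.QInt8,
    )
```

Weight-only dynamic quantization roughly halves (or better) the on-disk size relative to fp16/fp32 while keeping the pipeline calibration-free, which is why it is a common choice for speech models of this size.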
## Files
| File | Size |
|---|---|
| distil-large-v3-it-encoder.int8.onnx | ~637 MB |
| distil-large-v3-it-decoder.int8.onnx | ~300 MB |
| distil-large-v3-it-tokens.txt | ~798 KB |
## Usage with sherpa-onnx
This model is designed for use with the sherpa-onnx runtime. Download all three files to the same directory and configure your sherpa-onnx pipeline to use:
- Encoder: distil-large-v3-it-encoder.int8.onnx
- Decoder: distil-large-v3-it-decoder.int8.onnx
- Tokens: distil-large-v3-it-tokens.txt
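A minimal sketch of wiring these three files together with the sherpa-onnx Python API (pip install sherpa-onnx). The `OfflineRecognizer.from_whisper` constructor and its keyword arguments reflect the upstream API as I understand it; verify against your installed version:

```python
# Sketch: load this model with the sherpa-onnx Python API.
# File names match this repo; prepend a directory path as needed.
import os

ENCODER = "distil-large-v3-it-encoder.int8.onnx"
DECODER = "distil-large-v3-it-decoder.int8.onnx"
TOKENS = "distil-large-v3-it-tokens.txt"


def load_recognizer():
    import sherpa_onnx

    # from_whisper wires the Whisper encoder/decoder pair into an
    # offline (non-streaming) recognizer; language/task mirror
    # Whisper's own decoding options.
    return sherpa_onnx.OfflineRecognizer.from_whisper(
        encoder=ENCODER,
        decoder=DECODER,
        tokens=TOKENS,
        language="it",
        task="transcribe",
    )


if all(os.path.exists(f) for f in (ENCODER, DECODER, TOKENS)):
    recognizer = load_recognizer()
else:
    print("model files not found; download all three files first")
```

Note that Whisper models are non-streaming: sherpa-onnx's offline recognizer processes complete utterances rather than a live audio stream.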
## Export
Exported from the original model using a script based on k2-fsa/sherpa-onnx/scripts/whisper/export-onnx.py.
## Acknowledgments
- Original Whisper model by OpenAI
- Italian distillation by bofenghuang
- ONNX inference runtime: sherpa-onnx, by the k2-fsa project