sherpa-onnx-whisper-distil-large-v3-it

Italian-distilled Whisper large-v3 model exported to ONNX format for sherpa-onnx.

Description

This is an int8-quantized ONNX export of bofenghuang/whisper-large-v3-distil-it-v0.2, an Italian-specific distillation of OpenAI's Whisper large-v3 model.

The distillation reduces the decoder from 32 layers to 2 layers, making it significantly smaller and faster while maintaining strong Italian transcription quality.

Model details

| Property | Value |
|---|---|
| Base model | bofenghuang/whisper-large-v3-distil-it-v0.2 |
| Language | Italian |
| Decoder layers | 2 (distilled from 32) |
| n_mels | 128 |
| n_audio_ctx | 1500 |
| n_text_ctx | 448 |
| n_text_state | 1280 |
| Vocab size | 51866 |
| Quantization | int8 (QInt8 on MatMul) |
| License | Apache 2.0 |

Files

| File | Size |
|---|---|
| distil-large-v3-it-encoder.int8.onnx | ~637 MB |
| distil-large-v3-it-decoder.int8.onnx | ~300 MB |
| distil-large-v3-it-tokens.txt | ~798 KB |

Usage with sherpa-onnx

This model is designed for use with the sherpa-onnx runtime. Download all three files to the same directory and configure your sherpa-onnx pipeline to use:

- Encoder: distil-large-v3-it-encoder.int8.onnx
- Decoder: distil-large-v3-it-decoder.int8.onnx
- Tokens: distil-large-v3-it-tokens.txt
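For example, with the sherpa-onnx Python bindings, `OfflineRecognizer.from_whisper` builds a non-streaming recognizer from the three files above. This is a minimal sketch: the file `audio.wav` (16 kHz, mono, 16-bit PCM) and the thread count are assumptions, not part of this model card.

```python
import wave

import numpy as np
import sherpa_onnx

# Build an offline (non-streaming) recognizer from the three model files.
recognizer = sherpa_onnx.OfflineRecognizer.from_whisper(
    encoder="distil-large-v3-it-encoder.int8.onnx",
    decoder="distil-large-v3-it-decoder.int8.onnx",
    tokens="distil-large-v3-it-tokens.txt",
    language="it",      # force Italian decoding
    task="transcribe",
    num_threads=4,
)

# Read a 16 kHz mono 16-bit WAV file and convert it to float32 in [-1, 1].
with wave.open("audio.wav") as f:
    sample_rate = f.getframerate()
    samples = np.frombuffer(f.readframes(f.getnframes()), dtype=np.int16)
    samples = samples.astype(np.float32) / 32768.0

# Feed the waveform to a stream and decode it.
stream = recognizer.create_stream()
stream.accept_waveform(sample_rate, samples)
recognizer.decode_stream(stream)
print(stream.result.text)
```

Whisper models in sherpa-onnx process audio as whole utterances (offline), so the full waveform is passed to a single stream rather than chunk by chunk.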

Export

Exported from the original model using a script based on k2-fsa/sherpa-onnx/scripts/whisper/export-onnx.py.
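The "QInt8 on MatMul" scheme listed in the model details table corresponds to onnxruntime's dynamic quantization restricted to MatMul nodes. A minimal sketch of that step, assuming float32 ONNX files with the names below already exist (the file names here are illustrative, not the export script's actual arguments):

```python
from onnxruntime.quantization import QuantType, quantize_dynamic

# Dynamically quantize only MatMul weights to QInt8, matching the
# "int8 (QInt8 on MatMul)" entry in the model details table.
for part in ("encoder", "decoder"):
    quantize_dynamic(
        model_input=f"distil-large-v3-it-{part}.onnx",
        model_output=f"distil-large-v3-it-{part}.int8.onnx",
        op_types_to_quantize=["MatMul"],
        weight_type=QuantType.QInt8,
    )
```

Dynamic quantization stores weights as int8 and computes activation scales at runtime, so no calibration dataset is needed.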

Acknowledgments

Thanks to bofenghuang for the original whisper-large-v3-distil-it-v0.2 model, to OpenAI for Whisper large-v3, and to the k2-fsa/sherpa-onnx project for the runtime and export scripts.
