# LiteWhisper Large V3 Turbo ONNX

Word-level timestamps via cross-attention.
## Models
| Encoder | Decoder | Size | Notes |
|---|---|---|---|
| FP32 | FP32 | 2.1 GB | |
| FP16 | quantized | 1.2 GB | recommended |
## Usage
```js
import { pipeline } from "@huggingface/transformers";

// Load the recommended variant: fp16 encoder + quantized decoder
const transcriber = await pipeline(
  "automatic-speech-recognition",
  "ipsilondev/lite-whisper-large-v3-turbo-onnx",
  { dtype: { encoder_model: "fp16", decoder_model_merged: "quantized" } }
);

// Request word-level timestamps
const result = await transcriber(audio, { return_timestamps: "word" });
```
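With `return_timestamps: "word"`, the pipeline returns the full transcript in `text` plus a `chunks` array of `{ text, timestamp: [start, end] }` entries, one per word. A minimal sketch of formatting those chunks into a timed word list (the sample `result` object below is illustrative, not real model output):

```javascript
// Illustrative shape of a word-level result (sample data, not real model output)
const result = {
  text: " Hello world",
  chunks: [
    { text: " Hello", timestamp: [0.0, 0.4] },
    { text: " world", timestamp: [0.4, 0.8] },
  ],
};

// Format each word with its start/end time in seconds
const lines = result.chunks.map(
  ({ text, timestamp: [start, end] }) =>
    `${start.toFixed(2)}-${end.toFixed(2)}s ${text.trim()}`
);

console.log(lines.join("\n"));
```

Timestamps are in seconds relative to the start of the audio, so the same per-word spans can drive subtitle generation or karaoke-style highlighting.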