Breeze-ASR-25 CoreML INT8

INT8 weight-quantized CoreML version of MediaTek-Research/Breeze-ASR-25 for on-device speech recognition on Apple Silicon.

Model Files

| Component             | Size    | Description              |
|-----------------------|---------|--------------------------|
| AudioEncoder.mlmodelc | 609 MB  | Encoder (INT8 quantized) |
| TextDecoder.mlmodelc  | 867 MB  | Decoder (INT8 quantized) |
| **Total**             | 1.48 GB |                          |

Compression Details

  • Method: Linear symmetric INT8 weight quantization via coremltools.optimize.coreml.linear_quantize_weights
  • Storage precision: Mixed (Float16, Int8)
  • Original model size: ~2.9 GB (Float16)
  • Compression ratio: ~2x
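Linear symmetric quantization maps each weight tensor to int8 with a single scale and no zero point, which is why FP16 weights (2 bytes each) compress to roughly half their size. A minimal NumPy sketch of the idea (illustrative only; the actual coremltools pass quantizes per-channel and handles packing internally):

```python
import numpy as np

def quantize_symmetric_int8(w: np.ndarray):
    """Linear symmetric INT8: the largest |weight| maps to 127, zero point is 0."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 8)).astype(np.float32)
q, scale = quantize_symmetric_int8(w)
w_hat = dequantize(q, scale)

# Round-trip error is bounded by half a quantization step.
assert np.abs(w - w_hat).max() <= scale / 2 + 1e-6
```

Since int8 storage uses 1 byte per weight versus 2 for Float16, the ideal ratio is 2x; the observed ~2x (2.9 GB -> 1.48 GB) matches, with the small gap coming from activations, scales, and layers kept in Float16.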

Model Architecture

  • Base: Whisper-large-v2 fine-tuned for Taiwanese Mandarin + English code-switching
  • Encoder input: logmel_data (1 x 80 x 3000)
  • Encoder output: output (1 x 1500 x 1280)
  • Decoder input: token_data (1 x 1) + audio_data (1 x 1500 x 1280)
  • Decoder output: logits (1 x 1 x 51865)
  • Spec version: 7 (iOS 16+ / macOS 13+)
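The encoder shapes follow from Whisper's fixed 30-second audio window: 16 kHz audio with a 160-sample hop yields 3000 log-mel frames, and the encoder's stride-2 convolution halves that to 1500 positions. A quick sanity check of those numbers:

```python
SAMPLE_RATE = 16_000   # Hz, Whisper's expected input rate
WINDOW_SECONDS = 30    # fixed audio context length
HOP_LENGTH = 160       # samples per mel frame
N_MELS = 80
CONV_STRIDE = 2        # encoder's second conv layer downsamples by 2
D_MODEL = 1280         # large-v2 hidden size

mel_frames = SAMPLE_RATE * WINDOW_SECONDS // HOP_LENGTH
encoder_positions = mel_frames // CONV_STRIDE

print((1, N_MELS, mel_frames))         # logmel_data shape: (1, 80, 3000)
print((1, encoder_positions, D_MODEL)) # encoder output shape: (1, 1500, 1280)
```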

I/O Naming Convention

Both models use the whisper.cpp I/O naming convention:

  • Encoder: logmel_data -> output
  • Decoder: token_data + audio_data -> logits
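Given that I/O contract, transcription is the standard Whisper loop: run the encoder once, then feed the decoder one token at a time until end-of-text. A hedged Python sketch with stubbed models (token IDs 50258/50257 are Whisper's multilingual `<|startoftranscript|>` and `<|endoftext|>`; `encoder`/`decoder` stand in for the real CoreML predictions, and the decoder's internal KV-cache state handling is glossed over):

```python
import numpy as np

SOT, EOT = 50258, 50257  # Whisper multilingual start/end-of-text token IDs
VOCAB = 51865

def greedy_decode(encoder, decoder, logmel, max_tokens=224):
    """Greedy loop over the decoder's (1, 1) token_data input."""
    audio = encoder(logmel)                    # audio_data: (1, 1500, 1280)
    tokens = [SOT]
    for _ in range(max_tokens):
        logits = decoder(np.array([[tokens[-1]]]), audio)  # (1, 1, 51865)
        next_id = int(logits[0, -1].argmax())
        if next_id == EOT:
            break
        tokens.append(next_id)
    return tokens[1:]  # drop the start-of-transcript token

# Stub models so the sketch runs without the real .mlmodelc files.
def fake_encoder(logmel):
    return np.zeros((1, 1500, 1280), dtype=np.float32)

def fake_decoder(token_data, audio_data):
    # Always "predicts" EOT, so decoding stops immediately.
    logits = np.zeros((1, 1, VOCAB), dtype=np.float32)
    logits[0, 0, EOT] = 1.0
    return logits

assert greedy_decode(fake_encoder, fake_decoder, None) == []
```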

System Requirements

  • macOS 13+ / iOS 16+
  • Apple Silicon (M1/M2/M3/M4 or A-series)

Conversion

Converted using the sheep52031/breeze-asr-25-coreml-ane conversion toolchain with a custom HuggingFace decoder wrapper.

```shell
python convert-whisper-to-coreml_int8_support.py \
  --model MediaTek-Research/Breeze-ASR-25 \
  --hf-model \
  --quantize-int8
```

License

Apache 2.0 (same as the original Breeze-ASR-25 model).
