Breeze-ASR-25 CoreML (4-bit Palette Quantized)

This repository contains the Apple CoreML version of MediaTek-Research/Breeze-ASR-25, quantized to 4-bit using palette (lookup table) quantization for efficient on-device inference.

Model Details

| Property | Value |
| --- | --- |
| Base Model | MediaTek-Research/Breeze-ASR-25 |
| Architecture | Whisper (large-v2 based) |
| Format | Apple CoreML (.mlmodelc) |
| Quantization | 4-bit palette |
| Languages | Chinese (zh), English (en) |

Files

├── AudioEncoder.mlmodelc/    # Mel spectrogram → encoder hidden states
├── MelSpectrogram.mlmodelc/  # Audio waveform → Mel spectrogram
├── TextDecoder.mlmodelc/     # Encoder states → token predictions
├── config.json               # Model configuration
└── generation_config.json    # Generation/decoding parameters
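
As a quick sanity check after downloading, the listing above can be turned into a small verification step. The following is a minimal sketch using only Foundation; `missingModelFiles` is a hypothetical helper (not part of WhisperKit or Core ML), and the directory path is a placeholder you would replace with your own download location.

```swift
import Foundation

// File and folder names taken from the listing above.
let expectedModelFiles = [
    "AudioEncoder.mlmodelc",
    "MelSpectrogram.mlmodelc",
    "TextDecoder.mlmodelc",
    "config.json",
    "generation_config.json",
]

/// Returns the expected entries that are absent from `directory`.
/// Hypothetical helper for illustration; not a WhisperKit API.
func missingModelFiles(in directory: URL) -> [String] {
    expectedModelFiles.filter { name in
        let path = directory.appendingPathComponent(name).path
        return !FileManager.default.fileExists(atPath: path)
    }
}

// Placeholder path: point this at the folder holding the downloaded files.
let modelDirectory = URL(fileURLWithPath: "path/to/Breeze-ASR-25-coreml-4bit-palette")
let missing = missingModelFiles(in: modelDirectory)
print(missing.isEmpty ? "All model files present" : "Missing: \(missing)")
```

The check only confirms that each entry exists; it does not validate the contents of the compiled models.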

Usage with WhisperKit

This model is designed to run with WhisperKit on Apple devices (iPhone, iPad, Mac).

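import WhisperKit

// Initialize the pipeline with this repository's model.
let pipe = try await WhisperKit(
    model: "weiren119/Breeze-ASR-25-coreml-4bit-palette"
)
// Transcribe a local audio file and print the recognized text.
let result = try await pipe.transcribe(audioPath: "audio.wav")
print(result.text)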

Quantization Details

4-bit palette quantization compresses model weights by replacing each weight with a 4-bit index into a lookup table of 16 representative values (2⁴ = 16). Relative to 16-bit floating-point weights, this cuts weight storage to roughly a quarter, plus a small overhead for the lookup tables, while keeping accuracy reasonable. That makes the model suitable for on-device deployment where memory is constrained.
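
The lookup-table idea described above can be illustrated with a small, self-contained sketch. It builds a 16-entry palette for a weight vector using 1-D k-means, which is one common way to choose the representative values; the model card does not state which method was used for this model, so treat the clustering step as an assumption. `PalettizedWeights`, `nearestIndex`, and `palettize` are hypothetical names introduced here, and indices are stored one per byte for readability, whereas real 4-bit storage would pack two indices per byte.

```swift
import Foundation

/// A palettized weight vector: a small lookup table plus one index per weight.
struct PalettizedWeights {
    let palette: [Float]   // 2^bits representative values
    let indices: [UInt8]   // each entry selects a palette slot

    /// Reconstructs approximate weights by table lookup.
    func dequantized() -> [Float] {
        indices.map { palette[Int($0)] }
    }
}

/// Index of the palette entry closest to `value`.
func nearestIndex(of value: Float, in palette: [Float]) -> Int {
    var best = 0
    var bestDistance = Float.infinity
    for (i, centre) in palette.enumerated() {
        let distance = abs(value - centre)
        if distance < bestDistance {
            best = i
            bestDistance = distance
        }
    }
    return best
}

/// Hypothetical helper: 1-D k-means (Lloyd's algorithm) with evenly spaced
/// initial centroids between the minimum and maximum weight.
func palettize(_ weights: [Float], bits: Int = 4, iterations: Int = 20) -> PalettizedWeights {
    precondition(!weights.isEmpty && (1...8).contains(bits))
    let k = 1 << bits
    let lo = weights.min()!, hi = weights.max()!
    var palette = (0..<k).map { lo + (hi - lo) * Float($0) / Float(k - 1) }

    for _ in 0..<iterations {
        var sums = [Float](repeating: 0, count: k)
        var counts = [Int](repeating: 0, count: k)
        // Assignment step: attach every weight to its nearest centroid.
        for w in weights {
            let i = nearestIndex(of: w, in: palette)
            sums[i] += w
            counts[i] += 1
        }
        // Update step: move each non-empty centroid to the mean of its members.
        for i in 0..<k where counts[i] > 0 {
            palette[i] = sums[i] / Float(counts[i])
        }
    }

    let indices = weights.map { UInt8(nearestIndex(of: $0, in: palette)) }
    return PalettizedWeights(palette: palette, indices: indices)
}

// Usage: quantize synthetic weights and measure the worst-case reconstruction error.
let weights: [Float] = (0..<1000).map { Float(sin(Double($0))) * 0.1 }
let packed = palettize(weights)
let maxError = zip(weights, packed.dequantized()).map { abs($0 - $1) }.max()!
print("palette size: \(packed.palette.count), max abs error: \(maxError)")
```

Each weight now costs 4 bits of index plus a share of the 16-entry table, which is where the size reduction comes from; the reconstruction error printed at the end is the accuracy cost of that trade.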
