MERaLiON-2-3B-MLX-4bit

4-bit quantized MLX version of MERaLiON-2-3B for Apple Silicon.

Quantization Details

  • Method: MLX affine quantization
  • Bits: 4
  • Group size: 64
  • Components quantized: Decoder (Gemma2-2B) only
  • Components kept in full precision: Whisper-Large-V3 encoder, multi-modal adaptor
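As an illustration of what affine quantization at these settings means (a plain-Python sketch, not MLX's actual packed-kernel implementation), each group of 64 weights is mapped to 4-bit integers in [0, 15] via a per-group scale and offset:

```python
import random

def quantize_group(ws, bits=4, group_size=64):
    """Affine-quantize one group of weights to `bits`-bit integers.

    Illustrative only: MLX packs the integers and stores the per-group
    scale/bias in half precision.
    """
    lo, hi = min(ws), max(ws)
    levels = (1 << bits) - 1              # 15 for 4-bit
    scale = (hi - lo) / levels or 1.0     # avoid division by zero
    q = [round((w - lo) / scale) for w in ws]  # integers in [0, 15]
    return q, scale, lo

def dequantize_group(q, scale, bias):
    return [x * scale + bias for x in q]

# Quantize a toy 64-element group and measure the round-trip error.
random.seed(0)
weights = [random.uniform(-1, 1) for _ in range(64)]
q, scale, bias = quantize_group(weights)
recovered = dequantize_group(q, scale, bias)
max_err = max(abs(a - b) for a, b in zip(weights, recovered))
assert max_err <= scale / 2  # rounding error is at most half a step
```

The group size trades accuracy against overhead: smaller groups track the local weight range more closely but store more scale/bias metadata.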

Size Comparison

  Component                     Original   Quantized
  ----------------------------  ---------  ---------
  Decoder (Gemma2-2B)           4.9 GB     1.4 GB
  Encoder (Whisper-Large-V3)    1.2 GB     1.2 GB
  Adaptor                       419 MB     419 MB
  Total                         6.5 GB     3.0 GB
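The decoder figure is consistent with 4-bit storage plus per-group metadata: each 64-weight group carries a half-precision scale and bias (4 bytes) on top of 0.5 bytes per packed weight, versus 2 bytes per weight in the original. A back-of-the-envelope check (assuming the original decoder is fp16):

```python
fp16_bytes_per_weight = 2.0
# 4-bit packed weights: 0.5 bytes each, plus an fp16 scale and an fp16
# bias (2 + 2 bytes) shared by every group of 64 weights.
q4_bytes_per_weight = 4 / 8 + (2 + 2) / 64   # 0.5625

decoder_fp16_gb = 4.9
decoder_q4_gb = decoder_fp16_gb * q4_bytes_per_weight / fp16_bytes_per_weight
print(round(decoder_q4_gb, 1))  # → 1.4
```

The encoder and adaptor totals are unchanged because they are kept in full precision.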

Usage

Structure

  • Whisper-Large-V3 encoder (full precision)
  • Multi-modal adaptor (full precision)
  • Gemma2-2B decoder (4-bit quantized)
  • Decoder directory with config, tokenizer, and symlinks to the decoder shards