Gemma-4-21B-A4B-it-REAP (MLX bfloat16)

This repository contains the bfloat16 (unquantized) conversion of 0xSero/gemma-4-21b-a4b-it-REAP to the MLX format, for use on Apple Silicon (M1/M2/M3/M4).

Model Highlights

  • Architecture: Gemma 4 multimodal Mixture-of-Experts; the "A4B" suffix denotes roughly 4B parameters active per token out of 21B total.
  • Optimization: REAP (Router-weighted Expert Activation Pruning), a one-shot expert-pruning method that shrinks the MoE while preserving quality.
  • Precision: bfloat16 (no quantization; weights are identical to the original checkpoint).
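
For sizing: 21B parameters at bfloat16 (2 bytes per parameter) comes to roughly 42 GB of weights, so comfortable local inference with this unquantized build assumes a high-memory Apple Silicon machine (64 GB of unified memory or more); smaller machines should prefer a quantized variant.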

Technical Details

The conversion was performed locally with the mlx-vlm library, so the MLX weights match the original checkpoint bit for bit; a representative command is sketched below the list.

  • Hardware: Mac Mini (M-Series) with 32GB RAM.
  • Software: mlx-vlm (latest version).
  • Format: Native MLX safetensors.
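
The exact conversion command isn't recorded here, but with mlx-vlm a bfloat16 conversion typically looks like the sketch below (flag names follow mlx-vlm's convert entry point; --hf-path is the upstream repository and --mlx-path the output directory, and omitting the -q flag leaves the weights unquantized; verify against your installed version):

python -m mlx_vlm.convert \
    --hf-path 0xSero/gemma-4-21b-a4b-it-REAP \
    --mlx-path gemma-4-21b-a4b-it-REAP-mlx-bfloat16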

Usage

You can run this model using the mlx-vlm library.

Installation

pip install mlx-vlm
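
Generation then follows mlx-vlm's Python API. The sketch below loads the weights straight from this repository and assumes a local image file image.jpg (a hypothetical placeholder); the generate argument order has shifted across mlx-vlm releases, so adapt it to your installed version:

from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

# Download (or reuse the cached copy of) this repository's weights.
model_path = "Z3NN001/gemma-4-21b-a4b-it-REAP-mlx-bfloat16"
model, processor = load(model_path)
config = load_config(model_path)

# One image plus a text prompt; local paths and URLs both work.
images = ["image.jpg"]
prompt = "Describe this image."

# Wrap the prompt in the model's chat template before generating.
formatted = apply_chat_template(processor, config, prompt, num_images=len(images))
output = generate(model, processor, formatted, images, verbose=False)
print(output)

mlx-vlm also ships an equivalent one-shot CLI:

python -m mlx_vlm.generate --model Z3NN001/gemma-4-21b-a4b-it-REAP-mlx-bfloat16 --max-tokens 256 --prompt "Describe this image." --image image.jpg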

Credits

  • Original Model: 0xSero/gemma-4-21b-a4b-it-REAP
  • MLX Conversion: Z3NN001

