Gemma-4-21B-A4B-it-REAP (MLX bfloat16)

This is an unquantized bfloat16 conversion of the original model 0xSero/gemma-4-21b-a4b-it-REAP to the MLX format, for use on Apple Silicon.

Key Features

  • Architecture: Gemma 4 multimodal Mixture-of-Experts (A4B: roughly 4B active parameters out of 21B total)
  • Optimization: REAP (Router-weighted Expert Activation Pruning), an expert-pruning method for Mixture-of-Experts models
  • Precision: bfloat16 (unquantized; matches the original weights)
  • Format: MLX (Compatible with mlx-vlm)

Conversion Details

  • Hardware: Mac mini (M-series) with 32GB RAM
  • Library: mlx-vlm
  • Dtype: bfloat16 (a reconstructed conversion command follows below)
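
The exact conversion command was not recorded on this card. The line below is a reconstruction based on mlx-vlm's convert utility; the --hf-path and --mlx-path flags are assumptions from recent mlx-vlm releases, so check python -m mlx_vlm.convert --help on your installed version (a dtype option may also be needed to keep bfloat16):

python -m mlx_vlm.convert --hf-path 0xSero/gemma-4-21b-a4b-it-REAP --mlx-path gemma-4-21b-a4b-it-REAP-mlx-bfloat16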

Credits

  • Original model: 0xSero/gemma-4-21b-a4b-it-REAP
  • Conversion tooling: the mlx-vlm library, built on Apple's MLX framework

Usage

To use this model with mlx-vlm, install the library and run:

pip install mlx-vlm
python -m mlx_vlm.generate --model Z3NN001/gemma-4-21b-a4b-it-REAP-mlx-bfloat16 --prompt "Describe this image" --image <path_to_image>
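
For programmatic use, the sketch below follows the usual mlx-vlm Python workflow (load, apply_chat_template, generate). Treat it as a minimal, hedged example rather than tested code from this card: signatures can shift between mlx-vlm releases, and the image path is a placeholder.

from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

model_id = "Z3NN001/gemma-4-21b-a4b-it-REAP-mlx-bfloat16"

# Load the bfloat16 MLX weights plus the matching processor/tokenizer.
model, processor = load(model_id)
config = load_config(model_id)

# Wrap the user prompt in the model's chat template, declaring one image.
images = ["path/to/image.jpg"]  # placeholder; point at a real image
prompt = apply_chat_template(processor, config, "Describe this image", num_images=len(images))

# Generate a description; verbose=True streams tokens as they decode.
output = generate(model, processor, prompt, images, max_tokens=256, verbose=True)
print(output)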