Gemma-4-21B-A4B-it-REAP (MLX bfloat16)

This is an unquantized bfloat16 conversion of the original model 0xSero/gemma-4-21b-a4b-it-REAP to the MLX format, for use on Apple Silicon.

Key Features

  • Architecture: Gemma 4 multimodal Mixture-of-Experts (A4B: roughly 4B active parameters out of 21B total)
  • Optimization: REAP (Router-weighted Expert Activation Pruning), an expert-pruning method for Mixture-of-Experts models
  • Precision: bfloat16 (unquantized; matches the original weights)
  • Format: MLX (Compatible with mlx-vlm)

Conversion Details

  • Hardware: Mac mini (M-series) with 32GB RAM
  • Library: mlx-vlm
  • Dtype: bfloat16 (a reconstructed conversion command follows below)
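
The exact conversion command was not recorded on this card. The line below is a reconstruction based on mlx-vlm's convert utility; the --hf-path and --mlx-path flags are assumptions from recent mlx-vlm releases, so check python -m mlx_vlm.convert --help on your installed version (a dtype option may also be needed to keep bfloat16):

python -m mlx_vlm.convert --hf-path 0xSero/gemma-4-21b-a4b-it-REAP --mlx-path gemma-4-21b-a4b-it-REAP-mlx-bfloat16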

Credits

  • Original model: 0xSero/gemma-4-21b-a4b-it-REAP
  • Conversion tooling: the mlx-vlm library, built on Apple's MLX framework

Usage

To use this model with mlx-vlm, install the library and run:

pip install mlx-vlm
python -m mlx_vlm.generate --model Z3NN001/gemma-4-21b-a4b-it-REAP-mlx-bfloat16 --prompt "Describe this image" --image <path_to_image>
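
For programmatic use, the sketch below follows the usual mlx-vlm Python workflow (load, apply_chat_template, generate). Treat it as a minimal, hedged example rather than tested code from this card: signatures can shift between mlx-vlm releases, and the image path is a placeholder.

from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

model_id = "Z3NN001/gemma-4-21b-a4b-it-REAP-mlx-bfloat16"

# Load the bfloat16 MLX weights plus the matching processor/tokenizer.
model, processor = load(model_id)
config = load_config(model_id)

# Wrap the user prompt in the model's chat template, declaring one image.
images = ["path/to/image.jpg"]  # placeholder; point at a real image
prompt = apply_chat_template(processor, config, "Describe this image", num_images=len(images))

# Generate a description; verbose=True streams tokens as they decode.
output = generate(model, processor, prompt, images, max_tokens=256, verbose=True)
print(output)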