# Gemma-4-21B-A4B-it-REAP (MLX bfloat16)
This is a high-precision bfloat16 conversion of the original model [0xSero/gemma-4-21b-a4b-it-REAP](https://huggingface.co/0xSero/gemma-4-21b-a4b-it-REAP) to the MLX format, optimized for Apple Silicon.
## Key Features
- Architecture: Gemma 4 multimodal Mixture-of-Experts (A4B: ~4B active parameters out of 21B total)
- Optimization: REAP (Router-weighted Expert Activation Pruning), a one-shot expert-pruning method for MoE models (see the sketch after this list)
- Precision: bfloat16 (matches the original weights)
- Format: MLX (compatible with mlx-vlm)
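
For intuition, REAP-style pruning scores each expert by how much it actually contributes to the router-weighted layer output, then drops the lowest-scoring experts. The snippet below is a hypothetical illustration of that saliency criterion, not code from the REAP release; all function and variable names here are our own.

```python
import numpy as np

def expert_saliency(gate_weights: np.ndarray, expert_outputs: np.ndarray) -> np.ndarray:
    """Score each expert by its average router-weighted output norm.

    gate_weights:   (num_tokens, num_experts) router probabilities (0 where unrouted).
    expert_outputs: (num_tokens, num_experts, hidden) per-expert outputs.
    """
    norms = np.linalg.norm(expert_outputs, axis=-1)  # (tokens, experts)
    weighted = gate_weights * norms                  # router-weighted contribution
    return weighted.mean(axis=0)                     # average over tokens

# Toy example: prune the k experts with the smallest saliency, keep the rest.
rng = np.random.default_rng(0)
gates = rng.random((128, 8))
outputs = rng.standard_normal((128, 8, 16))
scores = expert_saliency(gates, outputs)
pruned = np.argsort(scores)[:2]  # indices of the 2 least salient experts
print("experts to prune:", pruned)
```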
## Conversion Details
- Hardware: Mac mini (M-series) with 32 GB RAM
- Library: mlx-vlm
- Dtype: bfloat16
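
For reference, a conversion along these lines can be reproduced with mlx-vlm's conversion script. The exact flags can differ between mlx-vlm releases, so treat this as a sketch of the process rather than the exact command used; check `python -m mlx_vlm.convert --help` for your installed version.

```bash
# Convert the original Hugging Face checkpoint to MLX, keeping bfloat16 weights.
python -m mlx_vlm.convert \
  --hf-path 0xSero/gemma-4-21b-a4b-it-REAP \
  --mlx-path gemma-4-21b-a4b-it-REAP-mlx-bfloat16 \
  --dtype bfloat16
```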
## Credits

- Original model: [0xSero/gemma-4-21b-a4b-it-REAP](https://huggingface.co/0xSero/gemma-4-21b-a4b-it-REAP)
- Conversion performed with [mlx-vlm](https://github.com/Blaizzy/mlx-vlm)
## Usage
To use this model with mlx-vlm, install the library and run:
```bash
pip install mlx-vlm

python -m mlx_vlm.generate \
  --model Z3NN001/gemma-4-21b-a4b-it-REAP-mlx-bfloat16 \
  --prompt "Describe this image" \
  --image <path_to_image>
```
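
The model can also be called from Python. The sketch below follows the mlx-vlm Python API (`load`, `generate`, `apply_chat_template`); argument names and order have shifted between mlx-vlm releases, so verify against the version you have installed. The image path is a placeholder.

```python
from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

model_path = "Z3NN001/gemma-4-21b-a4b-it-REAP-mlx-bfloat16"

# Load the converted MLX weights and the matching processor.
model, processor = load(model_path)
config = load_config(model_path)

# One image and a simple prompt; replace with your own file path.
images = ["path/to/image.jpg"]
prompt = apply_chat_template(processor, config, "Describe this image", num_images=len(images))

# Run generation and print the decoded output.
output = generate(model, processor, prompt, images, verbose=False)
print(output)
```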