# MiniMax-M2.5-REAP-139B-A10B-MXFP4

When you want MoE-specific compression without sending quality into the abyss.
## Built from

- Base: MiniMaxAI/MiniMax-M2.5
- REAP source: tomngdev/MiniMax-M2.5-REAP-139B-A10B-GGUF (BF16 split)
- Quantized locally with llama.cpp as MXFP4_MOE
## Quant

| Quant | Size (GiB) | Notes |
|---|---|---|
| MXFP4_MOE | 70.91 | MoE-oriented quantization, with many non-expert tensors preserved at higher precision |
## Tensor Mix

This quant uses a mixed layout (as expected for MXFP4 MoE), including `mxfp4`, `q8_0`, and `f32` tensors.
## Usage

Point llama.cpp at the first shard; it resolves the remaining shards automatically:

```
llama-cli -m MiniMax-M2.5-REAP-MXFP4_MOE-00001-of-00007.gguf -ngl 0 -c 8192
```
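Each shard in a split like this is itself a valid GGUF file with its own header. As a quick sanity check before loading, the fixed-size GGUF header (magic bytes, u32 version, u64 tensor count, u64 metadata KV count, assuming the GGUF v3 layout) can be parsed with a few lines of stdlib Python. This is a minimal sketch; the byte blob below is synthetic for demonstration, not taken from this model:

```python
import struct

def read_gguf_header(data: bytes):
    """Parse the fixed GGUF header: 4-byte magic, u32 version,
    u64 tensor count, u64 metadata key/value count (little-endian)."""
    magic, version, n_tensors, n_kv = struct.unpack_from("<4sIQQ", data, 0)
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    return version, n_tensors, n_kv

# Synthetic 24-byte header for demonstration (hypothetical counts)
blob = struct.pack("<4sIQQ", b"GGUF", 3, 1131, 42)
print(read_gguf_header(blob))  # (3, 1131, 42)
```

Running this against the first few bytes of each shard is a cheap way to confirm a download completed cleanly before handing the files to llama.cpp.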
## Why a Separate Repo
MXFP4 has a different target audience than standard Q-series packs, so it gets its own clean home and simpler download surface.
## Credits

- MiniMaxAI for MiniMax-M2.5
- tomngdev for the BF16 REAP GGUF release
- BennyDaBall for this quant
## Disclaimer
You are responsible for your own use, outputs, and compliance with applicable laws and platform policies.