MiniMax-M2.5-REAP-139B-A10B-MXFP4

When you want MoE-specific compression without sending quality into the abyss.

Built from:

  • Base: MiniMaxAI/MiniMax-M2.5
  • REAP source: tomngdev/MiniMax-M2.5-REAP-139B-A10B-GGUF (BF16 split)
  • Quantization: MXFP4_MOE, produced locally with llama.cpp (command sketch below).
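
For reference, the quantization step looks roughly like this. This is a sketch, not the exact command used: the BF16 input filename is hypothetical, and your llama.cpp build's llama-quantize must list MXFP4_MOE among its supported types.

llama-quantize MiniMax-M2.5-REAP-139B-A10B-BF16.gguf MiniMax-M2.5-REAP-MXFP4_MOE.gguf MXFP4_MOE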

Quant

Quant      Size (GiB)  Notes
MXFP4_MOE  70.91       MoE-oriented quantization; many non-expert tensors preserved at higher precision
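
As a rough sanity check on that figure: 70.91 GiB is about 609 × 10⁹ bits spread over 139 × 10⁹ parameters, i.e. roughly 4.4 bits per weight on average. That sits slightly above MXFP4's nominal 4.25 bits per element (4-bit values plus one shared 8-bit scale per 32-element block), which is consistent with the non-expert tensors being kept at q8_0/f32.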

Tensor Mix

This quant uses a mixed layout (as expected for MXFP4 MoE), including mxfp4, q8_0, and f32 tensors.
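
To verify the mix yourself, the gguf Python package that ships with llama.cpp can enumerate tensor types. A minimal sketch, assuming a gguf version recent enough to know the MXFP4 type; note that a split model stores only a subset of tensors per shard, so repeat over all shards for the full picture:

from gguf import GGUFReader  # pip install gguf

reader = GGUFReader("MiniMax-M2.5-REAP-MXFP4_MOE-00001-of-00007.gguf")
for tensor in reader.tensors:
    # tensor_type is a GGMLQuantizationType value (MXFP4, Q8_0, F32, ...)
    print(tensor.name, tensor.tensor_type.name)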

Usage

Use the first shard; llama.cpp resolves the rest:

llama-cli -m MiniMax-M2.5-REAP-MXFP4_MOE-00001-of-00007.gguf -ngl 0 -c 8192
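
The same first-shard convention applies if you prefer the HTTP server over the CLI (standard llama.cpp flags; the port here is just an example):

llama-server -m MiniMax-M2.5-REAP-MXFP4_MOE-00001-of-00007.gguf -c 8192 --port 8080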

Why a Separate Repo

MXFP4 has a different target audience than the standard Q-series packs, so it gets its own clean home with a simpler download surface.

Credits

  • MiniMaxAI for MiniMax-M2.5
  • tomngdev for the BF16 REAP GGUF release
  • BennyDaBall for this quant

Disclaimer

You are responsible for your own use, outputs, and compliance with applicable laws and platform policies.
