MiniMax-M2.7 AWQ 4bit (W4A16)

W4A16 quantization of MiniMaxAI/MiniMax-M2.7, produced with llm-compressor.

  • Format: compressed-tensors pack-quantized, int4 weights / fp16 activations
  • Group size: 128, symmetric
  • Calibration: data-free, MSE observer
  • Kept in BF16: MoE routing gates and lm_head only; every other Linear layer is quantized, matching the ignore list from cyankiwi/MiniMax-M2.5-AWQ-4bit (see the recipe sketch below).
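
The bullets above map onto llm-compressor's data-free QuantizationModifier path. A minimal sketch of such a recipe follows; the config_groups layout, the "mse" observer key, and the gate regex in the ignore list are assumptions inferred from this card's description, not the actual build script for this checkpoint:

from transformers import AutoModelForCausalLM
from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import QuantizationModifier

MODEL_ID = "MiniMaxAI/MiniMax-M2.7"

# Load the source model in its native dtype (BF16).
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype="auto", trust_remote_code=True
)

# int4 symmetric weights, group size 128, higher-precision activations (W4A16).
# The "mse" observer and the gate regex are assumptions based on this card.
recipe = QuantizationModifier(
    config_groups={
        "group_0": {
            "targets": ["Linear"],
            "weights": {
                "num_bits": 4,
                "type": "int",
                "symmetric": True,
                "strategy": "group",
                "group_size": 128,
                "observer": "mse",
            },
        }
    },
    ignore=["lm_head", "re:.*gate$"],  # keep MoE routing gates and lm_head in BF16
)

# Data-free: no calibration dataset is passed to oneshot.
oneshot(model=model, recipe=recipe, output_dir="MiniMax-M2.7-AWQ-4bit")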

vLLM

vllm serve demon-zombie/MiniMax-M2.7-AWQ-4bit \
  --tensor-parallel-size 4 \
  --tool-call-parser minimax_m2 \
  --reasoning-parser minimax_m2 \
  --enable-auto-tool-choice
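
Once the server is up, it speaks the standard OpenAI-compatible API. A minimal client sketch, assuming vLLM's default endpoint at http://localhost:8000/v1 (any placeholder API key works):

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="demon-zombie/MiniMax-M2.7-AWQ-4bit",
    messages=[{"role": "user", "content": "Summarize W4A16 quantization in one sentence."}],
)
print(resp.choices[0].message.content)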

Model size: 229B params (safetensors; tensor types I64 · I32 · BF16)