MiniMax-M2.7 AWQ 4bit (W4A16)
W4A16 quantization of MiniMaxAI/MiniMax-M2.7, produced with llm-compressor.
- Format: compressed-tensors `pack-quantized`, int4 weights / fp16 activations
- Group size: 128, symmetric
- Calibration: data-free, MSE observer
- Kept in BF16: MoE routing gates and `lm_head` only; every other `Linear` is quantized, matching the ignore list from cyankiwi/MiniMax-M2.5-AWQ-4bit.
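The group-wise symmetric scheme above can be sketched in plain Python. This is an illustrative toy, not the llm-compressor implementation: each contiguous group of 128 weights shares one scale derived from the group's absolute maximum, and weights are rounded to signed 4-bit integers.

```python
def quantize_group(weights, group_size=128, bits=4):
    """Toy group-wise symmetric quantization: one scale per group of weights."""
    qmax = 2 ** (bits - 1) - 1          # 7 for int4
    groups, scales = [], []
    for i in range(0, len(weights), group_size):
        group = weights[i:i + group_size]
        # Symmetric: the scale maps the group's absolute max onto the int4 range.
        scale = max(abs(w) for w in group) / qmax or 1.0
        q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in group]
        groups.append(q)
        scales.append(scale)
    return groups, scales


def dequantize(groups, scales):
    """Recover approximate fp weights: each int value times its group's scale."""
    return [q * s for qs, s in zip(groups, scales) for q in qs]
```

Since the scale is shared per group, the worst-case round-trip error per weight is half a quantization step (scale / 2), which is why outlier-heavy rows benefit from AWQ-style scaling before rounding.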
vLLM
```shell
vllm serve demon-zombie/MiniMax-M2.7-AWQ-4bit \
  --tensor-parallel-size 4 \
  --tool-call-parser minimax_m2 \
  --reasoning-parser minimax_m2 \
  --enable-auto-tool-choice
```
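Once the server is up, vLLM exposes an OpenAI-compatible API (on port 8000 by default). A minimal stdlib-only client sketch follows; the endpoint URL and sampling parameters are assumptions, not part of this model card:

```python
import json
from urllib import request

# Assumed local endpoint; vLLM serves an OpenAI-compatible API on port 8000 by default.
API_URL = "http://localhost:8000/v1/chat/completions"
MODEL = "demon-zombie/MiniMax-M2.7-AWQ-4bit"


def build_payload(prompt: str) -> dict:
    """Assemble a chat-completions request body for the served model."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,  # illustrative sampling parameter
    }


def chat(prompt: str) -> str:
    """POST the request and return the assistant message text."""
    body = json.dumps(build_payload(prompt)).encode("utf-8")
    req = request.Request(
        API_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]
```

Any OpenAI-compatible client library works the same way; only the base URL and model name above are specific to this deployment.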