This is an MXFP4_MOE quantization of the GLM-4.7-Flash model.

The suggested parameters from the official docs for general chat are:

--temp 1.0
--top-p 0.95
--min-p 0.01
--repeat-penalty 1.0

And for tool-calling:

--temp 0.7
--top-p 1.0
--min-p 0.01
--repeat-penalty 1.0
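
As a minimal sketch of how the two presets above might be applied, the snippet below selects the sampling flags by use case and prints a llama.cpp command line. It assumes a llama.cpp build that provides `llama-cli`. The `sampler_args` helper, the `MODE` variable, and the model filename are hypothetical names introduced for illustration only; substitute the path of the GGUF file you actually downloaded.

```shell
#!/bin/sh
# Sketch only: sampler_args, MODE and the MODEL path are hypothetical names.

# Print the sampling flags for a given use case ("tools" or anything else).
sampler_args() {
  case "$1" in
    tools) echo "--temp 0.7 --top-p 1.0 --min-p 0.01 --repeat-penalty 1.0" ;;
    *)     echo "--temp 1.0 --top-p 0.95 --min-p 0.01 --repeat-penalty 1.0" ;;
  esac
}

# Assumed local path to the quantized model file; adjust to your download.
MODEL="${MODEL:-./GLM-4.7-Flash-MXFP4_MOE.gguf}"
MODE="${MODE:-chat}"

# The command is printed rather than executed, so the sketch runs even
# without llama.cpp installed. Remove "echo" to launch the model.
echo llama-cli -m "$MODEL" $(sampler_args "$MODE")
```

Running the script with `MODE=tools` prints the command with the tool-calling preset instead of the general chat one.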

Format: GGUF
Model size: 30B params
Architecture: deepseek2
Quantization: 4-bit
