This is an MXFP4_MOE quantization of the GLM-4.7-Flash model.

The suggested parameters from the official docs for general chat are:

--temp 1.0
--top-p 0.95
--min-p 0.01
--repeat-penalty 1.0

And for tool-calling:

--temp 0.7
--top-p 1.0
--min-p 0.01
--repeat-penalty 1.0
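
As a rough sketch of how the two presets above could be passed on the command line, the script below assembles them into `llama-cli` invocations. It assumes llama.cpp's `llama-cli` is the runtime in use; the model path and the `preset_flags` helper are hypothetical names introduced only for illustration. The script prints the commands rather than running them, so nothing here depends on llama.cpp being installed.

```shell
#!/bin/sh
# Hypothetical local path to the downloaded GGUF file; adjust to your setup.
MODEL="./GLM-4.7-Flash-MXFP4_MOE.gguf"

# preset_flags is an illustrative helper, not part of llama.cpp.
# It echoes the sampling flags listed above for the given use case.
preset_flags() {
  case "$1" in
    chat)  echo "--temp 1.0 --top-p 0.95 --min-p 0.01 --repeat-penalty 1.0" ;;
    tools) echo "--temp 0.7 --top-p 1.0 --min-p 0.01 --repeat-penalty 1.0" ;;
    *)     echo "unknown preset: $1" >&2; return 1 ;;
  esac
}

# Print the resulting commands instead of executing them.
echo "llama-cli -m $MODEL $(preset_flags chat)"
echo "llama-cli -m $MODEL $(preset_flags tools)"
```

To actually start a session, copy one of the printed lines into a terminal once the model file is in place.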

Format: GGUF
Model size: 30B params
Architecture: deepseek2