This is an imatrix MXFP4_MOE quantization of the model GLM-4.6V, based on the imatrix from unsloth.

You will need an up-to-date build of llama.cpp to run it.

For the mmproj, use the highest-precision version you can fit in memory to get the best results: F32 > BF16 > F16 > Q8_0.
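As a sketch of how the model and mmproj pair up at runtime, here is a hedged example using llama.cpp's `llama-server` (recent builds accept an `--mmproj` flag for multimodal models). The filenames below are illustrative placeholders, not the exact files in this repo; substitute the GGUF files you actually downloaded.

```shell
# Hypothetical filenames -- adjust to the actual files from this repo.
# -m       : the quantized main model
# --mmproj : the multimodal projector (prefer F32/BF16 over F16/Q8_0 if it fits)
# -c       : context size; -ngl : layers to offload to GPU
llama-server \
  -m GLM-4.6V-MXFP4_MOE.gguf \
  --mmproj mmproj-GLM-4.6V-F32.gguf \
  -c 8192 \
  -ngl 99
```

Once the server is up, you can send images alongside text through its OpenAI-compatible chat endpoint.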

Model details:
- Format: GGUF
- Model size: 107B params
- Architecture: glm4moe
- Quantization: 4-bit (MXFP4)


Model tree for noctrex/GLM-4.6V-MXFP4_MOE-GGUF:
- Base model: zai-org/GLM-4.6V (this model is a quantization of it)