What code did you use for MXFP4 quantization?
#2
by marksverdhei - opened
The MXFP4 quantizer in HF Transformers was broken until recently, so I'm guessing you didn't use that?
I'm using the llama.cpp implementation.
noctrex changed discussion status to closed
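
For readers unfamiliar with the format discussed above, here is a minimal Python sketch of MXFP4 block quantization as the OCP Microscaling format describes it: 32 elements share one power-of-two (E8M0) scale, and each element is stored as a 4-bit E2M1 value. This is not the llama.cpp code mentioned in the thread, nor the Transformers quantizer. The names `quantize_block` and `dequantize_block` are hypothetical, and the rounding is a simple nearest-value choice that real implementations may handle differently.

```python
import math

# Magnitudes representable by an FP4 E2M1 element (the sign is a separate bit).
E2M1_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
BLOCK_SIZE = 32  # number of elements sharing one scale in MXFP4
E2M1_EMAX = 2    # exponent of the largest E2M1 magnitude (6 = 1.5 * 2**2)


def quantize_block(block):
    """Quantize one block of floats; returns (shared_exponent, 4-bit codes).

    Illustrative helper (hypothetical name), not taken from any library.
    """
    assert len(block) == BLOCK_SIZE
    amax = max(abs(x) for x in block)
    # Pick a power-of-two scale so that amax falls in the top E2M1 binade.
    exp = math.floor(math.log2(amax)) - E2M1_EMAX if amax > 0 else 0
    exp = max(-127, min(127, exp))  # exponent range of an E8M0 scale
    scale = 2.0 ** exp
    codes = []
    for x in block:
        mag = abs(x) / scale
        # Nearest representable magnitude; ties go to the smaller value,
        # and anything above 6 saturates to 6.
        idx = min(range(8), key=lambda i: abs(E2M1_VALUES[i] - mag))
        sign_bit = 8 if x < 0 else 0
        codes.append(sign_bit | idx)
    return exp, codes


def dequantize_block(exp, codes):
    """Reconstruct floats from a shared exponent and 4-bit codes."""
    scale = 2.0 ** exp
    return [(-1.0 if c & 8 else 1.0) * E2M1_VALUES[c & 7] * scale for c in codes]
```

Usage: `exp, codes = quantize_block(weights[i:i + 32])` for each 32-element slice of a flat weight list, then `dequantize_block(exp, codes)` to inspect the rounding error. A value such as 12.0 with an otherwise zero block yields a shared exponent of 1 and round-trips exactly, while a value such as 7.0 saturates to 6.0.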