command to create GGUF MXFP4 mixed with BF16
#5
by ghit72 - opened
Hello noctrex,
I'm new to this topic. Could you share the command or workflow you used to mix MXFP4_MOE with the BF16 tensors? I played around with different variations of the llama-quantize command using MXFP4_MOE and "COPY", but without success.
Thank you.
I made the change in the source code. Go to src/llama-quant.cpp and find this section:
} else if (ftype == LLAMA_FTYPE_MOSTLY_MXFP4_MOE) {
    // MoE tensors -> MXFP4
    // other tensors -> Q8_0
    if (tensor->ne[2] > 1) {
        new_type = GGML_TYPE_MXFP4;
    } else {
        new_type = GGML_TYPE_Q8_0;
    }
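In the else branch, change GGML_TYPE_Q8_0 to GGML_TYPE_BF16, then recompile. With that change, quantizing with MXFP4_MOE keeps the non-MoE tensors in BF16 instead of Q8_0.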
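For anyone who wants to see the patched rule in isolation, here is a minimal standalone sketch. It is not the llama.cpp source: the enum and the function names (sketch_type, pick_type_patched, check_sketch) are hypothetical stand-ins so the snippet compiles on its own, and ne2 plays the role of tensor->ne[2] from the snippet above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the GGML_TYPE_* values, so that this sketch
// compiles without the llama.cpp headers. They are not the real enum.
enum sketch_type { SKETCH_TYPE_MXFP4, SKETCH_TYPE_Q8_0, SKETCH_TYPE_BF16 };

// Mirrors the patched branch for the MXFP4_MOE file type.
// ne2 stands in for tensor->ne[2]; per the comment in the original code,
// tensors with ne[2] > 1 are the MoE tensors.
sketch_type pick_type_patched(int64_t ne2) {
    if (ne2 > 1) {
        return SKETCH_TYPE_MXFP4;  // MoE tensors -> MXFP4 (unchanged)
    }
    return SKETCH_TYPE_BF16;       // other tensors -> BF16 (was Q8_0)
}

// Small self-check of the rule above.
void check_sketch() {
    assert(pick_type_patched(64) == SKETCH_TYPE_MXFP4);
    assert(pick_type_patched(1) == SKETCH_TYPE_BF16);
}
```

In the real file, only the single type name in the else branch changes. It is probably worth updating the "other tensors -> Q8_0" comment as well, so the code and the comment stay in agreement.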
noctrex changed discussion status to closed