Q4_K_L metadata wrongly indicates Q4_K_M quantization

#1
by justinbowes - opened

Can't pull directly in llama.cpp. Inspecting the Q4_K_L GGUF, the quantization is shown as Q4_K_M, even though the file sizes differ.


  • Q3_K_L: not affected
  • Q4_K_L: as above
  • Q5_K_L: file_type is Q5_K_M (same issue)
  • Q6_K_L: file_type is Q6_K (similar issue)

I just spot-checked bartowski/Qwen_Qwen3.5-397B-A17B-GGUF and Q5 looks the same.

Yeah, that's just how it works: there is no official "Q4_K_L" file type in llama.cpp, it's Q4_K_M with the input/output tensors set to Q8_0, so the metadata reports Q4_K_M.
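For reference, the reported-vs-labeled mapping can be summarized as a small lookup. This is a sketch reflecting the naming convention described above, not an official llama.cpp enum; note that Q3_K_L, unlike the others, *is* a real llama.cpp file type, which is why it's unaffected.

```python
# Assumed mapping: "_L" uploads reuse the base quant's file_type,
# since only the embedding/output tensors are bumped to Q8_0.
BASE_FILE_TYPE = {
    "Q4_K_L": "Q4_K_M",
    "Q5_K_L": "Q5_K_M",
    "Q6_K_L": "Q6_K",
}

def reported_file_type(label: str) -> str:
    """Return the file_type name llama.cpp metadata should report
    for a given download label."""
    # Labels not in the map (e.g. Q3_K_L, an official ftype) pass through.
    return BASE_FILE_TYPE.get(label, label)

print(reported_file_type("Q4_K_L"))  # Q4_K_M
print(reported_file_type("Q3_K_L"))  # Q3_K_L
```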
