Q4_K_L metadata wrongly indicates Q4_K_M quantization

#1
by justinbowes - opened

Can't pull directly in llama.cpp. Inspecting the Q4_K_L GGUF, the quantization is shown as Q4_K_M, even though the file sizes differ.


  • Q3_K_L: not affected
  • Q4_K_L: as above
  • Q5_K_L: file_type is Q5_K_M (same issue)
  • Q6_K_L: file_type is Q6_K (similar issue)

I just spot-checked bartowski/Qwen_Qwen3.5-397B-A17B-GGUF and Q5 looks the same.

Yeah, that's just how it works: there is no official "Q4_K_L" file type in llama.cpp, it's Q4_K_M with the input/output tensors set to Q8_0, so the metadata reports Q4_K_M.
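For reference, the reported-vs-labeled mapping can be summarized as a small lookup. This is a sketch reflecting the naming convention described above, not an official llama.cpp enum; note that Q3_K_L, unlike the others, *is* a real llama.cpp file type, which is why it's unaffected.

```python
# Assumed mapping: "_L" uploads reuse the base quant's file_type,
# since only the embedding/output tensors are bumped to Q8_0.
BASE_FILE_TYPE = {
    "Q4_K_L": "Q4_K_M",
    "Q5_K_L": "Q5_K_M",
    "Q6_K_L": "Q6_K",
}

def reported_file_type(label: str) -> str:
    """Return the file_type name llama.cpp metadata should report
    for a given download label."""
    # Labels not in the map (e.g. Q3_K_L, an official ftype) pass through.
    return BASE_FILE_TYPE.get(label, label)

print(reported_file_type("Q4_K_L"))  # Q4_K_M
print(reported_file_type("Q3_K_L"))  # Q3_K_L
```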
