Bad quality code generation
#1
by saadsafi - opened
Q5_K_M gives low-quality code on the standard tests I run. Also, Q5_K_S is significantly smaller than Q4_K_M.
I get the same low-quality code from the "unsloth" quants (both UD-Q4_K_XL and UD-Q3_K_XL)
and from the Q4_K_M from "lmstudio-community".
I'm using the latest llama.cpp release(s) on Windows (CUDA 13.1).
There seems to be something special about this model.
This model has a very weird architecture, so it might be that the standard techniques for optimizing quants don't work very well here.