Using the new gguf quant method may result in a worse overall performance than that of the old gguf quants.
#2
by TheYuriLover - opened
Source : https://github.com/ggerganov/llama.cpp/discussions/5006
The problem with using a calibration dataset is overfitting to a particular style, which in consequence makes the model worse in other respects.
Supposedly, the suggested fix is to use a calibration dataset composed of random tokens instead.
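To make the idea concrete, here is a minimal Python sketch of what "random-token" calibration data could look like. The helper name `make_random_calibration_text` and the output filename are hypothetical; the assumption is that the resulting file would be fed to llama.cpp's imatrix tool as the calibration text instead of a style-specific corpus.

```python
import random
import string

def make_random_calibration_text(n_chunks=50, words_per_chunk=64, seed=0):
    """Build pseudo-random 'text' so the importance matrix is not
    biased toward any one writing style (hypothetical helper)."""
    rng = random.Random(seed)
    chunks = []
    for _ in range(n_chunks):
        # Random lowercase "words" of length 2-10, no real-language bias.
        words = [
            "".join(rng.choices(string.ascii_lowercase, k=rng.randint(2, 10)))
            for _ in range(words_per_chunk)
        ]
        chunks.append(" ".join(words))
    return "\n".join(chunks)

# Write the random calibration file (filename is illustrative).
with open("random_calib.txt", "w") as f:
    f.write(make_random_calibration_text())
```

Whether truly random tokens give a better importance matrix than curated text is exactly the open question in the linked discussion; this only shows the mechanics.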
TheYuriLover changed discussion title from Using the new gguf quant method may result in a woese overall performance than that of the old gguf quants. to Using the new gguf quant method may result in a worse overall performance than that of the old gguf quants.
Thank you, we reverted to the old llama.cpp and it fixed it, afaik.