Using the new gguf quant method may result in a worse overall performance than that of the old gguf quants.
#2
by TheYuriLover - opened
Source : https://github.com/ggerganov/llama.cpp/discussions/5006
The problem with using a calibration dataset is overfitting to a particular style, which in consequence makes the model worse in other respects.
Supposedly, the suggested fix is to use a calibration dataset composed of random tokens instead.
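To make the idea concrete, here is a minimal Python sketch of what "random-token" calibration data could look like. The helper name `make_random_calibration_text` and the output filename are hypothetical; the assumption is that the resulting file would be fed to llama.cpp's imatrix tool as the calibration text instead of a style-specific corpus.

```python
import random
import string

def make_random_calibration_text(n_chunks=50, words_per_chunk=64, seed=0):
    """Build pseudo-random 'text' so the importance matrix is not
    biased toward any one writing style (hypothetical helper)."""
    rng = random.Random(seed)
    chunks = []
    for _ in range(n_chunks):
        # Random lowercase "words" of length 2-10, no real-language bias.
        words = [
            "".join(rng.choices(string.ascii_lowercase, k=rng.randint(2, 10)))
            for _ in range(words_per_chunk)
        ]
        chunks.append(" ".join(words))
    return "\n".join(chunks)

# Write the random calibration file (filename is illustrative).
with open("random_calib.txt", "w") as f:
    f.write(make_random_calibration_text())
```

Whether truly random tokens give a better importance matrix than curated text is exactly the open question in the linked discussion; this only shows the mechanics.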
TheYuriLover changed discussion title from Using the new gguf quant method may result in a woese overall performance than that of the old gguf quants. to Using the new gguf quant method may result in a worse overall performance than that of the old gguf quants.
Thank you, we reverted to the old llama.cpp and it fixed it, afaik.