GGUF models please?
#1
by ONTHEREDTEAM - opened
llama.cpp was updated to repack Q4_K variants for optimization. In my testing, both inference speed after loading and perplexity improved, but Q4_K_M is still slow. IQ4_NL and Q4_K_S perform very similarly now, so these days I'm mostly downloading imatrix Q4_K_S GGUFs.
This is one of the higher-ranked 8B models on
https://huggingface.co/spaces/DontPlanToEnd/UGI-Leaderboard
and I would like to test it out.
Thank you.