GGUFs

#2
by maigonis - opened

Hello.

I’m testing a coding-model version, and at first glance the results look very good: it lets me run Q4 quantizations on my 32 GB + 6 GB server, whereas previously I could only use Q3. So thank you, REAM developers, for your work.

I’m trying to quantize this instruct model to GGUF format using the ggml-org/gguf-my-repo tool, but it appears to crash. Could you please upload some GGUF files?
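As a possible workaround while waiting for official GGUFs: if gguf-my-repo crashes, the conversion can sometimes be done locally with llama.cpp's own scripts. This is only a sketch under the assumption that the model architecture is already supported by llama.cpp's converter; the model path `path/to/model` is a placeholder for the downloaded Hugging Face checkpoint.

```shell
# Get llama.cpp and the Python dependencies for its converter
git clone https://github.com/ggml-org/llama.cpp
pip install -r llama.cpp/requirements.txt

# Convert the HF checkpoint to an f16 GGUF
# (path/to/model is a placeholder for the local model directory)
python llama.cpp/convert_hf_to_gguf.py path/to/model \
    --outfile model-f16.gguf --outtype f16

# Build llama.cpp, then quantize down to Q4_K_M
cmake -B llama.cpp/build llama.cpp && cmake --build llama.cpp/build --target llama-quantize
./llama.cpp/build/bin/llama-quantize model-f16.gguf model-Q4_K_M.gguf Q4_K_M
```

If the converter fails with an unrecognized-architecture error, that usually means llama.cpp needs updated support for the model, which would also explain gguf-my-repo crashing.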

P.S. I’m also waiting for a thinking-model version.
