Will it be converted to ggml q4?
#1
by ai2p - opened
To run with llama.cpp.
ai2p changed discussion title from Will me ggml q4 version? to Will it be converted to ggml q4?
I've done a GPTQ 4bit version for GPU inference here: https://huggingface.co/TheBloke/medalpaca-13B-GPTQ-4bit
Tomorrow I'll look at GGMLs also.
I tried to convert it with the llama.cpp utils but got an error.
Oh yeah I forgot about this. I'll see what I can do.
Sorry. I finally managed it by just removing the two optimizer/scheduler .pt files before converting.
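For anyone hitting the same error: a minimal sketch of that cleanup step, removing training-only state files from the checkpoint directory before handing it to llama.cpp's convert script. The file-name pattern (optimizer/scheduler `.pt` files) is an assumption based on the message above; check your own checkpoint for the exact names.

```python
import pathlib

def strip_training_state(model_dir):
    """Delete optimizer/scheduler .pt files (training-only state) from a
    checkpoint directory, since llama.cpp's converter only needs the model
    weights. Name matching is an assumption; verify against your files."""
    removed = []
    for pt in pathlib.Path(model_dir).glob("*.pt"):
        if "optimizer" in pt.name or "scheduler" in pt.name:
            pt.unlink()  # remove the training-state file
            removed.append(pt.name)
    return sorted(removed)
```

After this, the remaining weight files can be fed to the llama.cpp conversion script as usual.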
I have done a set of GGMLs here, using latest llama.cpp: https://huggingface.co/TheBloke/medalpaca-13B-GGML