How to convert and quantize for llama.cpp? The already-quantized model gives an error
#2
by froilo - opened
How do I convert and quantize this for llama.cpp? The already-quantized file gives an error.
What error are you seeing?
I run this with ggml (https://github.com/ggerganov/ggml/), using its gpt2 binary.