I would like to know how these GGUF files are quantized?
#17
by chfm - opened
I tried various quantization tools, but all of them reported errors. Could you please share how you quantize the safetensors file?
I hope I got it packed, patched and uploaded; no guarantees at all.
https://github.com/phil2sat/convert
tool_auto.py downloads, checks out and patches llama.cpp. If you get compile errors (WATCH FOR THEM ON THE FIRST RUN), go into the llama.cpp.auto folder and run:
mkdir build
cmake -B build
cmake --build build --config Debug -j10 --target llama-quantize
cd ..
After this, tool_auto.py does its work:
python tool_auto.py --src /data/models/Qwen-rapid/Qwen-Rapid-NSFW-v9.0.safetensors --output /data/models/Qwen-rapid/out/v90 --temp-dir /daten/models/Qwen-rapid/tmp --quants all
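If you would rather script the two steps above, here is a rough Python sketch. It only wraps the commands already shown in this post (the cmake rebuild and the tool_auto.py call with --src, --output, --temp-dir and --quants). The function names rebuild_commands, convert_command and run are made up for illustration and are not part of the repo, and it assumes you start it from the folder that contains tool_auto.py and llama.cpp.auto.

```python
import subprocess
import sys


def rebuild_commands(jobs=10):
    """Manual rebuild of llama-quantize, as listed in the post.

    `cmake -B build` creates the build directory itself, so the
    separate `mkdir build` step is not repeated here.
    """
    return [
        ["cmake", "-B", "build"],
        ["cmake", "--build", "build", "--config", "Debug",
         f"-j{jobs}", "--target", "llama-quantize"],
    ]


def convert_command(src, output, temp_dir, quants="all"):
    """tool_auto.py invocation using only the flags shown in the post."""
    return [
        sys.executable, "tool_auto.py",
        "--src", str(src),
        "--output", str(output),
        "--temp-dir", str(temp_dir),
        "--quants", quants,
    ]


def run(src, output, temp_dir, llama_dir="llama.cpp.auto", rebuild=False):
    """Optionally rebuild llama-quantize, then start the conversion."""
    if rebuild:
        # Only needed when the first run ended with compile errors.
        for cmd in rebuild_commands():
            subprocess.run(cmd, cwd=llama_dir, check=True)
    subprocess.run(convert_command(src, output, temp_dir), check=True)
```

Calling run(src, output, temp_dir, rebuild=True) with your own paths does the rebuild first and then the conversion; leave rebuild at False if the first run compiled cleanly.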
Thank you very much for your script; it's great and it's working well. Excellent.