How are these GGUF files quantized?

#17

by chfm - opened Nov 7, 2025

Nov 7, 2025

I tried various quantization tools, but all of them reported errors. Could you please share how you quantize the safetensors file?

Phil2Sat

Owner Nov 7, 2025


edited Nov 7, 2025

I hope I got everything packed, patched, and uploaded; no guarantees.
https://github.com/phil2sat/convert
tool_auto.py downloads, checks out, and patches llama.cpp. If you get compile errors (watch for them on the first run), go into the llama.cpp.auto folder and run:

mkdir build
cmake -B build
cmake --build build --config Debug -j10 --target llama-quantize
cd ..
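The manual build fallback above can also be scripted. This is only a minimal sketch, assuming the llama.cpp.auto folder sits in the current directory and cmake is on PATH; the helper names `build_commands` and `run_build` are hypothetical and not part of the convert repository. The `mkdir build` step is left out because `cmake -B build` creates the directory itself.

```python
import subprocess
from pathlib import Path


def build_commands(jobs=10):
    """Return the two cmake invocations from the post, in order."""
    return [
        # Configure step: generates the build tree in ./build.
        ["cmake", "-B", "build"],
        # Build step: compiles only the llama-quantize target.
        ["cmake", "--build", "build", "--config", "Debug",
         f"-j{jobs}", "--target", "llama-quantize"],
    ]


def run_build(repo_dir="llama.cpp.auto", jobs=10):
    """Run the build inside repo_dir (hypothetical helper, not from the repo).

    Raises FileNotFoundError if the folder is missing and
    subprocess.CalledProcessError if a cmake step fails.
    """
    repo = Path(repo_dir)
    if not repo.is_dir():
        raise FileNotFoundError(f"{repo} not found; run tool_auto.py once first")
    for cmd in build_commands(jobs):
        subprocess.run(cmd, cwd=repo, check=True)
```

Calling `run_build()` from the folder that contains llama.cpp.auto should be equivalent to typing the commands by hand, and it stops at the first failing step instead of continuing.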

After this, tool_auto.py does its work:

python tool_auto.py --src /data/models/Qwen-rapid/Qwen-Rapid-NSFW-v9.0.safetensors --output /data/models/Qwen-rapid/out/v90 --temp-dir /daten/models/Qwen-rapid/tmp --quants all
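
For batch runs, the same invocation can be assembled from Python. This is a rough sketch that uses only the four flags shown in the command above; the function name `quantize_command` is hypothetical, and `all` is the only `--quants` value confirmed in this thread, so any other value is an untested assumption.

```python
import sys


def quantize_command(src, output, temp_dir, quants="all"):
    """Build the tool_auto.py argument list (hypothetical helper).

    Only --src, --output, --temp-dir and --quants appear in the post;
    no other options are assumed to exist.
    """
    return [
        sys.executable, "tool_auto.py",
        "--src", str(src),
        "--output", str(output),
        "--temp-dir", str(temp_dir),
        "--quants", quants,
    ]


# Example use, run from the folder that contains tool_auto.py:
#   import subprocess
#   subprocess.run(quantize_command("model.safetensors", "out", "tmp"), check=True)
```

Passing the arguments as a list avoids shell quoting problems when model paths contain spaces.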

chfm

Nov 7, 2025


edited Nov 7, 2025

Thank you very much for your script; it's great and works well.
