Experimental global target bits‑per‑weight quantization of HauhauCS/GLM-4.7-Flash-Uncensored-HauhauCS-Aggressive
- Using a non-standard (forked) llama.cpp branch for quantization.
- Using a CLI tool to build KLD evaluation and imatrix calibration datasets for GGUF models, sourced from eaddario/imatrix-calibration.
- Using dataset sources: tools, math, code, text_en, text_ru.
- Using dataset chunks: 750.
- Tensor quantization uses F16 instead of BF16, which is friendlier to NVIDIA Pascal GPUs such as the P100.
- A small set of patches was added.
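For readers unfamiliar with the KLD evaluation mentioned in the list above, here is a minimal sketch of the metric. It assumes the usual definition: the Kullback-Leibler divergence between the next-token distribution of the reference model and that of the quantized model, averaged over evaluated tokens. The function names (`log_softmax`, `kl_divergence`, `mean_kld`) and the toy logits are hypothetical and are not taken from the forked branch or the dataset tool.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one row of logits."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


def kl_divergence(base_logits, quant_logits):
    """KL(P_base || P_quant) for a single token position, in nats."""
    log_p = log_softmax(base_logits)
    log_q = log_softmax(quant_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def mean_kld(base_rows, quant_rows):
    """Average the per-token KL divergence over all evaluated positions."""
    values = [kl_divergence(b, q) for b, q in zip(base_rows, quant_rows)]
    return sum(values) / len(values)


# Toy example: two token positions over a three-entry vocabulary (made-up numbers).
base = [[2.0, 1.0, 0.1], [0.5, 0.5, 3.0]]
quant = [[1.9, 1.1, 0.0], [0.4, 0.7, 2.8]]
print(mean_kld(base, quant))
```

A value of zero means the quantized model reproduces the reference distribution exactly; larger values mean the two distributions diverge more.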
Many thanks to Ed Addario for an impressive job.
Quantization comparison
| BPW/TGS | PPL correlation | PPL mean ratio | ΔPPL | Mean KLD | Median KLD | Maximum KLD | 99.9% KLD | Mean Δp | RMS Δp |
|---|---|---|---|---|---|---|---|---|---|
| 3.50 | 93.18% | 1.205301 ± 0.003693 | 3.066378 ± 0.059410 | 0.346790 ± 0.003130 | 0.113532 | 36.079124 | 17.036463 | -1.548 ± 0.027 % | 12.313 ± 0.063 % |
| 4.00 | 92.50% | 1.081636 ± 0.003399 | 1.219317 ± 0.050468 | 0.350924 ± 0.003492 | 0.094924 | 35.051384 | 19.111687 | -0.644 ± 0.027 % | 11.874 ± 0.066 % |
| 4.50 | 93.92% | 1.159525 ± 0.003402 | 2.382673 ± 0.054462 | 0.212138 ± 0.003037 | 0.035486 | 37.016945 | 19.404484 | -0.722 ± 0.019 % | 8.384 ± 0.070 % |
| 5.00 | 94.32% | 1.030399 ± 0.002807 | 0.454046 ± 0.041667 | 0.223456 ± 0.003202 | 0.029363 | 33.802094 | 20.041710 | -0.307 ± 0.018 % | 7.986 ± 0.072 % |
| 5.50 | 93.59% | 0.970038 ± 0.002771 | -0.447509 ± 0.042194 | 0.234948 ± 0.003451 | 0.024535 | 32.256840 | 19.587420 | 0.123 ± 0.020 % | 8.691 ± 0.081 % |
| 6.00 | 96.85% | 1.028155 ± 0.002101 | 0.420519 ± 0.031506 | 0.107335 ± 0.002290 | 0.008626 | 36.048412 | 17.149174 | -0.060 ± 0.012 % | 5.211 ± 0.072 % |
| 6.50 | 97.55% | 1.037597 ± 0.001880 | 0.561555 ± 0.028552 | 0.080116 ± 0.001919 | 0.007975 | 32.534607 | 14.545952 | -0.128 ± 0.011 % | 4.691 ± 0.069 % |
| 7.00 | 96.92% | 1.015746 ± 0.002049 | 0.235178 ± 0.030606 | 0.099637 ± 0.002328 | 0.003383 | 35.733624 | 17.560083 | 0.007 ± 0.010 % | 4.330 ± 0.078 % |
| 7.50 | 97.57% | 1.030245 ± 0.001857 | 0.451735 ± 0.028043 | 0.067106 ± 0.001900 | 0.002916 | 37.250160 | 14.828805 | -0.069 ± 0.009 % | 3.827 ± 0.077 % |
| 8.00 | 97.42% | 1.022089 ± 0.001892 | 0.329923 ± 0.028377 | 0.077545 ± 0.002062 | 0.002760 | 33.393574 | 16.344349 | -0.035 ± 0.009 % | 3.835 ± 0.077 % |
| 8.50 | 97.26% | 1.026630 ± 0.001957 | 0.397746 ± 0.029396 | 0.082303 ± 0.002123 | 0.002307 | 31.664230 | 17.153591 | -0.049 ± 0.009 % | 3.815 ± 0.079 % |
| 9.00 | 98.34% | 1.019983 ± 0.001520 | 0.298461 ± 0.022937 | 0.044058 ± 0.001500 | 0.001003 | 35.295940 | 10.766310 | -0.013 ± 0.007 % | 3.132 ± 0.075 % |
| 9.50 | 98.27% | 1.010330 ± 0.001530 | 0.154286 ± 0.022915 | 0.051995 ± 0.001669 | 0.000858 | 32.079849 | 13.546452 | 0.032 ± 0.007 % | 3.195 ± 0.078 % |
| 10.00 | 98.40% | 1.013286 ± 0.001478 | 0.198433 ± 0.022200 | 0.044456 ± 0.001528 | 0.000833 | 31.551548 | 12.255160 | 0.002 ± 0.007 % | 3.022 ± 0.076 % |
| 10.50 | 98.30% | 1.012990 ± 0.001525 | 0.194020 ± 0.022882 | 0.047429 ± 0.001597 | 0.000826 | 33.701038 | 13.457508 | 0.019 ± 0.007 % | 3.073 ± 0.078 % |
| 11.00 | 98.35% | 1.019113 ± 0.001514 | 0.285470 ± 0.022865 | 0.042238 ± 0.001490 | 0.000819 | 31.194330 | 11.399836 | 0.001 ± 0.006 % | 2.878 ± 0.075 % |
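Mean KLD in the table does not fall monotonically as BPW rises (for example, 5.50 scores worse than 4.50). The sketch below is one possible way to read the table, not part of the author's workflow: it keeps only the BPW levels whose mean KLD is lower than that of every smaller level. The values are copied from the Mean KLD column, and the helper name `improving_levels` is hypothetical.

```python
# (BPW, mean KLD) pairs copied from the comparison table above.
MEAN_KLD = [
    (3.50, 0.346790), (4.00, 0.350924), (4.50, 0.212138), (5.00, 0.223456),
    (5.50, 0.234948), (6.00, 0.107335), (6.50, 0.080116), (7.00, 0.099637),
    (7.50, 0.067106), (8.00, 0.077545), (8.50, 0.082303), (9.00, 0.044058),
    (9.50, 0.051995), (10.00, 0.044456), (10.50, 0.047429), (11.00, 0.042238),
]


def improving_levels(rows):
    """Return the BPW levels that beat the mean KLD of every smaller level."""
    best = float("inf")
    kept = []
    for bpw, kld in sorted(rows):
        if kld < best:
            best = kld
            kept.append(bpw)
    return kept


print(improving_levels(MEAN_KLD))
```

Levels missing from the result use more bits than a smaller level without a lower mean KLD. This check looks at one column only; the other metrics in the table may rank the levels differently.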