Experimental global target bits‑per‑weight quantization of unsloth/Qwen3-Embedding-0.6B
- Using a non-standard (forked) llama.cpp branch for quantization.
- Using a CLI tool to build KLD evaluation and imatrix calibration datasets for GGUF models, sourced from eaddario/imatrix-calibration.
- Using dataset sources: tools, text_en, text_ru.
- Using dataset chunks: 250.
- Tensor quantization uses F16 instead of BF16, which is friendlier to NVIDIA Pascal-architecture GPUs such as the P100.
- A small set of patches was added.
Many thanks to Ed Addario for an impressive job.
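
To make the "global target bits-per-weight" idea concrete, here is a minimal sketch of how a model-wide BPW figure can be computed from a per-tensor mix of quantization types. It is not the forked branch's actual selection algorithm. The per-type values are the standard llama.cpp block sizes; the `global_bpw` helper and the tensor sizes are hypothetical and chosen only for illustration.

```python
# Bits per weight for a few llama.cpp tensor types, derived from block layout
# (for example, Q8_0 stores 32 weights in 34 bytes: 34 * 8 / 32 = 8.5).
BITS_PER_WEIGHT = {
    "F16": 16.0,
    "Q8_0": 8.5,
    "Q6_K": 6.5625,
    "Q5_K": 5.5,
    "Q4_K": 4.5,
}


def global_bpw(tensors):
    """Return the size-weighted average bits per weight.

    `tensors` is a list of (element_count, quant_type) pairs.
    This helper is hypothetical and only illustrates the arithmetic.
    """
    total_bits = sum(count * BITS_PER_WEIGHT[qtype] for count, qtype in tensors)
    total_weights = sum(count for count, _ in tensors)
    return total_bits / total_weights


# Hypothetical mix: a quarter of the weights in Q8_0, the rest in Q4_K.
example_mix = [(1000, "Q8_0"), (3000, "Q4_K")]
print(global_bpw(example_mix))
```

A target-BPW quantizer searches for a per-tensor type assignment whose weighted average lands at or below the requested value, which is why the table rows below are labelled by BPW rather than by a single quantization type.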
Quantization comparison
| BPW | PPL correlation | PPL mean ratio | ΔPPL | Mean KLD | Maximum KLD | 99.9% KLD | Mean Δp | RMS Δp |
|---|---|---|---|---|---|---|---|---|
| 5.00 | 97.51% | 1.229589 ± 0.004913 | 123.695152 ± 3.455547 | 0.254131 ± 0.001247 | 13.281865 | 3.295894 | -0.945 ± 0.030 % | 7.861 ± 0.073 % |
| 5.25 | 97.93% | 1.198584 ± 0.004387 | 106.990729 ± 3.089464 | 0.201051 ± 0.001020 | 16.191372 | 2.680833 | -0.726 ± 0.027 % | 7.070 ± 0.070 % |
| 5.30 | 98.05% | 1.199072 ± 0.004269 | 107.253986 ± 3.067627 | 0.181907 ± 0.000926 | 11.642469 | 2.371547 | -0.680 ± 0.026 % | 6.811 ± 0.067 % |
| 5.50 | 98.42% | 1.143577 ± 0.003665 | 77.354913 ± 2.488164 | 0.132384 ± 0.000693 | 9.056372 | 1.854777 | -0.483 ± 0.022 % | 5.860 ± 0.061 % |
| 5.75 | 98.67% | 1.107662 ± 0.003264 | 58.005137 ± 2.117964 | 0.097616 ± 0.000599 | 11.806184 | 1.765460 | -0.368 ± 0.019 % | 5.023 ± 0.062 % |
| 5.80 | 98.72% | 1.117510 ± 0.003238 | 63.310457 ± 2.168267 | 0.092513 ± 0.000578 | 10.540298 | 1.608747 | -0.386 ± 0.019 % | 4.906 ± 0.059 % |
| 6.00 | 98.71% | 1.125959 ± 0.003273 | 67.862792 ± 2.234984 | 0.092148 ± 0.000585 | 12.128985 | 1.559407 | -0.435 ± 0.019 % | 4.931 ± 0.061 % |
| 6.25 | 98.89% | 1.082024 ± 0.002912 | 44.191950 ± 1.831029 | 0.067634 ± 0.000431 | 7.849868 | 1.006397 | -0.264 ± 0.016 % | 4.210 ± 0.055 % |
| 6.30 | 98.94% | 1.089430 ± 0.002886 | 48.181944 ± 1.865568 | 0.062879 ± 0.000377 | 7.467944 | 0.950639 | -0.262 ± 0.016 % | 4.112 ± 0.055 % |
| 6.50 | 99.07% | 1.110195 ± 0.002773 | 59.369600 ± 1.962020 | 0.046909 ± 0.000273 | 5.759321 | 0.702734 | -0.293 ± 0.013 % | 3.510 ± 0.044 % |
| 6.75 | 99.22% | 1.071155 ± 0.002479 | 38.336145 ± 1.614146 | 0.027021 ± 0.000217 | 6.292867 | 0.423712 | -0.069 ± 0.011 % | 2.756 ± 0.051 % |
| 6.80 | 99.23% | 1.079566 ± 0.002483 | 42.867759 ± 1.667153 | 0.026098 ± 0.000205 | 6.737493 | 0.397282 | -0.098 ± 0.010 % | 2.714 ± 0.051 % |
| 7.00 | 99.24% | 1.083987 ± 0.002490 | 45.249386 ± 1.706124 | 0.023685 ± 0.000190 | 6.608492 | 0.377540 | -0.084 ± 0.010 % | 2.559 ± 0.049 % |
| 7.25 | 99.27% | 1.088304 ± 0.002447 | 47.575202 ± 1.717173 | 0.019738 ± 0.000130 | 2.865925 | 0.332578 | -0.090 ± 0.009 % | 2.290 ± 0.037 % |
| 7.30 | 99.28% | 1.084746 ± 0.002430 | 45.658594 ± 1.686292 | 0.019120 ± 0.000148 | 4.343514 | 0.295245 | -0.092 ± 0.009 % | 2.291 ± 0.048 % |
| 7.50 | 99.29% | 1.085544 ± 0.002411 | 46.088344 ± 1.685529 | 0.017551 ± 0.000142 | 4.905227 | 0.298486 | -0.078 ± 0.008 % | 2.215 ± 0.050 % |
| 7.75 | 99.32% | 1.091225 ± 0.002385 | 49.149079 ± 1.714694 | 0.014173 ± 0.000128 | 4.346512 | 0.237379 | -0.101 ± 0.008 % | 2.013 ± 0.052 % |
| 7.80 | 99.29% | 1.083209 ± 0.002410 | 44.830380 ± 1.669455 | 0.016992 ± 0.000133 | 3.100509 | 0.285571 | -0.077 ± 0.008 % | 2.162 ± 0.041 % |
| 8.00 | 99.31% | 1.080260 ± 0.002373 | 43.241571 ± 1.636024 | 0.015367 ± 0.000130 | 4.035001 | 0.269038 | -0.060 ± 0.008 % | 2.071 ± 0.049 % |
| 8.25 | 99.33% | 1.088309 ± 0.002352 | 47.578112 ± 1.683232 | 0.012024 ± 0.000105 | 4.370751 | 0.192501 | -0.089 ± 0.007 % | 1.860 ± 0.044 % |
| 8.30 | 99.34% | 1.081929 ± 0.002328 | 44.140973 ± 1.628892 | 0.011498 ± 0.000084 | 1.861967 | 0.182563 | -0.075 ± 0.007 % | 1.784 ± 0.037 % |
| 8.50 | 99.36% | 1.078487 ± 0.002286 | 42.286422 ± 1.587902 | 0.009095 ± 0.000081 | 2.475720 | 0.147575 | -0.064 ± 0.006 % | 1.635 ± 0.048 % |
| 8.75 | 99.37% | 1.078905 ± 0.002274 | 42.511791 ± 1.587059 | 0.007852 ± 0.000073 | 2.163608 | 0.118868 | -0.054 ± 0.006 % | 1.535 ± 0.049 % |
| 8.80 | 99.37% | 1.078538 ± 0.002273 | 42.314029 ± 1.583733 | 0.007742 ± 0.000079 | 2.609957 | 0.127498 | -0.053 ± 0.006 % | 1.533 ± 0.053 % |
| 9.00 | 99.37% | 1.077841 ± 0.002265 | 41.938514 ± 1.575722 | 0.007380 ± 0.000073 | 2.547039 | 0.125235 | -0.050 ± 0.006 % | 1.499 ± 0.053 % |
| 9.25 | 99.37% | 1.075351 ± 0.002256 | 40.596804 ± 1.555539 | 0.006905 ± 0.000066 | 2.041534 | 0.103068 | -0.042 ± 0.006 % | 1.448 ± 0.044 % |
| 9.30 | 99.37% | 1.071370 ± 0.002244 | 38.451639 ± 1.524417 | 0.006834 ± 0.000062 | 1.959199 | 0.105753 | -0.031 ± 0.006 % | 1.444 ± 0.040 % |
| 9.50 | 99.38% | 1.073834 ± 0.002244 | 39.779313 ± 1.540846 | 0.006470 ± 0.000062 | 1.991340 | 0.098428 | -0.041 ± 0.005 % | 1.400 ± 0.041 % |
| 9.75 | 99.38% | 1.074517 ± 0.002240 | 40.147460 ± 1.543907 | 0.006206 ± 0.000068 | 2.638701 | 0.101805 | -0.038 ± 0.005 % | 1.388 ± 0.053 % |
| 9.80 | 99.38% | 1.076663 ± 0.002244 | 41.303655 ± 1.559774 | 0.006146 ± 0.000069 | 2.405736 | 0.094807 | -0.044 ± 0.005 % | 1.401 ± 0.057 % |
| 10.00 | 99.38% | 1.076016 ± 0.002240 | 40.955259 ± 1.553864 | 0.005796 ± 0.000055 | 2.059185 | 0.095681 | -0.049 ± 0.005 % | 1.313 ± 0.039 % |
| 10.25 | 99.39% | 1.074654 ± 0.002230 | 40.220957 ± 1.540922 | 0.005526 ± 0.000061 | 2.114943 | 0.099634 | -0.043 ± 0.005 % | 1.315 ± 0.050 % |
| 10.30 | 99.39% | 1.072069 ± 0.002222 | 38.828737 ± 1.519978 | 0.005418 ± 0.000059 | 2.062696 | 0.094137 | -0.038 ± 0.005 % | 1.288 ± 0.047 % |
| 10.50 | 99.39% | 1.070241 ± 0.002212 | 37.843815 ± 1.503987 | 0.005140 ± 0.000048 | 1.621503 | 0.087822 | -0.032 ± 0.005 % | 1.218 ± 0.028 % |
| 10.75 | 99.39% | 1.066075 ± 0.002195 | 35.599290 ± 1.470596 | 0.004415 ± 0.000060 | 2.666409 | 0.074945 | -0.011 ± 0.004 % | 1.152 ± 0.049 % |
| 10.80 | 99.40% | 1.066699 ± 0.002195 | 35.935359 ± 1.473675 | 0.004566 ± 0.000051 | 2.669465 | 0.074244 | -0.025 ± 0.004 % | 1.109 ± 0.017 % |
| 11.00 | 99.40% | 1.071004 ± 0.002207 | 38.254894 ± 1.509298 | 0.004029 ± 0.000035 | 1.280535 | 0.060520 | -0.019 ± 0.004 % | 1.112 ± 0.045 % |
| 11.25 | 99.40% | 1.071390 ± 0.002201 | 38.462580 ± 1.508764 | 0.003687 ± 0.000024 | 0.614354 | 0.054238 | -0.025 ± 0.004 % | 0.995 ± 0.011 % |
| 11.30 | 99.40% | 1.071772 ± 0.002202 | 38.668624 ± 1.511744 | 0.003665 ± 0.000024 | 0.613789 | 0.055209 | -0.023 ± 0.004 % | 0.984 ± 0.011 % |
| 11.50 | 99.40% | 1.072891 ± 0.002200 | 39.271504 ± 1.518324 | 0.003498 ± 0.000022 | 0.426572 | 0.052573 | -0.031 ± 0.004 % | 0.977 ± 0.013 % |
| 11.75 | 99.41% | 1.067707 ± 0.002184 | 36.478332 ± 1.478998 | 0.002967 ± 0.000023 | 0.803354 | 0.045490 | -0.001 ± 0.003 % | 0.895 ± 0.013 % |
| 11.80 | 99.41% | 1.065799 ± 0.002180 | 35.450545 ± 1.465811 | 0.002931 ± 0.000021 | 0.697018 | 0.045669 | 0.006 ± 0.003 % | 0.899 ± 0.018 % |
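
For readers unfamiliar with the columns, the sketch below shows how the KLD and Δp statistics can be derived from base-model and quantized-model logits. It assumes the columns follow the usual llama.cpp KL-divergence output, where KLD is computed per token between the two next-token distributions and Δp is the change in probability assigned to the correct token. The function names and the toy logits are hypothetical, and the real tool works on full vocabulary logits over many chunks.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one token position."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_norm for x in logits]


def compare_logits(base_logits, quant_logits, targets):
    """Summarise divergence between base and quantized predictions.

    base_logits, quant_logits: one list of logits per token position.
    targets: index of the correct next token at each position.
    """
    klds, delta_ps = [], []
    for base, quant, target in zip(base_logits, quant_logits, targets):
        log_p = log_softmax(base)   # base model distribution
        log_q = log_softmax(quant)  # quantized model distribution
        # KL(P_base || P_quant) = sum_i p_i * (log p_i - log q_i)
        klds.append(sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q)))
        # Change in probability of the correct token (quantized minus base)
        delta_ps.append(math.exp(log_q[target]) - math.exp(log_p[target]))
    n = len(klds)
    return {
        "mean_kld": sum(klds) / n,
        "max_kld": max(klds),
        "mean_dp": sum(delta_ps) / n,
        "rms_dp": math.sqrt(sum(d * d for d in delta_ps) / n),
    }


# Toy data: two positions over a 3-token vocabulary (hypothetical values).
base = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]]
quant = [[1.9, 1.1, 0.1], [0.4, 2.4, 0.6]]
print(compare_logits(base, quant, targets=[0, 1]))
```

Under the same assumption, the PPL mean ratio is the quantized perplexity divided by the base perplexity and ΔPPL is their difference, so values closer to 1 and 0 respectively indicate less degradation.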