Experimental global target bits-per-weight quantization of HauhauCS/GLM-4.7-Flash-Uncensored-HauhauCS-Aggressive

  • Using a non-standard (forked) llama.cpp branch for quantization.
  • Using a CLI tool to build the KLD evaluation and imatrix calibration datasets for GGUF models, sourced from eaddario/imatrix-calibration.
  • Using dataset sources: tools, math, code, text_en, text_ru.
  • Using dataset chunks: 750.
  • Tensors quantized to F16 instead of BF16, which is friendlier to NVIDIA Pascal GPUs such as the P100.
  • A small set of patches applied.

Many thanks to Ed Addario for his impressive work.
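The KLD columns in the comparison table below measure how far each quant's next-token distribution drifts from the full-precision baseline. A toy sketch of how per-token KL divergence can be computed from two sets of logits (NumPy, synthetic data; this is an illustration of the metric, not the fork's actual implementation):

```python
import numpy as np

def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def token_kld(base_logits, quant_logits):
    """Per-token KL divergence D_KL(P_base || P_quant), in nats."""
    logp = log_softmax(base_logits)
    logq = log_softmax(quant_logits)
    return (np.exp(logp) * (logp - logq)).sum(axis=-1)

# Toy data: 1000 token positions, vocabulary of 32.
rng = np.random.default_rng(0)
base = rng.normal(size=(1000, 32))
quant = base + rng.normal(scale=0.05, size=base.shape)  # fake quantization noise

kld = token_kld(base, quant)
print(f"mean {kld.mean():.6f}  median {np.median(kld):.6f}  "
      f"max {kld.max():.6f}  99.9% {np.quantile(kld, 0.999):.6f}")
```

Summary statistics over these per-token values (mean, median, maximum, 99.9th percentile) are what the table reports.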

Quantization comparison

| BPW/TGS | PPL correlation | PPL mean ratio | ΔPPL | Mean KLD | Median KLD | Maximum KLD | 99.9% KLD | Mean Δp | RMS Δp |
|--------:|----------------:|---------------:|-----:|---------:|-----------:|------------:|----------:|--------:|-------:|
| 3.50 | 93.18% | 1.205301 ± 0.003693 | 3.066378 ± 0.059410 | 0.346790 ± 0.003130 | 0.113532 | 36.079124 | 17.036463 | -1.548 ± 0.027 % | 12.313 ± 0.063 % |
| 4.00 | 92.50% | 1.081636 ± 0.003399 | 1.219317 ± 0.050468 | 0.350924 ± 0.003492 | 0.094924 | 35.051384 | 19.111687 | -0.644 ± 0.027 % | 11.874 ± 0.066 % |
| 4.50 | 93.92% | 1.159525 ± 0.003402 | 2.382673 ± 0.054462 | 0.212138 ± 0.003037 | 0.035486 | 37.016945 | 19.404484 | -0.722 ± 0.019 % | 8.384 ± 0.070 % |
| 5.00 | 94.32% | 1.030399 ± 0.002807 | 0.454046 ± 0.041667 | 0.223456 ± 0.003202 | 0.029363 | 33.802094 | 20.041710 | -0.307 ± 0.018 % | 7.986 ± 0.072 % |
| 5.50 | 93.59% | 0.970038 ± 0.002771 | -0.447509 ± 0.042194 | 0.234948 ± 0.003451 | 0.024535 | 32.256840 | 19.587420 | 0.123 ± 0.020 % | 8.691 ± 0.081 % |
| 6.00 | 96.85% | 1.028155 ± 0.002101 | 0.420519 ± 0.031506 | 0.107335 ± 0.002290 | 0.008626 | 36.048412 | 17.149174 | -0.060 ± 0.012 % | 5.211 ± 0.072 % |
| 6.50 | 97.55% | 1.037597 ± 0.001880 | 0.561555 ± 0.028552 | 0.080116 ± 0.001919 | 0.007975 | 32.534607 | 14.545952 | -0.128 ± 0.011 % | 4.691 ± 0.069 % |
| 7.00 | 96.92% | 1.015746 ± 0.002049 | 0.235178 ± 0.030606 | 0.099637 ± 0.002328 | 0.003383 | 35.733624 | 17.560083 | 0.007 ± 0.010 % | 4.330 ± 0.078 % |
| 7.50 | 97.57% | 1.030245 ± 0.001857 | 0.451735 ± 0.028043 | 0.067106 ± 0.001900 | 0.002916 | 37.250160 | 14.828805 | -0.069 ± 0.009 % | 3.827 ± 0.077 % |
| 8.00 | 97.42% | 1.022089 ± 0.001892 | 0.329923 ± 0.028377 | 0.077545 ± 0.002062 | 0.002760 | 33.393574 | 16.344349 | -0.035 ± 0.009 % | 3.835 ± 0.077 % |
| 8.50 | 97.26% | 1.026630 ± 0.001957 | 0.397746 ± 0.029396 | 0.082303 ± 0.002123 | 0.002307 | 31.664230 | 17.153591 | -0.049 ± 0.009 % | 3.815 ± 0.079 % |
| 9.00 | 98.34% | 1.019983 ± 0.001520 | 0.298461 ± 0.022937 | 0.044058 ± 0.001500 | 0.001003 | 35.295940 | 10.766310 | -0.013 ± 0.007 % | 3.132 ± 0.075 % |
| 9.50 | 98.27% | 1.010330 ± 0.001530 | 0.154286 ± 0.022915 | 0.051995 ± 0.001669 | 0.000858 | 32.079849 | 13.546452 | 0.032 ± 0.007 % | 3.195 ± 0.078 % |
| 10.00 | 98.40% | 1.013286 ± 0.001478 | 0.198433 ± 0.022200 | 0.044456 ± 0.001528 | 0.000833 | 31.551548 | 12.255160 | 0.002 ± 0.007 % | 3.022 ± 0.076 % |
| 10.50 | 98.30% | 1.012990 ± 0.001525 | 0.194020 ± 0.022882 | 0.047429 ± 0.001597 | 0.000826 | 33.701038 | 13.457508 | 0.019 ± 0.007 % | 3.073 ± 0.078 % |
| 11.00 | 98.35% | 1.019113 ± 0.001514 | 0.285470 ± 0.022865 | 0.042238 ± 0.001490 | 0.000819 | 31.194330 | 11.399836 | 0.001 ± 0.006 % | 2.878 ± 0.075 % |
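To gauge which BPW target fits your hardware, a rough file-size estimate is params × BPW / 8 bytes. A minimal sketch, assuming the ~30B parameter count from this card and ignoring GGUF metadata overhead and any tensors kept unquantized (real files run slightly larger):

```python
PARAMS = 30e9  # ~30B parameters, per this model card

def est_size_gib(bpw: float, params: float = PARAMS) -> float:
    """Approximate GGUF file size in GiB for a global bits-per-weight target."""
    return params * bpw / 8 / 2**30

for bpw in (3.5, 4.0, 6.0, 8.0):
    print(f"{bpw:>4.1f} BPW -> ~{est_size_gib(bpw):.1f} GiB")
```

For example, the 8.00 BPW quant works out to roughly 28 GiB before overhead.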
Format: GGUF · Model size: 30B params · Architecture: deepseek2

Model: ENOSYS/GLM-4.7-Flash-Uncensored-750-v1-GGUF