turboderp committed on
Commit 077fe87 · verified · 1 Parent(s): 5b05794

Create README.md

Files changed (1)
  1. README.md +29 -0
README.md ADDED
@@ -0,0 +1,29 @@
+ ---
+ license: mit
+ base_model: z-lab/gemma-4-31B-it-DFlash
+ base_model_relation: quantized
+ quantized_by: turboderp
+ tags:
+ - exl3
+ ---
+
+ EXL3 quants of [gemma-4-31B-it-DFlash](https://huggingface.co/z-lab/gemma-4-31B-it-DFlash)
+
+ [2.50 bits per weight](https://huggingface.co/turboderp/gemma4-31b-it-DFlash-exl3/tree/2.50bpw)
+ [3.00 bits per weight](https://huggingface.co/turboderp/gemma4-31b-it-DFlash-exl3/tree/3.00bpw)
+ [3.50 bits per weight](https://huggingface.co/turboderp/gemma4-31b-it-DFlash-exl3/tree/3.50bpw)
+ [4.00 bits per weight](https://huggingface.co/turboderp/gemma4-31b-it-DFlash-exl3/tree/4.00bpw)
+ [5.00 bits per weight](https://huggingface.co/turboderp/gemma4-31b-it-DFlash-exl3/tree/5.00bpw)
+ [6.00 bits per weight](https://huggingface.co/turboderp/gemma4-31b-it-DFlash-exl3/tree/6.00bpw)
+
+ Quant | Mean acc. tokens¹
+ ---------|------------------
+ 2.50 bpw | 4.00
+ 3.00 bpw | 4.07
+ 3.50 bpw | 4.08
+ 4.00 bpw | 4.10
+ 5.00 bpw | 4.12
+ 6.00 bpw | 4.10
+ BF16 | 4.07
+
+ ¹ Mean verified tokens per 15-token draft, CatBench at temp=0, using 4.00bpw target model on current exllamav3 `dev` branch (upcoming v0.0.33)
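
As a reading aid for the table above, the following sketch converts each "mean verified tokens" figure into the fraction of a 15-token draft that was verified, and reports the spread across quants. The numbers are copied from the table. The names `DRAFT_LEN`, `mean_accepted`, and `accepted_fraction` are illustrative assumptions, not part of exllamav3 or the benchmark, and dividing by the draft length is simply one way to normalize the figures.

```python
# Mean verified tokens per 15-token draft, copied from the README table.
DRAFT_LEN = 15  # draft length stated in the table footnote

mean_accepted = {
    "2.50 bpw": 4.00,
    "3.00 bpw": 4.07,
    "3.50 bpw": 4.08,
    "4.00 bpw": 4.10,
    "5.00 bpw": 4.12,
    "6.00 bpw": 4.10,
    "BF16": 4.07,
}


def accepted_fraction(mean_tokens: float, draft_len: int = DRAFT_LEN) -> float:
    """Fraction of drafted tokens that were verified (hypothetical helper)."""
    return mean_tokens / draft_len


fractions = {quant: accepted_fraction(m) for quant, m in mean_accepted.items()}

# Difference between the best and worst quant, in tokens per draft.
spread = max(mean_accepted.values()) - min(mean_accepted.values())

for quant, frac in fractions.items():
    print(f"{quant:>8}: {frac:.1%} of drafted tokens verified")
print(f"spread across quants: {spread:.2f} tokens per draft")
```

Read this way, every row lands at roughly 27% of the draft, and the gap between the lowest and highest entries is about 0.12 tokens per draft.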