---
license: apache-2.0
base_model: z-lab/gemma-4-31B-it-DFlash
base_model_relation: quantized
quantized_by: turboderp
tags:
- exl3
---
EXL3 quants of [gemma-4-31B-it-DFlash](https://huggingface.co/z-lab/gemma-4-31B-it-DFlash)
[2.50 bits per weight](https://huggingface.co/turboderp/gemma4-31b-it-DFlash-exl3/tree/2.50bpw)
[3.00 bits per weight](https://huggingface.co/turboderp/gemma4-31b-it-DFlash-exl3/tree/3.00bpw)
[3.50 bits per weight](https://huggingface.co/turboderp/gemma4-31b-it-DFlash-exl3/tree/3.50bpw)
[4.00 bits per weight](https://huggingface.co/turboderp/gemma4-31b-it-DFlash-exl3/tree/4.00bpw)
[5.00 bits per weight](https://huggingface.co/turboderp/gemma4-31b-it-DFlash-exl3/tree/5.00bpw)
[6.00 bits per weight](https://huggingface.co/turboderp/gemma4-31b-it-DFlash-exl3/tree/6.00bpw)
Quant    | Mean accepted tokens¹
---------|------------------
2.50 bpw | 4.00
3.00 bpw | 4.07
3.50 bpw | 4.08
4.00 bpw | 4.10
5.00 bpw | 4.12
6.00 bpw | 4.10
BF16 | 4.07
¹ Mean verified tokens per 15-token draft, measured on CatBench at temp=0 with the 4.00bpw target model on the current exllamav3 `dev` branch (upcoming v0.0.33)
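
To put the figures in the table in proportion, the sketch below divides each mean by the 15-token draft length to get the fraction of a draft that is verified. It only restates the numbers above; `accepted_fraction` and `best_quant` are hypothetical helper names, and whether the reported mean counts any extra token produced by the target model is not stated here, so treat the ratio as indicative only.

```python
DRAFT_LEN = 15  # tokens per draft, from the footnote above

# Mean verified tokens per draft, copied from the table above.
MEAN_ACCEPTED = {
    "2.50 bpw": 4.00,
    "3.00 bpw": 4.07,
    "3.50 bpw": 4.08,
    "4.00 bpw": 4.10,
    "5.00 bpw": 4.12,
    "6.00 bpw": 4.10,
    "BF16": 4.07,
}


def accepted_fraction(mean_tokens: float, draft_len: int = DRAFT_LEN) -> float:
    """Fraction of a draft that is verified on average."""
    return mean_tokens / draft_len


def best_quant(table: dict[str, float]) -> str:
    """Name of the draft quant with the highest mean verified tokens."""
    return max(table, key=table.get)


if __name__ == "__main__":
    for name, mean_tokens in MEAN_ACCEPTED.items():
        print(f"{name:>8}: {accepted_fraction(mean_tokens):.1%} of each draft verified")
    print("highest:", best_quant(MEAN_ACCEPTED))
```

The spread across quants is small (4.00 to 4.12 tokens), so under these measurements the choice of draft quant changes the verified fraction by less than one percentage point.
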