---
license: apache-2.0
base_model: z-lab/gemma-4-31B-it-DFlash
base_model_relation: quantized
quantized_by: turboderp
tags:
- exl3
---

EXL3 quants of [gemma-4-31B-it-DFlash](https://huggingface.co/z-lab/gemma-4-31B-it-DFlash)

- [2.50 bits per weight](https://huggingface.co/turboderp/gemma4-31b-it-DFlash-exl3/tree/2.50bpw)
- [3.00 bits per weight](https://huggingface.co/turboderp/gemma4-31b-it-DFlash-exl3/tree/3.00bpw)
- [3.50 bits per weight](https://huggingface.co/turboderp/gemma4-31b-it-DFlash-exl3/tree/3.50bpw)
- [4.00 bits per weight](https://huggingface.co/turboderp/gemma4-31b-it-DFlash-exl3/tree/4.00bpw)
- [5.00 bits per weight](https://huggingface.co/turboderp/gemma4-31b-it-DFlash-exl3/tree/5.00bpw)
- [6.00 bits per weight](https://huggingface.co/turboderp/gemma4-31b-it-DFlash-exl3/tree/6.00bpw)
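
Each quant lives on its own branch of the repository, named after its bitrate (for example `4.00bpw`). A minimal sketch of fetching one of them follows, assuming the third-party `huggingface_hub` package is installed. The helper names `revision_for` and `download_quant` are hypothetical and exist only for this illustration.

```python
REPO_ID = "turboderp/gemma4-31b-it-DFlash-exl3"

# Bitrates listed in the links above.
AVAILABLE_BPW = (2.5, 3.0, 3.5, 4.0, 5.0, 6.0)


def revision_for(bpw):
    """Return the branch name for a bitrate, e.g. 4.0 -> '4.00bpw'."""
    if bpw not in AVAILABLE_BPW:
        raise ValueError(f"no {bpw} bpw quant; choose one of {AVAILABLE_BPW}")
    return f"{bpw:.2f}bpw"


def download_quant(bpw, local_dir):
    """Download one quant branch into local_dir and return the local path."""
    # Imported lazily so the helpers above work without the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        revision=revision_for(bpw),
        local_dir=local_dir,
    )
```

For example, `download_quant(4.0, "models/dflash-4.00bpw")` would place the 4.00 bpw files in that directory; the path is only a placeholder.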

| Quant    | Mean accepted tokens¹ |
|----------|-----------------------|
| 2.50 bpw | 4.00 |
| 3.00 bpw | 4.07 |
| 3.50 bpw | 4.08 |
| 4.00 bpw | 4.10 |
| 5.00 bpw | 4.12 |
| 6.00 bpw | 4.10 |
| BF16     | 4.07 |

¹ Mean number of verified tokens per 15-token draft, measured on CatBench at temperature 0 with a 4.00 bpw target model, on the current exllamav3 `dev` branch (upcoming v0.0.33).
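
To make the metric in the footnote concrete, here is a rough sketch of how such a mean could be computed from per-draft counts. It assumes each count is the number of draft tokens the target model verified, capped at the draft length of 15. The function name and the sample counts are made up for illustration and are not CatBench data or exllamav3 code.

```python
from statistics import fmean

DRAFT_LEN = 15  # draft length stated in the footnote


def mean_accepted_tokens(accepted_per_draft, draft_len=DRAFT_LEN):
    """Average number of verified tokens per draft."""
    counts = list(accepted_per_draft)
    if not counts:
        raise ValueError("need at least one draft")
    for n in counts:
        # Assumption: a draft cannot yield more verified tokens than it contains.
        if not 0 <= n <= draft_len:
            raise ValueError(f"count {n} is outside 0..{draft_len}")
    return fmean(counts)


# Made-up counts for five drafts, for illustration only.
example_counts = [3, 5, 4, 6, 3]
print(f"{mean_accepted_tokens(example_counts):.2f}")  # prints 4.20
```

A higher mean indicates that more of each draft survives verification by the target model.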