turboderp/Qwen3.6-27B-DFlash-exl3


	---
	license: mit
	base_model: z-lab/Qwen3.6-27B-DFlash
	base_model_relation: quantized
	quantized_by: turboderp
	tags:
	- exl3
	---

	EXL3 quants of [Qwen3.6-27B-DFlash](https://huggingface.co/z-lab/Qwen3.6-27B-DFlash)

	[2.50 bits per weight](https://huggingface.co/turboderp/Qwen3.6-27B-DFlash-exl3/tree/2.50bpw)
	[3.00 bits per weight](https://huggingface.co/turboderp/Qwen3.6-27B-DFlash-exl3/tree/3.00bpw)
	[3.50 bits per weight](https://huggingface.co/turboderp/Qwen3.6-27B-DFlash-exl3/tree/3.50bpw)
	[4.00 bits per weight](https://huggingface.co/turboderp/Qwen3.6-27B-DFlash-exl3/tree/4.00bpw)
	[5.00 bits per weight](https://huggingface.co/turboderp/Qwen3.6-27B-DFlash-exl3/tree/5.00bpw)
	[6.00 bits per weight](https://huggingface.co/turboderp/Qwen3.6-27B-DFlash-exl3/tree/6.00bpw)
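Each quant lives on its own branch of this repository, named after its bitrate (for example `4.00bpw`). As a minimal sketch, assuming the `huggingface_hub` package is installed, one branch can be fetched with `snapshot_download`. The helper names `branch_for` and `download_quant` are hypothetical and only illustrate the idea.

```python
from typing import Optional

REPO_ID = "turboderp/Qwen3.6-27B-DFlash-exl3"

# Bitrates listed in this README; each one maps to a branch of the repo.
AVAILABLE_BPW = (2.5, 3.0, 3.5, 4.0, 5.0, 6.0)


def branch_for(bpw: float) -> str:
    """Return the branch name for a bitrate, e.g. 4.0 -> '4.00bpw'."""
    if bpw not in AVAILABLE_BPW:
        raise ValueError(f"no {bpw} bpw quant listed; choose from {AVAILABLE_BPW}")
    return f"{bpw:.2f}bpw"


def download_quant(bpw: float, local_dir: Optional[str] = None) -> str:
    """Download one quant branch and return the local path (hypothetical helper)."""
    # Imported lazily so branch_for() works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        revision=branch_for(bpw),  # the branch name is used as the revision
        local_dir=local_dir,
    )
```

Calling `download_quant(4.0, local_dir="Qwen3.6-27B-DFlash-exl3-4.00bpw")` would fetch only the 4.00 bpw branch rather than every quant in the repository.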

	Quant    | Mean accepted tokens¹
	---------|----------------------
	2.50 bpw | 4.04
	3.00 bpw | 4.34
	3.50 bpw | 4.22
	4.00 bpw | 4.46
	5.00 bpw | 4.36
	6.00 bpw | 4.43
	BF16     | 4.12

	¹ Mean number of verified tokens per 15-token draft, measured on CatBench at temperature 0 with a 4.15 bpw target model.
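To make the metric concrete, here is a rough sketch of how a mean like this could be computed from per-draft counts. The benchmark script itself is not shown in this README, so the function names and the sample counts below are hypothetical. Only the draft length of 15 and the 4.46 figure come from the table above.

```python
from typing import Sequence

DRAFT_LEN = 15  # tokens proposed per draft, per the footnote


def mean_accepted_tokens(accepted_per_draft: Sequence[int]) -> float:
    """Average number of draft tokens the target model verified per draft."""
    if not accepted_per_draft:
        raise ValueError("need at least one draft")
    if any(n < 0 or n > DRAFT_LEN for n in accepted_per_draft):
        raise ValueError(f"counts must lie in 0..{DRAFT_LEN}")
    return sum(accepted_per_draft) / len(accepted_per_draft)


def accepted_fraction(mean_tokens: float, draft_len: int = DRAFT_LEN) -> float:
    """Share of each draft that was verified, on average."""
    return mean_tokens / draft_len


# Hypothetical counts from four drafts, not real benchmark data.
sample_counts = [4, 5, 3, 6]
print(mean_accepted_tokens(sample_counts))

# Applied to the 4.00 bpw row of the table.
print(round(accepted_fraction(4.46), 3))
```

Read this way, the 4.00 bpw row corresponds to a little under 30% of each 15-token draft being verified on average.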