# distil-qwen35-4b - GGUF

Static quantizations.

## Available Quantizations

Approximate BPW (bits per weight) and file size in decimal GB, ordered from highest precision to lowest.

| File | Approx. BPW | Approx. Size (GB) |
|---|---|---|
| distil-qwen35-4b-bf16.gguf | 16.00 | 8.42 |
| distil-qwen35-4b-q8_0.gguf | 8.51 | 4.48 |
| distil-qwen35-4b-q6_k.gguf | 6.57 | 3.46 |
| distil-qwen35-4b-q5_1.gguf | 6.09 | 3.21 |
| distil-qwen35-4b-q5_k_m.gguf | 5.83 | 3.07 |
| distil-qwen35-4b-q5_0.gguf | 5.67 | 2.99 |
| distil-qwen35-4b-q4_1.gguf | 5.24 | 2.77 |
| distil-qwen35-4b-q4_k_m.gguf | 5.13 | 2.71 |
| distil-qwen35-4b-q4_0.gguf | 4.82 | 2.54 |
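The listed sizes follow directly from the BPW figures: file size ≈ parameter count × BPW / 8. A quick sketch of that relation, where the effective parameter count (~4.21B) is back-solved from the bf16 file rather than taken from any official figure:

```python
# Effective parameter count, inferred (an assumption) from the bf16 file:
# 8.42 GB at 16.00 bits per weight.
N_PARAMS = 8.42e9 * 8 / 16.00  # ~4.21e9

def estimate_size_gb(bpw: float) -> float:
    """Estimate GGUF file size in decimal GB for a given bits-per-weight."""
    return N_PARAMS * bpw / 8 / 1e9

for name, bpw in [("q8_0", 8.51), ("q5_k_m", 5.83), ("q4_0", 4.82)]:
    print(f"{name}: ~{estimate_size_gb(bpw):.2f} GB")
    # q8_0: ~4.48 GB, q5_k_m: ~3.07 GB, q4_0: ~2.54 GB
```

The estimates match the table to within a hundredth of a GB, which suggests the published sizes were derived the same way.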

## Benchmark Performance

| Benchmark | Qwen 3.5 4B (Baseline) | iotaminer/distil-qwen35-4b | Delta |
|---|---|---|---|
| GSM8K (math) | 74.0 | 84.0 | +10.0 |
| ARC-Challenge | 54.0 | 59.0 | +5.0 |
| WinoGrande | 75.0 | 79.0 | +4.0 |
| IFEval | 19.0 | 23.0 | +4.0 |
| TruthfulQA MC2 | 49.1 | 51.6 | +2.5 |
| HellaSwag | 68.0 | 69.0 | +1.0 |
| MMLU-Pro | 57.2 | 52.9 | -4.3 |
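To try one of the files locally, a minimal sketch using `huggingface-cli` (from `huggingface_hub`) and llama.cpp's `llama-cli` might look like the following; the choice of the q4_k_m quantization here is just an example:

```shell
# Fetch a single quantization file from the repo
huggingface-cli download RemySkye/distil-qwen35-4b-GGUF \
  distil-qwen35-4b-q4_k_m.gguf --local-dir .

# Start an interactive chat session with llama.cpp
llama-cli -m distil-qwen35-4b-q4_k_m.gguf -cnv
```

Any other file from the table above can be substituted; lower-BPW files trade some quality for less RAM/VRAM.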
- Format: GGUF
- Model size: 4B params
- Architecture: qwen35