# distil-qwen35-4b - GGUF

Static quantizations.

## Available Quantizations

Approximate BPW (bits per weight) and file size in decimal GB, ordered from highest precision to lowest.

| File | Approx. BPW | Approx. Size (GB) |
|---|---|---|
| distil-qwen35-4b-bf16.gguf | 16.00 | 8.42 |
| distil-qwen35-4b-q8_0.gguf | 8.51 | 4.48 |
| distil-qwen35-4b-q6_k.gguf | 6.57 | 3.46 |
| distil-qwen35-4b-q5_1.gguf | 6.09 | 3.21 |
| distil-qwen35-4b-q5_k_m.gguf | 5.83 | 3.07 |
| distil-qwen35-4b-q5_0.gguf | 5.67 | 2.99 |
| distil-qwen35-4b-q4_1.gguf | 5.24 | 2.77 |
| distil-qwen35-4b-q4_k_m.gguf | 5.13 | 2.71 |
| distil-qwen35-4b-q4_0.gguf | 4.82 | 2.54 |
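The listed sizes follow directly from the BPW figures: file size ≈ parameter count × BPW / 8. A quick sketch of that relation, where the effective parameter count (~4.21B) is back-solved from the bf16 file rather than taken from any official figure:

```python
# Effective parameter count, inferred (an assumption) from the bf16 file:
# 8.42 GB at 16.00 bits per weight.
N_PARAMS = 8.42e9 * 8 / 16.00  # ~4.21e9

def estimate_size_gb(bpw: float) -> float:
    """Estimate GGUF file size in decimal GB for a given bits-per-weight."""
    return N_PARAMS * bpw / 8 / 1e9

for name, bpw in [("q8_0", 8.51), ("q5_k_m", 5.83), ("q4_0", 4.82)]:
    print(f"{name}: ~{estimate_size_gb(bpw):.2f} GB")
    # q8_0: ~4.48 GB, q5_k_m: ~3.07 GB, q4_0: ~2.54 GB
```

The estimates match the table to within a hundredth of a GB, which suggests the published sizes were derived the same way.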

## Benchmark Performance

| Benchmark | Qwen 3.5 4B (Baseline) | iotaminer/distil-qwen35-4b | Delta |
|---|---|---|---|
| GSM8K (math) | 74.0 | 84.0 | +10.0 |
| ARC-Challenge | 54.0 | 59.0 | +5.0 |
| WinoGrande | 75.0 | 79.0 | +4.0 |
| IFEval | 19.0 | 23.0 | +4.0 |
| TruthfulQA MC2 | 49.1 | 51.6 | +2.5 |
| HellaSwag | 68.0 | 69.0 | +1.0 |
| MMLU-Pro | 57.2 | 52.9 | -4.3 |
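To try one of the files locally, a minimal sketch using `huggingface-cli` (from `huggingface_hub`) and llama.cpp's `llama-cli` might look like the following; the choice of the q4_k_m quantization here is just an example:

```shell
# Fetch a single quantization file from the repo
huggingface-cli download RemySkye/distil-qwen35-4b-GGUF \
  distil-qwen35-4b-q4_k_m.gguf --local-dir .

# Start an interactive chat session with llama.cpp
llama-cli -m distil-qwen35-4b-q4_k_m.gguf -cnv
```

Any other file from the table above can be substituted; lower-BPW files trade some quality for less RAM/VRAM.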
- Format: GGUF
- Model size: 4B params
- Architecture: qwen35