Qwen3.5-2B-f32-GGUF

Qwen3.5-2B, from Alibaba's Qwen team, is a compact 2B-parameter dense multimodal causal language model with a vision encoder. It is part of the Qwen3.5 small series (0.8B to 9B) and uses a hybrid Gated DeltaNet architecture with a 3:1 ratio of linear-attention to softmax-attention blocks, for constant memory at its 262K-token native context (extensible to 1M tokens). It also features multi-token prediction and a 248K-token vocabulary spanning 201 languages.

Trained via early-fusion multimodal pre-training followed by post-training, it scores 84.5 on OCRBench and 75.6 on VideoMME. With thinking mode enabled it reaches 66.5 on MMLU-Pro (versus 55.3 non-thinking) and 78.6 on IFEval, outperforming prior 7B models while fitting in ~4 GB of VRAM (BF16) or 1.5 GB (4-bit) for edge deployment on laptops, mobile devices, or Pi-class boards. It supports text, images, and video natively, offers toggleable thinking to trade reasoning depth against speed, and is Apache 2.0-licensed for fine-tuning. This makes it an efficient agent foundation for constrained hardware that needs OCR, video understanding, coding, and multilingual reasoning.

Model Files

| File Name | Quant Type | File Size |
| --- | --- | --- |
| Qwen3.5-2B.BF16.gguf | BF16 | 3.78 GB |
| Qwen3.5-2B.F16.gguf | F16 | 3.78 GB |
| Qwen3.5-2B.F32.gguf | F32 | 7.54 GB |
| Qwen3.5-2B.Q8_0.gguf | Q8_0 | 2.01 GB |
| Qwen3.5-2B.mmproj-bf16.gguf | mmproj-bf16 | 671 MB |
| Qwen3.5-2B.mmproj-f16.gguf | mmproj-f16 | 671 MB |
| Qwen3.5-2B.mmproj-f32.gguf | mmproj-f32 | 1.33 GB |
| Qwen3.5-2B.mmproj-q8_0.gguf | mmproj-q8_0 | 365 MB |
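
The listed model file sizes can be sanity-checked with simple arithmetic: size ≈ number of weights × bits per weight / 8. The sketch below infers the weight count from the F32 file and re-estimates the other formats. It rests on several assumptions: sizes are in decimal GB (1e9 bytes), Q8_0 costs 8.5 bits per weight (blocks of 32 int8 values plus one 16-bit scale), and file metadata and any tensors kept at higher precision are ignored. The function names are illustrative only.

```python
# Bits per weight for each format. Q8_0 is assumed to store blocks of
# 32 int8 values plus one 16-bit scale: (32 * 8 + 16) / 32 = 8.5 bits.
BITS_PER_WEIGHT = {"F32": 32.0, "F16": 16.0, "BF16": 16.0, "Q8_0": 8.5}

# File sizes copied from the Model Files table (assumed decimal GB).
LISTED_GB = {"F32": 7.54, "F16": 3.78, "BF16": 3.78, "Q8_0": 2.01}


def implied_params(size_gb: float, bits: float) -> float:
    """Number of weights implied by a file size, ignoring metadata."""
    return size_gb * 1e9 * 8 / bits


def estimated_gb(n_params: float, bits: float) -> float:
    """Approximate file size in GB for a given weight count and format."""
    return n_params * bits / 8 / 1e9


# Use the F32 file as the reference point for the weight count.
n_params = implied_params(LISTED_GB["F32"], BITS_PER_WEIGHT["F32"])

estimates = {
    quant: estimated_gb(n_params, bits)
    for quant, bits in BITS_PER_WEIGHT.items()
}

if __name__ == "__main__":
    print(f"implied weights: {n_params / 1e9:.3f}B")
    for quant, size in estimates.items():
        print(f"{quant:5s} estimated {size:.2f} GB, listed {LISTED_GB[quant]:.2f} GB")
```

Under these assumptions the F32 file implies about 1.89B weights, and the estimates for F16, BF16, and Q8_0 land within a few hundredths of a GB of the listed sizes.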

Quants Usage

(Sorted by size, not necessarily by quality. IQ-quants are often preferable to similarly sized non-IQ quants.)

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

[Figure: quant type comparison graph by ikawrakow]
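
For multimodal use, a GGUF runtime generally needs both a model file and a projector (mmproj) file. The following is a minimal sketch of fetching one of each from this repository. It assumes the third-party `huggingface_hub` package is installed and that the file names follow the pattern shown under Model Files. The helper names `gguf_filenames` and `download` are hypothetical, and the default choice of Q8_0 with an f16 projector is only an example, not a recommendation from the model card.

```python
REPO_ID = "prithivMLmods/Qwen3.5-2B-f32-GGUF"

# Quant labels as they appear in the Model Files listing.
MODEL_QUANTS = ("BF16", "F16", "F32", "Q8_0")
MMPROJ_QUANTS = ("bf16", "f16", "f32", "q8_0")


def gguf_filenames(quant: str = "Q8_0", mmproj: str = "f16") -> tuple[str, str]:
    """Return (model file, projector file) names for the chosen precisions."""
    if quant not in MODEL_QUANTS:
        raise ValueError(f"unknown model quant: {quant!r}")
    if mmproj not in MMPROJ_QUANTS:
        raise ValueError(f"unknown mmproj quant: {mmproj!r}")
    return f"Qwen3.5-2B.{quant}.gguf", f"Qwen3.5-2B.mmproj-{mmproj}.gguf"


def download(quant: str = "Q8_0", mmproj: str = "f16") -> list[str]:
    """Download both files into the local Hugging Face cache.

    Requires network access; returns the local paths of the two files.
    """
    # Imported here so the name helper above works without the package.
    from huggingface_hub import hf_hub_download

    return [
        hf_hub_download(repo_id=REPO_ID, filename=name)
        for name in gguf_filenames(quant, mmproj)
    ]


if __name__ == "__main__":
    model_path, mmproj_path = download()
    print(model_path)
    print(mmproj_path)
```

The returned paths can then be passed to whichever GGUF runtime is in use. Whether a given runtime build supports the `qwen35` architecture is not something this sketch checks.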

Format: GGUF
Model size: 2B params
Architecture: qwen35
Model tree for prithivMLmods/Qwen3.5-2B-f32-GGUF: quantized from Qwen/Qwen3.5-2B.