Molmo2 GGUF

I won't be maintaining this model

An attempt at supporting Molmo2-4B on llama.cpp for personal use. This is a vibe-coded solution, so I am still benchmarking its actual performance and have yet to determine the best parameters. The language model works fine on its own, but seems increasingly lobotomized when the mmproj is included. The vision modality in the mmproj requires custom modifications to llama.cpp's mtmd code, which I am still optimizing.
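For reference, a minimal sketch of how one might run these files with the standard llama.cpp tools. The GGUF filenames below are placeholders (use whichever quant you downloaded from this repo), and the vision invocation assumes a build with the custom mtmd changes described above:

```sh
# Text-only: the language model works fine on its own.
./llama-cli -m Molmo2-4B-Q8_0.gguf -p "Describe GGUF in one sentence."

# Vision: load the mmproj alongside the model. As noted above, this path
# currently depends on custom modifications to llama.cpp's mtmd code.
./llama-mtmd-cli -m Molmo2-4B-Q8_0.gguf \
    --mmproj mmproj-Molmo2-4B-F16.gguf \
    --image example.png \
    -p "Describe this image."
```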

GGUF details
Model size: 4B params
Architecture: qwen3
Quantizations: 4-bit, 8-bit, 16-bit
