Molmo2 GGUF

I won't be maintaining this model

An attempt at supporting Molmo2-4B on llama.cpp for personal use. This is a vibe-coded solution, so I am still benchmarking its actual performance and have yet to determine the best parameters. The language model works fine on its own, but seems increasingly lobotomized when the mmproj is included. The vision modality in the mmproj requires custom modifications to llama.cpp's mtmd code, which I am still optimizing.
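For reference, a minimal sketch of how one might run these files with the standard llama.cpp tools. The GGUF filenames below are placeholders (use whichever quant you downloaded from this repo), and the vision invocation assumes a build with the custom mtmd changes described above:

```sh
# Text-only: the language model works fine on its own.
./llama-cli -m Molmo2-4B-Q8_0.gguf -p "Describe GGUF in one sentence."

# Vision: load the mmproj alongside the model. As noted above, this path
# currently depends on custom modifications to llama.cpp's mtmd code.
./llama-mtmd-cli -m Molmo2-4B-Q8_0.gguf \
    --mmproj mmproj-Molmo2-4B-F16.gguf \
    --image example.png \
    -p "Describe this image."
```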

GGUF details
Model size: 4B params
Architecture: qwen3
Quantizations: 4-bit, 8-bit, 16-bit
