Qwen 3.5 vision support
Decided to ask here since Kimi K2.5 had vision support from AesSedai. I couldn't find any references on how (or whether) to run vision on Qwen3.5 in llama.cpp; it seems the vision support is built in and doesn't use a separate mmproj file.
I don't know about that (you still seem to need a separate vision encoder, but it's at most ~2 GB), but adding this to the command line:
--mmproj mmproj-BF16.gguf
from https://huggingface.co/unsloth/Qwen3.5-397B-A17B-GGUF/blob/main/mmproj-BF16.gguf
worked for the unsloth quants in llama.cpp (llama-server, for example, starts accepting image input). It will probably work for this quant too, since GPT says the mmproj depends only on the architecture and the input/output layers, not on the quantization.
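For reference, a full invocation would look roughly like this (a sketch only: the model filename, host, and port are placeholders for your own setup, not values from this thread):

```shell
# Hypothetical example: pair the main model GGUF with the separate
# vision encoder (mmproj) so llama-server accepts image input.
# Filenames/paths below are placeholders; adjust to your download.
llama-server \
  -m Qwen3.5-397B-A17B-Q4_K_M.gguf \
  --mmproj mmproj-BF16.gguf \
  --host 127.0.0.1 --port 8080
```

Once it's up, the server's web UI and OpenAI-compatible endpoint should both let you attach images alongside the text prompt.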
Thanks - I've totally missed that!
@agentsmit The unsloth mmproj should work fine. I'll upload mine to my repo later today, but I'd imagine they'll end up basically identical.