vision capabilities?

by feliscat - opened Jul 6, 2025

Discussion

feliscat

Jul 6, 2025

Are the vision capabilities maintained in these files? I'm trying with llama.cpp and it doesn't work.

nicoboss

Jul 7, 2025

edited Jul 7, 2025

Yes, this is a vision model, and it does support vision if you provide the necessary MMPROJ files.

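For vision, you need to provide any LLM quant for the text layers and one of the following MMPROJ quants for the vision layers:

https://huggingface.co/mradermacher/InternVL3-38B-Instruct-GGUF/blob/main/InternVL3-38B-Instruct.mmproj-Q8_0.gguf

https://huggingface.co/mradermacher/InternVL3-38B-Instruct-GGUF/blob/main/InternVL3-38B-Instruct.mmproj-f16.gguf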

atdc1

Jul 18, 2025

> Yes, this is a vision model, and it does support vision if you provide the necessary MMPROJ files.
>
> For vision, you need to provide any LLM quant for the text layers and one of the following MMPROJ quants for the vision layers:
>
> https://huggingface.co/mradermacher/InternVL3-38B-Instruct-GGUF/blob/main/InternVL3-38B-Instruct.mmproj-Q8_0.gguf
>
> https://huggingface.co/mradermacher/InternVL3-38B-Instruct-GGUF/blob/main/InternVL3-38B-Instruct.mmproj-f16.gguf

I'm trying to run the model right now, and it seems like the models from this repo can't see the input images even with the mmproj provided. Maybe imatrix doesn't preserve the vision capabilities or something?

atdc1

Jul 18, 2025

Never mind, it could be that llama.cpp doesn't support multimodal input for this model yet?

atdc1

Jul 24, 2025

Never mind again: there seems to be something wrong with these quants. For whatever reason, the models have lost their vision capability and keep saying that the image is either missing or corrupted.

nicoboss

Jul 24, 2025

I will try them myself and let you know if I can reproduce your issue. I see no reason why it wouldn't work unless there was/is an issue inside llama.cpp.
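For anyone who wants to run the same check, here is a minimal sketch of pairing an LLM quant with an MMPROJ file. It assumes a recent llama.cpp build that ships the `llama-mtmd-cli` binary on your PATH (older builds may name the tool differently). The LLM quant filename and the image filename below are hypothetical placeholders, and `build_vision_command` and `missing_files` are illustrative helpers, not part of llama.cpp. The script only assembles and prints the command so you can inspect it before running it yourself.

```python
import shlex
from pathlib import Path


def build_vision_command(model_path, mmproj_path, image_path, prompt,
                         binary="llama-mtmd-cli"):
    """Return the argument list for one multimodal llama.cpp run.

    Assumption: `binary` is the multimodal CLI from a recent llama.cpp build.
    """
    return [
        binary,
        "-m", str(model_path),         # LLM quant (text layers)
        "--mmproj", str(mmproj_path),  # MMPROJ quant (vision layers)
        "--image", str(image_path),    # image to describe
        "-p", prompt,                  # text prompt
    ]


def missing_files(*paths):
    """Return the paths that do not exist as regular files."""
    return [str(p) for p in paths if not Path(p).is_file()]


if __name__ == "__main__":
    # Hypothetical LLM quant name; substitute the file you downloaded.
    model = "InternVL3-38B-Instruct.Q4_K_M.gguf"
    # MMPROJ file name taken from the links earlier in this thread.
    mmproj = "InternVL3-38B-Instruct.mmproj-Q8_0.gguf"
    # Hypothetical test image.
    image = "test.jpg"

    absent = missing_files(model, mmproj, image)
    if absent:
        print("Missing files:", ", ".join(absent))

    cmd = build_vision_command(model, mmproj, image, "Describe this image.")
    # Print a copy-pasteable shell command instead of executing it.
    print(shlex.join(cmd))
```

If the printed command runs but the model still reports that the image is missing or corrupted, swapping in a different LLM quant or the f16 MMPROJ while keeping everything else fixed should help narrow down which file is at fault.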
