vision capabilities?

#2
by feliscat - opened

Are the vision capabilities maintained in these files? I'm trying with llama.cpp and it doesn't work.

Yes, this is a vision model and it does support vision if you provide the necessary MMPROJ files.

For vision you need to provide any LLM quant for the text layers and one of the following MMPROJ quants for the vision layers:
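
As a rough illustration of pairing the two files, here is a minimal Python sketch that assembles a llama.cpp command line. It assumes a recent llama.cpp build that ships the `llama-mtmd-cli` binary with the `-m`, `--mmproj`, `--image` and `-p` options; older builds may use different binaries. The helper name `build_vision_command` and all file names are hypothetical placeholders, not files from this repo.

```python
import shlex


def build_vision_command(model, mmproj, image, prompt, binary="llama-mtmd-cli"):
    """Return the argv list for a llama.cpp multimodal run.

    Assumption: `binary` is llama.cpp's multimodal CLI. Both GGUF files
    are needed: `model` holds the text layers, `mmproj` the vision layers.
    """
    return [
        binary,
        "-m", str(model),          # any LLM quant (text layers)
        "--mmproj", str(mmproj),   # MMPROJ quant (vision layers)
        "--image", str(image),     # input image
        "-p", prompt,              # text prompt
    ]


if __name__ == "__main__":
    cmd = build_vision_command(
        "model.Q4_K_M.gguf",       # hypothetical file name
        "model.mmproj-f16.gguf",   # hypothetical file name
        "cat.png",                 # hypothetical image
        "Describe this image.",
    )
    # Print the command so it can be pasted into a shell.
    print(shlex.join(cmd))
```

If the model still reports that no image is present with both files passed this way, the problem is more likely in the quants or in llama.cpp's support for the architecture than in the invocation.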

I'm trying to run the model right now, and it seems the models from this repo can't see the input images even with the mmproj provided. Maybe imatrix doesn't preserve the vision capabilities or something?

Never mind, it could be that llama.cpp doesn't support multimodal for this model yet?

Never mind 2x, there seems to be something wrong with these quants. For whatever reason the models lost their vision capability and keep saying that the image is either missing or corrupted.

I will try them myself and let you know if I can reproduce your issue. I see no reason why it wouldn't work unless there was/is an issue inside llama.cpp.
