Image input not working?

by Garpez - opened 18 days ago

I'm using the UD-IQ3_XXS quant on llama.cpp, but the model's vision capabilities don't seem to be working. Am I missing something?

McUH

18 days ago

It works for me with Koboldcpp, using Q8 or Q6 quants and the BF16 mmproj file. I don't know how llama.cpp handles this directly; do you have the mmproj file loaded?
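
For reference, a minimal sketch of how the mmproj file is usually passed to llama.cpp's server, via the `--mmproj` flag alongside the main model. The file names and the port below are placeholders (assumptions, not taken from this repo), so substitute whatever you actually downloaded.

```python
import shlex


def build_llama_server_cmd(model_path, mmproj_path, port=8080):
    """Build the argument list for llama-server with a vision projector loaded."""
    return [
        "llama-server",
        "-m", model_path,          # main GGUF quant
        "--mmproj", mmproj_path,   # multimodal projector; image input needs this
        "--port", str(port),
    ]


# Hypothetical file names, for illustration only.
cmd = build_llama_server_cmd("gemma-4-31b-it-UD-IQ3_XXS.gguf", "mmproj-BF16.gguf")
print(shlex.join(cmd))
```

If the server is started without the `--mmproj` argument, image input would be expected to fail even though text generation works.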

Prof684

17 days ago

Image input is also not working for me. I'm using the Q5 version on llama.cpp.

HansMeier0001

16 days ago

Reasoning doesn't work for me with this model. It does with the official original gemma-4.

McUH

16 days ago

edited 16 days ago

Reasoning works well with everything I tried: Q8 and Q6 from here and Q4KL from bartowski (Koboldcpp + SillyTavern + text completion with my template). But you need to have the correct template, which is a bit complex with this model. There were initial problems in llama.cpp with the BOS token etc., though, so that could be the issue too. A good system prompt also helps to tell the model what it should actually reason about. I did not try <4bpw quants, but <4bpw reasoning models never worked well for me in the past anyway.
Make sure that besides the channel/thought tokens you also have the <|think|> token at the beginning of the first system prompt, otherwise it might refuse to reason despite everything else being set correctly.
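
A rough sketch of that last point, assuming you assemble the prompt yourself for text completion. The helper name is made up for illustration, and whether a space or newline should follow the token is an assumption you would need to check against the model's actual template.

```python
THINK_TOKEN = "<|think|>"


def ensure_think_token(system_prompt: str) -> str:
    """Return the system prompt with <|think|> as its very first characters."""
    if system_prompt.startswith(THINK_TOKEN):
        return system_prompt  # already in place, do not add it twice
    return THINK_TOKEN + system_prompt


first_system_prompt = ensure_think_token("Reason step by step before answering.")
```

This only handles the first system prompt; the channel/thought tokens still have to come from the template itself.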

HansMeier0001

16 days ago

edited 16 days ago

Reasoning works with google/gemma-4-31b, but not with unsloth/gemma-4-31b-it.
Shouldn't they both require the same settings?
In LM Studio, google/gemma-4-31b is shown as having reasoning capabilities, but unsloth/gemma-4-31b-it isn't. unsloth/gemma-4-31b-it also doesn't think when I use it. (Every model can do the fake thinking in SillyTavern, btw. It's really more like taking notes.)

Edit: I had to do this to enable thinking: https://huggingface.co/unsloth/gemma-4-26B-A4B-it-GGUF/discussions/6

McUH

16 days ago

I don't use LM Studio, so I can't help there. Just for reference: Q8, Q6, and I think also Q5KM from this repository all reasoned reliably for me. So when they don't work, it is probably a wrong setup or prompt. I can't say anything about the other quants from this repo, as I did not try those.
Also, google/gemma-4-31b (without -it) is the base model as far as I can see. It should not even use any instruct templates; it is just a text completion model / a model for further post-training. I am surprised that it even worked for you somehow. google/gemma-4-31b-it is the instruct model, and this repo is a quant of that. The -it version is the one you should use, unless you know what you are doing (e.g. you want the base model for some reason).

Garpez changed discussion status to closed 12 days ago

Garpez changed discussion status to open 12 days ago
