Thank you for the well-functioning uncensored quantization of Qwen3.5-35B-A3B
I just want to share that I am very pleased with how well Qwen3.5-35B-A3B-heretic-GGUF is working in llama.cpp. I am using Qwen3.5-35B-A3B-heretic.Q8_0.gguf and am very happy with its performance so far. No refusals and no model "brain-damage" detected in my testing.
Why isn't it a Vision Model?
Could I, for example, add an mmproj-F32 file that would then work perfectly? Well, I'll test it.
Yes, you can add mmproj-F32 and it will process images just fine. I just did it, and it works flawlessly.
The one that works for me, is mmproj from https://huggingface.co/unsloth/Qwen3.5-35B-A3B-GGUF/tree/main
Okay, thanks, I'll test that one; mine didn't work ("Failed to load model").
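A "Failed to load model" error is often a truncated or corrupted download rather than an incompatible file. A quick sanity check (just a sketch; the helper name `check_gguf` is my own): every valid GGUF file begins with the 4-byte magic `GGUF`, so an HTML error page or partial download saved under a `.gguf` name is easy to catch.

```shell
# Sketch: check that a file at least starts with the GGUF magic bytes.
# Catches truncated downloads and HTML error pages saved as .gguf files.
check_gguf() {
  magic=$(head -c 4 "$1" 2>/dev/null)
  if [ "$magic" = "GGUF" ]; then
    echo "OK: $1 starts with the GGUF magic"
  else
    echo "BAD: $1 does not look like a GGUF file"
  fi
}
```

Usage: `check_gguf /AI/models/your-mmproj-F32.gguf`. If this reports BAD, re-download the file before digging into llama.cpp itself.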
Great, it works now!
Why doesn't mradermacher include that from the start, perhaps even with their own config...?
Unfortunately, the "heretic" was unsuccessful... i don't think it will happen that quickly.
Not sure what caused Heretic to be unsuccessful in your case, but for me, the system is working fine. I can analyze any image input now with 0 refusals. Here is the command for llama-server that I am using and have confirmed works:
llama-server --no-mmap -ngl 999 --jinja -c 262144 --host 0.0.0.0 --port 5678 -fa 1 --model /AI/models/Qwen3.5-35B-A3B-heretic.Q8_0.gguf --mmproj /AI/models/Qwen3.5-35B-A3B-mmproj-F32.gguf
During frame tagging, it did not tag frames correctly, something an older model had handled fine.
I feel your pain...