Thank you for the well-functioning uncensored quantization of Qwen3.5-35B-A3B
I just want to share that I am very pleased with how well Qwen3.5-35B-A3B-heretic-GGUF is working in llama.cpp. I am using Qwen3.5-35B-A3B-heretic.Q8_0.gguf and am very happy with its performance so far. No refusals and no model "brain-damage" detected in my testing.
Why isn't it a Vision Model?
Could I, for example, add an mmproj-F32 file that would then work perfectly? Well, I'll test it.
Yes, you can add mmproj-F32 and it will process images just fine. I just did it, and it works flawlessly.
The one that works for me, is mmproj from https://huggingface.co/unsloth/Qwen3.5-35B-A3B-GGUF/tree/main
Okay, thanks, I'll test that one; mine didn't work ("Failed to load model").
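A "Failed to load model" error is often a truncated or corrupted download rather than an incompatible file. A quick sanity check (just a sketch; the helper name `check_gguf` is my own): every valid GGUF file begins with the 4-byte magic `GGUF`, so an HTML error page or partial download saved under a `.gguf` name is easy to catch.

```shell
# Sketch: check that a file at least starts with the GGUF magic bytes.
# Catches truncated downloads and HTML error pages saved as .gguf files.
check_gguf() {
  magic=$(head -c 4 "$1" 2>/dev/null)
  if [ "$magic" = "GGUF" ]; then
    echo "OK: $1 starts with the GGUF magic"
  else
    echo "BAD: $1 does not look like a GGUF file"
  fi
}
```

Usage: `check_gguf /AI/models/your-mmproj-F32.gguf`. If this reports BAD, re-download the file before digging into llama.cpp itself.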
Great, it works now!
Why doesn't mradermacher include that from the start, perhaps even with their own config...?
Unfortunately, the "heretic" was unsuccessful... i don't think it will happen that quickly.
Not sure what caused Heretic to be unsuccessful in your case, but for me, the system is working fine. I can analyze any image input now with 0 refusals. Here is the command for llama-server that I am using and have confirmed works:
llama-server --no-mmap -ngl 999 --jinja -c 262144 --host 0.0.0.0 --port 5678 -fa 1 --model /AI/models/Qwen3.5-35B-A3B-heretic.Q8_0.gguf --mmproj /AI/models/Qwen3.5-35B-A3B-mmproj-F32.gguf
During frame tagging, it did not tag frames correctly, something an older model had handled fine.
I feel your pain...