Re-quantization required for upcoming llama.cpp version
#2
by hell0ks - opened
Hello, thanks for the amazing work!
It seems the latest Qwen3-VL PR for llama.cpp changed the GGUF format (likely to improve visual accuracy). This change will probably be merged soon, at which point this quant will stop working.
Could you redo the GGUF conversion with this PR? Thanks again.
Thanks for the great work!
The PR has now been merged upstream in llama.cpp, and this quant fails with `llama.cpp/tools/mtmd/clip.cpp:859: GGML_ASSERT(model.patch_bias != nullptr) failed`. So it does look like a re-quantization is needed.
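For reference, re-generating the quant after the upstream change would look roughly like the following. This is a sketch, not the repo owner's actual pipeline: the model path, output names, and quant type are placeholder assumptions; only `convert_hf_to_gguf.py`, the `--mmproj` flag, and `llama-quantize` are real llama.cpp tooling. The key point is that the vision projector (mmproj) GGUF must be re-converted with the post-PR code so the new `patch_bias` tensor the assert checks for is present.

```shell
# Get the current llama.cpp with the merged Qwen3-VL PR.
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
pip install -r requirements.txt

# Re-convert both files; paths and names below are placeholders.
# Text model weights:
python convert_hf_to_gguf.py /path/to/Qwen3-VL-model --outfile model-f16.gguf
# Vision projector (this is the file the clip.cpp assert rejects if it
# was produced with the old conversion code):
python convert_hf_to_gguf.py /path/to/Qwen3-VL-model --mmproj --outfile mmproj-f16.gguf

# Quantize the text model; the mmproj file is typically left at f16.
cmake -B build && cmake --build build --target llama-quantize
./build/bin/llama-quantize model-f16.gguf model-Q4_K_M.gguf Q4_K_M
```

Only the text GGUF needs the size-reducing quantization pass; the mmproj file is small and is usually shipped at f16.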
Also, is there any chance of an abliterated thinking version of Qwen3-VL-235B?
Thank you.
hell0ks changed discussion status to closed