Re-quantization is required for future llama.cpp versions

#2
by hell0ks - opened

Hello, thanks for the amazing work!

It seems the latest Qwen3-VL PR for llama.cpp changed the GGUF format (likely to improve visual accuracy). I think this change will be merged soon, and once it is, this quant will no longer work.

Could you redo the GGUF conversion with this PR? Thanks again.

I will test with tr-qwen3-vl-6-b7106-495c611 to see if an update is needed.

Thanks for the great work!

The PR is now merged into llama.cpp upstream, and this quant no longer works (`llama.cpp/tools/mtmd/clip.cpp:859: GGML_ASSERT(model.patch_bias != nullptr) failed`). So it does look like we need a re-quantization.
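For anyone wanting to regenerate the files themselves, the re-conversion would look roughly like this. This is a sketch with placeholder paths and output names, not the uploader's actual pipeline; the `--mmproj` flag regenerates the vision projector GGUF that `clip.cpp` loads (the file the failing assert comes from), and exact flags may differ by llama.cpp version:

```shell
# Get an up-to-date llama.cpp checkout that includes the merged Qwen3-VL PR
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
pip install -r requirements.txt

# Convert the HF checkpoint to GGUF (paths/names are placeholders).
# The language model and the vision projector are converted separately.
python convert_hf_to_gguf.py /path/to/Qwen3-VL-checkpoint \
    --outfile qwen3-vl-f16.gguf
python convert_hf_to_gguf.py /path/to/Qwen3-VL-checkpoint \
    --mmproj --outfile mmproj-qwen3-vl-f16.gguf

# Quantize the language model; the mmproj file is usually kept at f16.
./build/bin/llama-quantize qwen3-vl-f16.gguf qwen3-vl-Q4_K_M.gguf Q4_K_M
```

Both the re-converted mmproj and the quantized model need to be regenerated, since the format change is in the vision side.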

Also, is there any chance for an abliterated thinking version of qwen3-vl-235b?

It has been updated to the official llama.cpp-b6907.

Thank you.

hell0ks changed discussion status to closed
