Finetune a 32B Qwen3-VL Instruct MoE instead of a Thinking one
#3
by HAV0X1014 - opened
I like that progress is being made with the Qwen3-VL models, but my use cases don't involve thinking or reasoning in any capacity. Most other "RP" and "storywriting" models I've seen don't use thinking either, and my VRAM constraints make anything much larger than 32B too big/slow.
I'm just saying that I'd like to see a model that uses the intelligence of Qwen3 Instruct as a base, incorporates vision, and adds the emotional intelligence of a DavidAU finetune :)
Hey;
As it stands, VL tuning is difficult, to put it mildly.
Hopefully Transformers 5 and/or updates to Unsloth will make this better.
VLs (all sizes) have a lot of potential.