Update the weight files of the Qwopus-FP8 version
#3
by MrChen-hero - opened
When deploying the FP8 version with vLLM, I ran into some issues:
- The visual weights cannot be loaded.
- During deployment, the visual files could not be identified. This issue does not occur with the non-quantized version of v3.
I really hope a revised version of the FP8 model can be released. This matters a lot for our laboratory's resources: the current non-quantized version consumes almost all of them, leaving no room for anything else.
Hi, thanks for the feedback!
Currently, FP8 multimodal support in Qwen3.5 is still a bit unstable, especially with vLLM. This is mainly because vLLM’s FP8 pipeline is not fully mature for multimodal components (e.g., the vision encoder and projector), which can lead to issues such as visual weights failing to load or crashes on image inputs.
I’ll keep an eye on this and update if there’s a more stable solution.
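In the meantime, one way to narrow down the "visual weights cannot be loaded" symptom is to check whether the FP8 checkpoint contains any visual tensors at all, and in which dtype they are stored. The snippet below is only a rough sketch: it reads the JSON header of each `.safetensors` shard using the standard library alone. The `"visual"` name marker, the helper names, and the example directory are assumptions for illustration, not something confirmed for this repository.

```python
import json
import struct
from pathlib import Path


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file as a dict."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the header length (little-endian uint64).
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len).decode("utf-8"))


def find_tensors(path, marker="visual"):
    """Map tensor names containing `marker` to their stored dtype.

    The default marker "visual" is an assumption about how the
    vision tensors are named; adjust it to the actual checkpoint.
    """
    header = read_safetensors_header(path)
    return {
        name: info["dtype"]
        for name, info in header.items()
        if name != "__metadata__" and marker in name
    }


def scan_checkpoint(directory, marker="visual"):
    """Collect matching tensors across every shard in `directory`."""
    found = {}
    for shard in sorted(Path(directory).glob("*.safetensors")):
        found.update(find_tensors(shard, marker))
    return found


# Hypothetical usage (the path is a placeholder):
#   tensors = scan_checkpoint("./my-fp8-checkpoint")
#   print(len(tensors), sorted(set(tensors.values())))
```

If the scan returns nothing, the visual weights are probably missing from the export or stored under different names. If it returns tensors, the dtype values show whether the vision part was quantized along with the language model.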