Update the weight file of Qwopus-FP8 version

#3
by MrChen-hero - opened

While deploying the FP8 version with vLLM, I ran into two issues:

  1. The visual weights cannot be loaded.
  2. During deployment, the visual files could not be identified. This issue went away when I used the non-quantized version of v3.

I really hope you can release a revised version of the FP8 model. This matters a lot for our laboratory: the current non-quantized version consumes almost all of our resources, leaving no room for anything else.

Hi, thanks for the feedback!

Currently, FP8 multimodal support for Qwen3.5 is still somewhat unstable, especially with vLLM. This is mainly because vLLM’s FP8 pipeline is not yet fully mature for multimodal components (e.g., the vision encoder / projector), which can lead to issues such as visual weights failing to load or crashes on image inputs.
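In case it helps narrow things down, here is a rough diagnostic sketch (not an official tool) that reads the JSON header of a `.safetensors` shard and counts tensor dtypes, split into vision and non-vision tensors. It relies only on the documented safetensors layout (an 8-byte little-endian header length followed by a JSON header). The `"visual"` name marker is an assumption about how the vision tensors are named in this checkpoint, so adjust it to whatever your weight names actually contain.

```python
import json
import struct
from collections import Counter


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file as a dict."""
    with open(path, "rb") as f:
        # First 8 bytes: unsigned little-endian 64-bit header length.
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len))


def dtype_counts_by_group(header, marker="visual"):
    """Count tensor dtypes, split by whether the tensor name contains `marker`.

    `marker` is an assumed naming convention for vision-tower tensors,
    not something confirmed for this particular checkpoint.
    """
    counts = {"vision": Counter(), "other": Counter()}
    for name, info in header.items():
        if name == "__metadata__":  # optional free-form metadata entry
            continue
        group = "vision" if marker in name else "other"
        counts[group][info["dtype"]] += 1
    return counts
```

Calling `dtype_counts_by_group(read_safetensors_header(path))` on each shard of the checkpoint shows whether any vision tensors are present at all and which dtype they were saved in (for example `BF16` versus `F8_E4M3`). That alone does not prove where the loading failure comes from, but it is a quick way to see what the FP8 files contain.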

I’ll keep an eye on this and update if there’s a more stable solution.
