vLLM Support Query

#2
by hrithiksagar-bgen - opened

Dear team, is there vLLM support available for this model yet? @bluelike @littlebird13
Does the deployment guide at https://qwen.readthedocs.io/en/latest/deployment/vllm.html apply here, i.e. can I use the libraries mentioned in that link?

@dineshananthi In that link they only mention Qwen3-VL-4B and Qwen3-VL-30B, not the 235B Thinking or Instruct variants.
(screenshot of the documentation page attached)

@bluelike @littlebird13 Could you please help me with this?

https://github.com/QwenLM/Qwen3-VL?tab=readme-ov-file#online-serving:~:text=vllm.ai/nightly-,Online%20Serving,-You%20can%20start

I used the online serving code from that link:
Installation

pip install git+https://github.com/huggingface/transformers
pip install accelerate
pip install qwen-vl-utils==0.0.14
# pip install 'vllm>0.10.2' # If this does not work, use the nightly install below.
uv pip install -U vllm \
    --torch-backend=auto \
    --extra-index-url https://wheels.vllm.ai/nightly
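After installing, it is worth confirming that the resolved vLLM is actually newer than 0.10.2 before starting the server. A minimal sketch, assuming only the Python standard library; `version_newer_than` is a hypothetical helper, not part of vLLM:

```python
# Check that the installed vLLM is newer than 0.10.2, as the install
# comment above requires. version_newer_than is a hypothetical helper.
from importlib import metadata


def version_newer_than(installed: str, minimum: str) -> bool:
    """Compare dotted numeric versions, e.g. '0.11.0' vs '0.10.2'."""
    def parse(v: str):
        # Keep only leading numeric components ('0.11.0rc1' -> (0, 11, 0)).
        parts = []
        for piece in v.split("."):
            digits = ""
            for ch in piece:
                if ch.isdigit():
                    digits += ch
                else:
                    break
            if not digits:
                break
            parts.append(int(digits))
        return tuple(parts)

    return parse(installed) > parse(minimum)


try:
    v = metadata.version("vllm")
    print("vllm", v, "OK" if version_newer_than(v, "0.10.2") else "too old, use the nightly wheel")
except metadata.PackageNotFoundError:
    print("vllm is not installed")
```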

Online Serving
You can start either a vLLM or SGLang server to serve LLMs efficiently, and then access it using an OpenAI-style API.

vLLM server
# FP8 requires NVIDIA H100+ and CUDA 12+
python -m vllm.entrypoints.openai.api_server \
  --model Qwen/Qwen3-VL-235B-A22B-Instruct \
  --served-model-name Qwen/Qwen3-VL-235B-A22B-Instruct \
  --tensor-parallel-size 8 \
  --mm-encoder-tp-mode data \
  --enable-expert-parallel \
  --host 0.0.0.0 \
  --port 22002 \
  --dtype bfloat16 \
  --gpu-memory-utilization 0.70 \
  --quantization fp8 \
  --distributed-executor-backend mp
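Once the server is up, you can query it with an OpenAI-style chat completion request. A minimal sketch using only the standard library; the host/port come from the command above, the image URL is a placeholder, and the actual POST is commented out since it needs a running server:

```python
# Build an OpenAI-style chat request for the vLLM server started above.
# Assumes the server is listening on localhost:22002 as configured.
import json
import urllib.request

payload = {
    "model": "Qwen/Qwen3-VL-235B-A22B-Instruct",  # must match --served-model-name
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": "https://example.com/cat.jpg"}},
                {"type": "text", "text": "Describe this image."},
            ],
        }
    ],
    "max_tokens": 256,
}

req = urllib.request.Request(
    "http://localhost:22002/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# Uncomment once the server is running:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```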

This command worked, but when I try to run offline inference code to load the model on my own GPUs, that's when I hit issues. I believe vLLM has still not added offline support for this model; am I right?
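For reference, this is the kind of offline-inference code being attempted. A minimal sketch, assuming a vLLM build (e.g. the nightly wheel from the install step) that actually supports the 235B checkpoint; the parallel settings mirror the serve command above, and the import is done lazily so the file loads even without vLLM or GPUs present:

```python
# Hypothetical offline-inference sketch for the same checkpoint.
# Whether this works depends on the installed vLLM supporting
# Qwen3-VL-235B; the nightly wheel may be required.

def run_offline(prompts):
    # Imported lazily: requires vLLM and 8 suitable GPUs at call time.
    from vllm import LLM, SamplingParams

    llm = LLM(
        model="Qwen/Qwen3-VL-235B-A22B-Instruct",
        tensor_parallel_size=8,          # mirrors the serve command above
        gpu_memory_utilization=0.70,
        dtype="bfloat16",
    )
    params = SamplingParams(temperature=0.7, max_tokens=256)
    outputs = llm.generate(prompts, params)
    return [o.outputs[0].text for o in outputs]

# Usage (needs the GPUs and model weights available locally):
# print(run_offline(["Describe the Qwen3-VL model family."]))
```

If this raises an unsupported-architecture error at load time, the installed vLLM likely has not picked up the model yet.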

hrithiksagar-bgen changed discussion status to closed
