vLLM support

#15
by deece - opened

Running the nightly vLLM docker images reports:
ValueError: GGUF model with architecture qwen35moe is not supported yet.

Running the nightly wheel gives the same error.

Running the same vLLM build with Qwen/Qwen3.5... (safetensors) does work, so it looks like some work is needed before the provided GGUFs will work with vLLM.

Maybe it's the same problem as in https://huggingface.co/unsloth/Qwen3.5-27B-GGUF/discussions/12 ?

"The current Qwen3.5-27B-Q3_K_M.gguf uses qwen35 as the architecture name, but vLLM and Transformers expect qwen3_5 (matching the native model config.json). This mismatch causes a RuntimeError during loading."
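The name the loader complains about lives in the GGUF header as the `general.architecture` metadata key. As a quick way to see which name a given file actually carries, here is a minimal stdlib sketch that parses just enough of the header to read that key (it assumes the standard GGUF v3 layout and that `general.architecture` appears before any non-string key, which is the usual case; the helper name is my own):

```python
import io
import struct

GGUF_MAGIC = b"GGUF"
GGUF_TYPE_STRING = 8  # GGUF metadata value type for strings

def _read_string(f):
    # GGUF strings are a little-endian uint64 length followed by UTF-8 bytes
    (n,) = struct.unpack("<Q", f.read(8))
    return f.read(n).decode("utf-8")

def gguf_architecture(f):
    """Return the value of 'general.architecture' from a GGUF stream."""
    if f.read(4) != GGUF_MAGIC:
        raise ValueError("not a GGUF file")
    version, n_tensors, n_kv = struct.unpack("<IQQ", f.read(20))
    for _ in range(n_kv):
        key = _read_string(f)
        (vtype,) = struct.unpack("<I", f.read(4))
        if vtype != GGUF_TYPE_STRING:
            # this sketch only decodes string values; the arch key is
            # normally the very first entry, so we rarely get this far
            raise ValueError(f"stopping at non-string key {key!r}")
        value = _read_string(f)
        if key == "general.architecture":
            return value
    raise ValueError("general.architecture not found")

# Build a tiny in-memory GGUF header for demonstration (no real file needed)
buf = io.BytesIO()
buf.write(GGUF_MAGIC)
buf.write(struct.pack("<IQQ", 3, 0, 1))  # version 3, 0 tensors, 1 KV pair
key, val = b"general.architecture", b"qwen35moe"
buf.write(struct.pack("<Q", len(key)) + key)
buf.write(struct.pack("<I", GGUF_TYPE_STRING))
buf.write(struct.pack("<Q", len(val)) + val)
buf.seek(0)

print(gguf_architecture(buf))  # -> qwen35moe
```

If the string printed for your file differs from what vLLM/Transformers expect (e.g. `qwen35` vs. `qwen3_5`), that is exactly the mismatch described in the quote above.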

Working great in vLLM nightly for me.

I'd recommend that, or use llama.cpp; it works great in that for me too. I'm on a 4060 Ti with 64 GB of RAM at half offload, running 10 tok/s.

What did you do? I'm still unable to get it working. Can you list some exact instructions?

If you can, I would really appreciate instructions as well on using the GGUF in vLLM. Thank you!

All I do is run ./llama-server -m Qwen3.5-35B-A3B-Q3_K_M.gguf --host 0.0.0.0 --mmproj mmproj-BF16.gguf --n-gpu-layers 5

Use the llama-server binary from llama.cpp's releases, or just build it yourself: https://github.com/ggml-org/llama.cpp/releases/tag/b8230
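Once llama-server is up with the command above, it exposes an OpenAI-compatible HTTP API (by default on port 8080, unless overridden with --port). A minimal stdlib sketch of the request shape, assuming the default port (the actual network call is commented out so it only runs against a live server):

```python
import json
import urllib.request

# Chat request body in the OpenAI-compatible shape llama-server accepts;
# the "model" field is informational for a single-model llama-server.
payload = {
    "model": "Qwen3.5-35B-A3B-Q3_K_M.gguf",
    "messages": [{"role": "user", "content": "Say hello in one word."}],
    "max_tokens": 16,
}

req = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",  # assumed default port
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Uncomment once the server is running:
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["choices"][0]["message"]["content"])
```

Any OpenAI-compatible client (the official `openai` Python package pointed at http://localhost:8080/v1, curl, etc.) works the same way.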

Can you please share the steps for vLLM? I still can't find GGUF support for Qwen3.5!

I have encountered the same problem, llama.cpp works fine with Qwen3.5-35B-A3B GGUF, but vllm doesn't: "ValueError: GGUF model with architecture qwen35moe is not supported yet."

vllm version: 0.17.2rc1.dev108+g4426447bb.d20260319
hardware: strix halo
