How to run a 4-bit model on vLLM?

#8
by selmee - opened

I'd appreciate it if you could provide a guideline for running the Qwen3.5-27B 4-bit model using vLLM.

I was able to run it using llama.cpp, with impressive results. Thanks. Looking forward to vLLM support.
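
In case it helps others who land here: below is a minimal sketch of loading a 4-bit quantized checkpoint in vLLM, assuming an AWQ (or GPTQ) export of the model is available on the Hub. The repo id in the snippet is a placeholder, not a confirmed release; vLLM's GGUF loading is experimental, so a native AWQ/GPTQ checkpoint is usually the safer path than reusing the llama.cpp file.

```python
# Minimal sketch: loading a 4-bit quantized checkpoint with vLLM.
# "Qwen/Qwen3.5-27B-AWQ" is a hypothetical repo id used for illustration.
from vllm import LLM, SamplingParams

llm = LLM(
    model="Qwen/Qwen3.5-27B-AWQ",  # placeholder: substitute a real 4-bit checkpoint
    quantization="awq",            # or "gptq" for a GPTQ checkpoint
    dtype="float16",
    max_model_len=8192,            # lower this if you hit out-of-memory errors
)

params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=256)
outputs = llm.generate(["Explain 4-bit quantization in one paragraph."], params)
print(outputs[0].outputs[0].text)
```

The same checkpoint can also be exposed over an OpenAI-compatible API with `vllm serve <model>`, which avoids writing any Python at all.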

selmee changed discussion status to closed
