Model NotFoundError
[serving.py:200] Error with model error=ErrorInfo(message='The model `claude-haiku-4-5-20251001` does not exist.', type='NotFoundError', param='model', code=404)
Why does this error occur in vLLM?
Hello!
It looks like you're trying to load claude-haiku-4-5-20251001, which is not a valid local model in vLLM. The model name in a request has to match a model the server actually serves, e.g.:
Jackrong/Qwopus3.5-27B-v3
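A quick way to see which names your server accepts is the OpenAI-compatible /v1/models endpoint (a minimal check, assuming the server listens on localhost:8000):

# List the model names this vLLM server actually serves
curl http://localhost:8000/v1/models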
This is my shell script to start vLLM:
MODEL_PATH=/ckpt/Qwopus3.5-27B-v3
TOKENIZER_PATH=/ckpt/Qwen3.5-27B
SERVER_NAME=qwopus3.5-27b-v3
docker run --gpus '"device=2,5"' --ipc=host -p 8000:8000 \
  -v $MODEL_PATH:/model \
  -v $TOKENIZER_PATH:/tokenizer \
  vllm-qwopus \
  --model /model \
  --served-model-name qwopus3.5-27b-v3 \
  --host 0.0.0.0 --port 8000 \
  --tensor-parallel-size 2 \
  --dtype bfloat16 \
  --gpu-memory-utilization 0.9 \
  --max-model-len 262144 \
  --load-format safetensors \
  --disable-custom-all-reduce \
  --enable-prefix-caching \
  --chat-template-content-format string \
  --enable-auto-tool-choice \
  --tokenizer /tokenizer \
  --reasoning-parser qwen3 \
  --tool-call-parser qwen3_coder
Are there any mistakes?
Many thanks!
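Given --served-model-name qwopus3.5-27b-v3 above, every request has to pass exactly that name in the "model" field; the log at the top shows a client asking for claude-haiku-4-5-20251001, which this server does not expose. A minimal request against this configuration (a sketch, assuming the container is reachable on localhost:8000):

# Chat completion using the name set via --served-model-name
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwopus3.5-27b-v3",
    "messages": [{"role": "user", "content": "Hello"}]
  }'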