Request format for transformers serve for Qwen3.5

#7
by Crockrocks12 - opened

I am serving the Qwen 3.5 using transformers serve as mentioned in the document
Hugging Face Transformers
Hugging Face Transformers contains a lightweight server which can be used for quick testing and moderate load deployment. The latest transformers is required for Qwen3.5:

pip install "transformers[serving] @ git+https://github.com/huggingface/transformers.git@main"

See its documentation for more details. Please also make sure torchvision and pillow are installed.

Then, run transformers serve to launch a server with API endpoints at http://localhost:8000/v1; it will place the model on accelerators if available:

transformers serve --force-model Qwen/Qwen3.5-0.8B --port 8000 --continuous-batching

The model is getting served but when sending the request using Chat Completions API using the text input only method we are getting
openai.UnprocessableEntityError: Error code: 422 - {'detail': [{'type': 'missing', 'loc': ['query', 'request'], 'msg': 'Field required', 'input': None}]}

Any help would be appreciated .

Hi, I'm observing the same.

In my case partially solved by downgrading transformers to 5.2.0:
pip install "transformers[serving]==5.2.0"

"partially" - because I'm getting openai.UnprocessableEntityError: Error code: 422 - {'detail': "Unexpected keys in the request: {'top_k'}"}, had to comment out the top_k field in the request.

Sign up or log in to comment