Cannot load the model in vLLM 0.17.1 (latest stable)

#1
by Lukamodeo - opened

I tried to load this model in vLLM 0.17.1 (latest stable) and got this error:
Tokenizer class TokenizersBackend does not exist or is not currently imported.

The original, non-quantized Qwen/Qwen3.5-27B has this in its tokenizer_config.json:
"tokenizer_class": "Qwen2Tokenizer"

This Intel/Qwen3.5-27B-int4-AutoRound has this in its tokenizer_config.json:
"tokenizer_class": "TokenizersBackend"

But vLLM 0.17.1 requires transformers >= 4.56.0, < 5, and the new TokenizersBackend class isn't handled there.

Intel org

Please ignore vLLM's requirement and upgrade transformers to the latest version.

Thanks, but for now I tried a simpler solution.
I cloned the repo locally and replaced "TokenizersBackend" with "Qwen2Tokenizer" in tokenizer_config.json... and vLLM 0.17.1 loaded the model without errors.

Hey, did you try the model? I'm planning to install it for my organization. How's the performance?

Hello
For now I've only tested raw performance (now with vLLM 0.19.0, plus replacing "TokenizersBackend" with "Qwen2Tokenizer" in tokenizer_config.json), and I get nearly 100 tokens/s on an L40S.

Very nice. What about the model quality, does it satisfy you? What do you use it for, and how does it perform on those tasks? Thanks in advance.

Sorry, but I haven't tested its quality yet, since I use Intel/Qwen3.5-27B-int4-AutoRound for real production tasks (in the Italian legal document domain).
My initial experiments with the 9B were for a very fast RAG PoC (work in progress...).
