revision 94414ff broken?
#3
by raulalonsoctic - opened
I've been trying to run the latest weights on vLLM nightly and couldn't get them working properly. The first issue is the tokenizer:
(APIServer pid=1) ValueError: Tokenizer class TokenizersBackend does not exist or is not currently imported.
I then told vLLM to use the original tokenizer, but got a lot of warnings like these:
(Worker_TP2_EP2 pid=415) WARNING 02-22 11:45:38 [qwen3_5.py:507] Parameter layers.0.mlp.gate.weight not found in params_dict, skip loading
(Worker_TP3_EP3 pid=416) WARNING 02-22 11:45:38 [qwen3_5.py:507] Parameter layers.0.mlp.gate.weight not found in params_dict, skip loading
(Worker_TP1_EP1 pid=414) WARNING 02-22 11:45:40 [qwen3_5.py:507] Parameter layers.1.mlp.gate.weight not found in params_dict, skip loading
(Worker_TP2_EP2 pid=415) WARNING 02-22 11:45:40 [qwen3_5.py:507] Parameter layers.1.mlp.gate.weight not found in params_dict, skip loading
(Worker_TP0_EP0 pid=413) WARNING 02-22 11:45:40 [qwen3_5.py:507] Parameter layers.1.mlp.gate.weight not found in params_dict, skip loading
(Worker_TP3_EP3 pid=416) WARNING 02-22 11:45:40 [qwen3_5.py:507] Parameter layers.1.mlp.gate.weight not found in params_dict, skip loading
Reverting to revision f5c5cf8 for now.
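For anyone hitting the same flood of warnings, a small script can collapse them into one entry per skipped parameter, which makes it easier to see which weights were not loaded. This is only a sketch: the regex assumes the exact warning format quoted above, and `summarize_skipped` is a hypothetical helper, not part of vLLM.

```python
import re
from collections import defaultdict

# Assumes the warning format pasted above, e.g.
# "(Worker_TP2_EP2 pid=415) WARNING ... Parameter layers.0.mlp.gate.weight
#  not found in params_dict, skip loading"
WARNING_RE = re.compile(
    r"\((?P<worker>\S+) pid=\d+\).*?"
    r"Parameter (?P<param>\S+) not found in params_dict"
)


def summarize_skipped(log_text):
    """Map each skipped parameter name to the sorted list of workers that reported it."""
    skipped = defaultdict(set)
    for line in log_text.splitlines():
        match = WARNING_RE.search(line)
        if match:
            skipped[match.group("param")].add(match.group("worker"))
    return {param: sorted(workers) for param, workers in skipped.items()}
```

Calling `summarize_skipped(open("vllm.log").read())` (the file name is just an example) returns a dict keyed by parameter name, so repeated warnings from different tensor-parallel workers show up once.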
This isn't an issue with the quant; it seems vLLM hasn't implemented TokenizersBackend yet. It's an easy fix: I will upload an updated tokenizer_config.json that works.
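If you need to patch a local copy of the repo yourself, the change can be sketched as below. This assumes the error comes from the `tokenizer_class` field in tokenizer_config.json, and that `PreTrainedTokenizerFast` is an acceptable replacement value; that replacement is an assumption, not necessarily what the uploaded fix uses. `patch_tokenizer_class` is a hypothetical helper name.

```python
import json
from pathlib import Path


def patch_tokenizer_class(
    config_path,
    new_class="PreTrainedTokenizerFast",  # assumed replacement, not confirmed in this thread
    old_class="TokenizersBackend",
):
    """Rewrite tokenizer_class in a local tokenizer_config.json.

    Returns True if the file was changed, False if the field did not
    hold old_class (in which case the file is left untouched).
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    if config.get("tokenizer_class") != old_class:
        return False
    config["tokenizer_class"] = new_class
    path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
    return True
```

Run it against the tokenizer_config.json in your local download, for example `patch_tokenizer_class("path/to/model/tokenizer_config.json")`, then point vLLM at that directory. All other keys in the file are preserved.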
The config has been fixed in the latest revision.