KeyError: 'mistral4' running vllm-ms4

#9
by TimKoornstra - opened

Running with image mistralllm/vllm-ms4:latest gives me the following stack trace:

(APIServer pid=50) Traceback (most recent call last):
(APIServer pid=50)   File "/usr/local/bin/vllm", line 6, in <module>
(APIServer pid=50)     sys.exit(main())
(APIServer pid=50)              ^^^^^^
(APIServer pid=50)   File "/workspace/vllm/vllm/entrypoints/cli/main.py", line 75, in main
(APIServer pid=50)     args.dispatch_function(args)
(APIServer pid=50)   File "/workspace/vllm/vllm/entrypoints/cli/serve.py", line 118, in cmd
(APIServer pid=50)     uvloop.run(run_server(args))
(APIServer pid=50)   File "/usr/local/lib/python3.12/dist-packages/uvloop/__init__.py", line 96, in run
(APIServer pid=50)     return __asyncio.run(
(APIServer pid=50)            ^^^^^^^^^^^^^^
(APIServer pid=50)   File "/usr/lib/python3.12/asyncio/runners.py", line 195, in run
(APIServer pid=50)     return runner.run(main)
(APIServer pid=50)            ^^^^^^^^^^^^^^^^
(APIServer pid=50)   File "/usr/lib/python3.12/asyncio/runners.py", line 118, in run
(APIServer pid=50)     return self._loop.run_until_complete(task)
(APIServer pid=50)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=50)   File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
(APIServer pid=50)   File "/usr/local/lib/python3.12/dist-packages/uvloop/__init__.py", line 48, in wrapper
(APIServer pid=50)     return await main
(APIServer pid=50)            ^^^^^^^^^^
(APIServer pid=50)   File "/workspace/vllm/vllm/entrypoints/openai/api_server.py", line 653, in run_server
(APIServer pid=50)     await run_server_worker(listen_address, sock, args, **uvicorn_kwargs)
(APIServer pid=50)   File "/workspace/vllm/vllm/entrypoints/openai/api_server.py", line 667, in run_server_worker
(APIServer pid=50)     async with build_async_engine_client(
(APIServer pid=50)                ^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=50)   File "/usr/lib/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=50)     return await anext(self.gen)
(APIServer pid=50)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=50)   File "/workspace/vllm/vllm/entrypoints/openai/api_server.py", line 101, in build_async_engine_client
(APIServer pid=50)     async with build_async_engine_client_from_engine_args(
(APIServer pid=50)                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=50)   File "/usr/lib/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=50)     return await anext(self.gen)
(APIServer pid=50)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=50)   File "/workspace/vllm/vllm/entrypoints/openai/api_server.py", line 127, in build_async_engine_client_from_engine_args
(APIServer pid=50)     vllm_config = engine_args.create_engine_config(usage_context=usage_context)
(APIServer pid=50)                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=50)   File "/workspace/vllm/vllm/engine/arg_utils.py", line 1495, in create_engine_config
(APIServer pid=50)     model_config = self.create_model_config()
(APIServer pid=50)                    ^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=50)   File "/workspace/vllm/vllm/engine/arg_utils.py", line 1347, in create_model_config
(APIServer pid=50)     return ModelConfig(
(APIServer pid=50)            ^^^^^^^^^^^^
(APIServer pid=50)   File "/usr/local/lib/python3.12/dist-packages/pydantic/_internal/_dataclasses.py", line 121, in __init__
(APIServer pid=50)     s.__pydantic_validator__.validate_python(ArgsKwargs(args, kwargs), self_instance=s)
(APIServer pid=50)   File "/workspace/vllm/vllm/config/model.py", line 476, in __post_init__
(APIServer pid=50)     hf_config = get_config(
(APIServer pid=50)                 ^^^^^^^^^^^
(APIServer pid=50)   File "/workspace/vllm/vllm/transformers_utils/config.py", line 643, in get_config
(APIServer pid=50)     config_dict, config = config_parser.parse(
(APIServer pid=50)                           ^^^^^^^^^^^^^^^^^^^^
(APIServer pid=50)   File "/workspace/vllm/vllm/transformers_utils/config.py", line 189, in parse
(APIServer pid=50)     config = AutoConfig.from_pretrained(
(APIServer pid=50)              ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=50)   File "/usr/local/lib/python3.12/dist-packages/transformers/models/auto/configuration_auto.py", line 1484, in from_pretrained
(APIServer pid=50)     return config_class.from_dict(config_dict, **unused_kwargs)
(APIServer pid=50)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=50)   File "/usr/local/lib/python3.12/dist-packages/transformers/configuration_utils.py", line 757, in from_dict
(APIServer pid=50)     config = cls(**config_dict)
(APIServer pid=50)              ^^^^^^^^^^^^^^^^^^
(APIServer pid=50)   File "/usr/local/lib/python3.12/dist-packages/huggingface_hub/dataclasses.py", line 279, in init_with_validate
(APIServer pid=50)     initial_init(self, *args, **kwargs)  # type: ignore [call-arg]
(APIServer pid=50)     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=50)   File "/usr/local/lib/python3.12/dist-packages/huggingface_hub/dataclasses.py", line 194, in __init__
(APIServer pid=50)     self.__post_init__(**additional_kwargs)
(APIServer pid=50)   File "/usr/local/lib/python3.12/dist-packages/transformers/models/mistral3/configuration_mistral3.py", line 84, in __post_init__
(APIServer pid=50)     self.text_config = CONFIG_MAPPING[self.text_config["model_type"]](**self.text_config)
(APIServer pid=50)                        ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=50)   File "/usr/local/lib/python3.12/dist-packages/transformers/models/auto/configuration_auto.py", line 1175, in __getitem__
(APIServer pid=50)     raise KeyError(key)
(APIServer pid=50) KeyError: 'mistral4'

Seems like a configuration issue?
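For context on what the last frame is doing: transformers keeps a registry that maps `model_type` strings from `config.json` to config classes, and `CONFIG_MAPPING[...]` raises exactly this `KeyError` when the installed transformers build has no entry for that `model_type`. A minimal sketch of the mechanism (the registry below is illustrative, not the real `CONFIG_MAPPING`):

```python
# Illustrative registry; the real one is
# transformers.models.auto.configuration_auto.CONFIG_MAPPING.
CONFIG_REGISTRY = {
    "mistral": "MistralConfig",
    "mistral3": "Mistral3Config",
    # a transformers build that predates the model has no "mistral4" entry
}

def resolve_config_class(config_dict):
    # AutoConfig dispatches on the model_type declared in config.json;
    # an unknown model_type surfaces as a bare KeyError.
    model_type = config_dict["model_type"]
    return CONFIG_REGISTRY[model_type]

try:
    resolve_config_class({"model_type": "mistral4"})
except KeyError as e:
    print(f"KeyError: {e}")  # KeyError: 'mistral4', matching the traceback
```

So the usual cause is a transformers install that is too old to know the new model type, not a problem with vLLM itself.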

TimKoornstra changed discussion title from KeyError: 'mistral4' to KeyError: 'mistral4' running vllm-ms4
Mistral AI_ org

Checking!

Mistral AI_ org

Hmm I cannot reproduce this - everything works as expected for me with this command:

vllm serve mistralai/Mistral-Small-4-119B-2603 --max-model-len 262144 --tensor-parallel-size 2 --attention-backend FLASH_ATTN_MLA \
  --tool-call-parser mistral --enable-auto-tool-choice --reasoning-parser mistral --max_num_batched_tokens 16384 --max_num_seqs 128 \
  --gpu_memory_utilization 0.8

Could it be that you overrode the default transformers version of this Docker image?

Mistral AI_ org

The transformers version should be 5.3.0.dev.
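A quick way to check this inside the container (the registry import path is the same one that appears in the traceback above):

```python
# Print the transformers version in use, and whether it registers the
# "mistral4" model_type; False on the second line reproduces the KeyError.
import transformers
from transformers.models.auto.configuration_auto import CONFIG_MAPPING

print(transformers.__version__)
print("mistral4" in CONFIG_MAPPING)
```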

I redownloaded the model and now it's working fine, thanks!

TimKoornstra changed discussion status to closed