Update installation instructions
Couldn't get it to work when installing vllm before vllm-omni; I kept running into this:
ImportError: /mnt/storage/workspaces/voxtral-tts/.venv/lib/python3.10/site-packages/vllm/_C.abi3.so: undefined symbol: _ZN3c1013MessageLoggerC1EPKciib
Everything works if vllm-omni is installed before vllm (at least with Python 3.10).
By the way, incredible model, thanks for offering it to the community <3
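For what it's worth, the mangled symbol in the error points at `c10`, which is torch's core C++ library, so the missing symbol suggests vllm's compiled extension and the installed torch disagree on the ABI. Here's a rough sketch (not a full demangler, just the Itanium length-prefixed name parsing) that pulls the namespace and class out of the symbol:

```python
import re

def rough_demangle(symbol: str) -> list[str]:
    """Extract the length-prefixed name components from an Itanium-mangled nested name."""
    m = re.match(r"_ZN(.*)", symbol)
    body = m.group(1) if m else symbol
    names = []
    i = 0
    while i < len(body) and body[i].isdigit():
        j = i
        while j < len(body) and body[j].isdigit():
            j += 1
        length = int(body[i:j])
        names.append(body[j:j + length])
        i = j + length
    return names

print(rough_demangle("_ZN3c1013MessageLoggerC1EPKciib"))
# → ['c10', 'MessageLogger']
```

So the undefined symbol is a `c10::MessageLogger` constructor that vllm's `_C.abi3.so` expects but the installed torch no longer exports.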
Updated them with the latest version of vllm-omni; can you try again? :-)
Hey @patrickvonplaten, I don't understand how the latest PyPI version could be ahead of vllm-omni's main branch on GitHub?
Anyway, I'll try again a bit later today and let you know, of course.
Nope, still the same issue. Here are all the commands, run in a row:
/m/s/w/Voxtral-4B-TTS-2603 ❯❯❯ uv venv .venv --python=3.10
Using CPython 3.10.19
Creating virtual environment at: .venv
Activate with: source .venv/bin/activate
/m/s/w/Voxtral-4B-TTS-2603 ❯❯❯ source .venv/bin/activate.fish
(.venv) /m/s/w/Voxtral-4B-TTS-2603 ❯❯❯ python --version
Python 3.10.19
(.venv) /m/s/w/Voxtral-4B-TTS-2603 ❯❯❯ uv pip install -U vllm
Resolved 175 packages in 367ms
Prepared 175 packages in 6ms
Installed 175 packages in 249ms
+ aiohappyeyeballs==2.6.1
+ aiohttp==3.13.4
+ aiosignal==1.4.0
+ annotated-doc==0.0.4
+ annotated-types==0.7.0
+ anthropic==0.86.0
+ anyio==4.13.0
+ apache-tvm-ffi==0.1.9
+ astor==0.8.1
+ async-timeout==5.0.1
+ attrs==26.1.0
+ blake3==1.0.8
+ cachetools==7.0.5
+ cbor2==5.9.0
+ certifi==2026.2.25
+ cffi==2.0.0
+ charset-normalizer==3.4.6
+ click==8.3.1
+ cloudpickle==3.1.2
+ compressed-tensors==0.13.0
+ cryptography==46.0.6
+ cuda-bindings==12.9.4
+ cuda-pathfinder==1.5.0
+ cuda-python==12.9.4
+ depyf==0.20.0
+ dill==0.4.1
+ diskcache==5.6.3
+ distro==1.9.0
+ dnspython==2.8.0
+ docstring-parser==0.17.0
+ einops==0.8.2
+ email-validator==2.3.0
+ exceptiongroup==1.3.1
+ fastapi==0.135.2
+ fastapi-cli==0.0.24
+ fastapi-cloud-cli==0.15.1
+ fastar==0.9.0
+ filelock==3.25.2
+ flashinfer-python==0.6.6
+ frozenlist==1.8.0
+ fsspec==2026.3.0
+ gguf==0.18.0
+ googleapis-common-protos==1.73.1
+ grpcio==1.80.0
+ h11==0.16.0
+ hf-xet==1.4.2
+ httpcore==1.0.9
+ httptools==0.7.1
+ httpx==0.28.1
+ httpx-sse==0.4.3
+ huggingface-hub==0.36.2
+ idna==3.11
+ ijson==3.5.0
+ importlib-metadata==8.7.1
+ interegular==0.3.3
+ jinja2==3.1.6
+ jiter==0.13.0
+ jmespath==1.1.0
+ jsonschema==4.26.0
+ jsonschema-specifications==2025.9.1
+ lark==1.2.2
+ llguidance==1.3.0
+ llvmlite==0.44.0
+ lm-format-enforcer==0.11.3
+ loguru==0.7.3
+ markdown-it-py==4.0.0
+ markupsafe==3.0.3
+ mcp==1.26.0
+ mdurl==0.1.2
+ mistral-common==1.10.0
+ model-hosting-container-standards==0.1.14
+ mpmath==1.3.0
+ msgspec==0.20.0
+ multidict==6.7.1
+ networkx==3.4.2
+ ninja==1.13.0
+ numba==0.61.2
+ numpy==2.2.6
+ nvidia-cublas-cu12==12.8.4.1
+ nvidia-cuda-cupti-cu12==12.8.90
+ nvidia-cuda-nvrtc-cu12==12.8.93
+ nvidia-cuda-runtime-cu12==12.8.90
+ nvidia-cudnn-cu12==9.10.2.21
+ nvidia-cudnn-frontend==1.18.0
+ nvidia-cufft-cu12==11.3.3.83
+ nvidia-cufile-cu12==1.13.1.3
+ nvidia-curand-cu12==10.3.9.90
+ nvidia-cusolver-cu12==11.7.3.90
+ nvidia-cusparse-cu12==12.5.8.93
+ nvidia-cusparselt-cu12==0.7.1
+ nvidia-cutlass-dsl==4.4.2
+ nvidia-cutlass-dsl-libs-base==4.4.2
+ nvidia-ml-py==13.595.45
+ nvidia-nccl-cu12==2.27.5
+ nvidia-nvjitlink-cu12==12.8.93
+ nvidia-nvshmem-cu12==3.4.5
+ nvidia-nvtx-cu12==12.8.90
+ openai==2.24.0
+ openai-harmony==0.0.8
+ opencv-python-headless==4.13.0.92
+ opentelemetry-api==1.40.0
+ opentelemetry-exporter-otlp==1.40.0
+ opentelemetry-exporter-otlp-proto-common==1.40.0
+ opentelemetry-exporter-otlp-proto-grpc==1.40.0
+ opentelemetry-exporter-otlp-proto-http==1.40.0
+ opentelemetry-proto==1.40.0
+ opentelemetry-sdk==1.40.0
+ opentelemetry-semantic-conventions==0.61b0
+ opentelemetry-semantic-conventions-ai==0.5.1
+ outlines-core==0.2.11
+ packaging==26.0
+ partial-json-parser==0.2.1.1.post7
+ pillow==12.1.1
+ prometheus-client==0.24.1
+ prometheus-fastapi-instrumentator==7.1.0
+ propcache==0.4.1
+ protobuf==6.33.6
+ psutil==7.2.2
+ py-cpuinfo==9.0.0
+ pybase64==1.4.3
+ pycountry==26.2.16
+ pycparser==3.0
+ pydantic==2.12.5
+ pydantic-core==2.41.5
+ pydantic-extra-types==2.11.1
+ pydantic-settings==2.13.1
+ pygments==2.20.0
+ pyjwt==2.12.1
+ python-dotenv==1.2.2
+ python-json-logger==4.1.0
+ python-multipart==0.0.22
+ pyyaml==6.0.3
+ pyzmq==27.1.0
+ quack-kernels==0.3.7
+ referencing==0.37.0
+ regex==2026.3.32
+ requests==2.33.1
+ rich==14.3.3
+ rich-toolkit==0.19.7
+ rignore==0.7.6
+ rpds-py==0.30.0
+ safetensors==0.7.0
+ sentencepiece==0.2.1
+ sentry-sdk==2.57.0
+ setproctitle==1.3.7
+ setuptools==82.0.1
+ shellingham==1.5.4
+ sniffio==1.3.1
+ sse-starlette==3.3.4
+ starlette==0.52.1
+ supervisor==4.3.0
+ sympy==1.14.0
+ tabulate==0.10.0
+ tiktoken==0.12.0
+ tokenizers==0.22.2
+ tomli==2.4.1
+ torch==2.10.0
+ torch-c-dlpack-ext==0.1.5
+ torchaudio==2.10.0
+ torchvision==0.25.0
+ tqdm==4.67.3
+ transformers==4.57.6
+ triton==3.6.0
+ typer==0.24.1
+ typing-extensions==4.15.0
+ typing-inspection==0.4.2
+ urllib3==2.6.3
+ uvicorn==0.42.0
+ uvloop==0.22.1
+ vllm==0.18.1
+ watchfiles==1.1.1
+ websockets==16.0
+ xgrammar==0.1.33
+ yarl==1.23.0
+ zipp==3.23.0
(.venv) /m/s/w/Voxtral-4B-TTS-2603 ❯❯❯ uv pip install vllm-omni --upgrade
Resolved 140 packages in 257ms
Prepared 89 packages in 3ms
Uninstalled 13 packages in 159ms
Installed 89 packages in 253ms
+ accelerate==1.12.0
+ aenum==3.1.16
+ aiofiles==24.1.0
+ antlr4-python3-runtime==4.9.3
+ audioread==3.1.0
+ brotli==1.2.0
+ cache-dit==1.3.0
+ coloredlogs==15.0.1
- cuda-bindings==12.9.4
+ cuda-bindings==13.2.0
+ cuda-toolkit==13.0.2
+ decorator==5.2.1
+ diffusers==0.37.1
+ einx==0.4.2
+ ema-pytorch==0.7.9
+ fa3-fwd==0.0.2
+ ffmpy==1.0.0
+ fire==0.7.1
+ flatbuffers==25.12.19
+ frozendict==2.4.7
+ gradio==5.50.0
+ gradio-client==1.14.0
+ groovy==0.1.2
- huggingface-hub==0.36.2
+ huggingface-hub==1.8.0
+ humanfriendly==10.0
+ imageio==2.37.3
+ imageio-ffmpeg==0.6.0
- importlib-metadata==8.7.1
+ importlib-metadata==9.0.0
+ janus==2.0.0
+ joblib==1.5.3
+ lazy-loader==0.5
+ librosa==0.11.0
- llvmlite==0.44.0
+ llvmlite==0.46.0
+ more-itertools==10.8.0
+ msgpack==1.1.2
- numba==0.61.2
+ numba==0.64.0
+ nvidia-cublas==13.1.0.3
+ nvidia-cuda-cupti==13.0.85
+ nvidia-cuda-nvrtc==13.0.88
+ nvidia-cuda-runtime==13.0.96
+ nvidia-cudnn-cu13==9.19.0.56
+ nvidia-cufft==12.0.0.61
+ nvidia-cufile==1.15.1.6
+ nvidia-curand==10.4.0.35
+ nvidia-cusolver==12.0.4.66
+ nvidia-cusparse==12.6.3.3
+ nvidia-cusparselt-cu13==0.8.0
+ nvidia-nccl-cu13==2.28.9
+ nvidia-nvjitlink==13.0.88
+ nvidia-nvshmem-cu13==3.4.5
+ nvidia-nvtx==13.0.85
+ omegaconf==2.3.0
+ onnxruntime==1.23.2
+ openai-whisper==20250625
+ orjson==3.11.7
+ pandas==2.3.3
- pillow==12.1.1
+ pillow==11.3.0
+ platformdirs==4.9.4
+ pooch==1.9.0
+ prettytable==3.17.0
- protobuf==6.33.6
+ protobuf==7.34.1
- pydantic==2.12.5
+ pydantic==2.12.3
- pydantic-core==2.41.5
+ pydantic-core==2.41.4
+ pydub==0.25.1
+ python-dateutil==2.9.0.post0
+ python-fire==0.1.0
+ pytz==2026.1.post1
+ resampy==0.4.3
+ ruff==0.15.8
+ safehttpx==0.1.7
+ scikit-learn==1.7.2
+ scipy==1.15.3
+ semantic-version==2.10.0
- setuptools==82.0.1
+ setuptools==81.0.0
+ six==1.17.0
+ soundfile==0.13.1
+ sox==1.5.0
+ soxr==1.0.0
+ termcolor==3.3.0
+ threadpoolctl==3.6.0
+ tomlkit==0.13.3
- torch==2.10.0
+ torch==2.11.0
+ torchsde==0.2.6
+ trampoline==0.1.2
- transformers==4.57.6
+ transformers==5.4.0
+ tzdata==2025.3
+ vllm-omni==0.18.0
+ wcwidth==0.6.0
- websockets==16.0
+ websockets==15.0.1
+ x-transformers==2.17.9
(.venv) /m/s/w/Voxtral-4B-TTS-2603 ❯❯❯ python3 -c "import mistral_common; print(mistral_common.__version__)"
1.10.0
(.venv) /m/s/w/Voxtral-4B-TTS-2603 ❯❯❯ vllm serve mistralai/Voxtral-4B-TTS-2603 --omni
Traceback (most recent call last):
File "/mnt/storage/workspaces/Voxtral-4B-TTS-2603/.venv/bin/vllm", line 4, in <module>
from vllm_omni.entrypoints.cli.main import main
File "/mnt/storage/workspaces/Voxtral-4B-TTS-2603/.venv/lib/python3.10/site-packages/vllm_omni/__init__.py", line 16, in <module>
from . import patch # noqa: F401
File "/mnt/storage/workspaces/Voxtral-4B-TTS-2603/.venv/lib/python3.10/site-packages/vllm_omni/patch.py", line 5, in <module>
from vllm.model_executor.layers.rotary_embedding import (
File "/mnt/storage/workspaces/Voxtral-4B-TTS-2603/.venv/lib/python3.10/site-packages/vllm/model_executor/__init__.py", line 4, in <module>
from vllm.model_executor.parameter import BasevLLMParameter, PackedvLLMParameter
File "/mnt/storage/workspaces/Voxtral-4B-TTS-2603/.venv/lib/python3.10/site-packages/vllm/model_executor/parameter.py", line 11, in <module>
from vllm.distributed import (
File "/mnt/storage/workspaces/Voxtral-4B-TTS-2603/.venv/lib/python3.10/site-packages/vllm/distributed/__init__.py", line 4, in <module>
from .communication_op import *
File "/mnt/storage/workspaces/Voxtral-4B-TTS-2603/.venv/lib/python3.10/site-packages/vllm/distributed/communication_op.py", line 9, in <module>
from .parallel_state import get_tp_group
File "/mnt/storage/workspaces/Voxtral-4B-TTS-2603/.venv/lib/python3.10/site-packages/vllm/distributed/parallel_state.py", line 49, in <module>
from vllm.distributed.utils import StatelessProcessGroup
File "/mnt/storage/workspaces/Voxtral-4B-TTS-2603/.venv/lib/python3.10/site-packages/vllm/distributed/utils.py", line 33, in <module>
from vllm.utils.system_utils import suppress_stdout
File "/mnt/storage/workspaces/Voxtral-4B-TTS-2603/.venv/lib/python3.10/site-packages/vllm/utils/system_utils.py", line 19, in <module>
from vllm.platforms import current_platform
File "/mnt/storage/workspaces/Voxtral-4B-TTS-2603/.venv/lib/python3.10/site-packages/vllm/platforms/__init__.py", line 279, in __getattr__
_current_platform = resolve_obj_by_qualname(platform_cls_qualname)()
File "/mnt/storage/workspaces/Voxtral-4B-TTS-2603/.venv/lib/python3.10/site-packages/vllm/utils/import_utils.py", line 111, in resolve_obj_by_qualname
module = importlib.import_module(module_name)
File "/home/user/.local/share/uv/python/cpython-3.10.19-linux-x86_64-gnu/lib/python3.10/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "/mnt/storage/workspaces/Voxtral-4B-TTS-2603/.venv/lib/python3.10/site-packages/vllm/platforms/cuda.py", line 19, in <module>
import vllm._C # noqa
ImportError: /mnt/storage/workspaces/Voxtral-4B-TTS-2603/.venv/lib/python3.10/site-packages/vllm/_C.abi3.so: undefined symbol: _ZN3c1013MessageLoggerC1EPKciib
Alright, my whole process was a mess and I couldn't remember exactly which uv pip commands I had run, because at some point I also tried to reinstall torch from https://download.pytorch.org/whl/torch/, etc. I was a bit desperate, lol.
So now I have a simple workflow that works:
uv venv .venv --python=3.10
source .venv/bin/activate
uv pip install vllm
uv pip install vllm-omni
vllm serve mistralai/Voxtral-4B-TTS-2603 --omni
So basically I just removed the --upgrade flag from the uv pip install vllm-omni command; without it, uv leaves already-satisfied dependencies at their installed versions.
The --upgrade flag was modifying these packages:
- cuda-bindings==12.9.4
+ cuda-bindings==13.2.0
- huggingface-hub==0.36.2
+ huggingface-hub==1.8.0
- importlib-metadata==8.7.1
+ importlib-metadata==9.0.0
- llvmlite==0.44.0
+ llvmlite==0.46.0
- numba==0.61.2
+ numba==0.64.0
- numpy==2.2.6
+ numpy==2.4.4
- pillow==12.1.1
+ pillow==11.3.0
- protobuf==6.33.6
+ protobuf==7.34.1
- pydantic==2.12.5
+ pydantic==2.12.3
- pydantic-core==2.41.5
+ pydantic-core==2.41.4
- setuptools==80.10.2
+ setuptools==81.0.0
- torch==2.10.0
+ torch==2.11.0
- transformers==4.57.6
+ transformers==5.4.0
- websockets==16.0
+ websockets==15.0.1
Found it!
The issue comes from:
- torch==2.10.0
+ torch==2.11.0
Reinstalling torch with uv pip install torch==2.10.0 makes it work!
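To catch this kind of regression early, one could add a small post-install check that torch is still on the series vllm's compiled extension was built for. The package name and the "2.10." prefix here are assumptions based on this thread, not something vllm documents:

```python
# Hypothetical post-install sanity check: verify an installed package's version
# still starts with the expected prefix (e.g. torch on the 2.10 series).
from importlib import metadata

def version_matches(package: str, prefix: str) -> bool:
    """True if `package` is installed and its version starts with `prefix`."""
    try:
        return metadata.version(package).startswith(prefix)
    except metadata.PackageNotFoundError:
        return False

if __name__ == "__main__":
    ok = version_matches("torch", "2.10.")  # assumed pin from this thread
    print("torch pinned correctly:" if ok else "torch version drifted:", ok)
```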
Sorry for the mess; for some reason I mis-pasted what --upgrade was modifying! I've edited my previous answer.