Segmentation Fault in cohere-transcribe Model with vLLM Audio Transcription

#20

by kunalchamoli - opened 21 days ago

I'm encountering a segmentation fault when running the CohereLabs/cohere-transcribe-03-2026 model with vLLM's audio transcription endpoint.

Environment:

vLLM Version: Latest nightly build
Model: CohereLabs/cohere-transcribe-03-2026
Python Version: 3.11
CUDA Version: 12.4.1
cuDNN: 12.4.1
GPU: NVIDIA A100-SXM4-40GB

Setup:

Running in Docker container (nvidia/cuda:12.4.1-cudnn-devel-ubuntu22.04)
vLLM installed with [audio] extra: pip install "vllm[audio]"
Audio processing dependencies: librosa, transformers>=5.4.0

Configuration:

VLLM_MAX_MODEL_LEN=1024
VLLM_GPU_MEMORY_UTILIZATION=0.80
VLLM_DTYPE=auto
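
For reference, here is a minimal sketch of how these settings might translate into a launch command. The post does not show the actual command, so this assumes the three variables map one-to-one onto the `vllm serve` flags `--max-model-len`, `--gpu-memory-utilization`, and `--dtype`. It also sets `PYTHONFAULTHANDLER=1`, which asks the interpreter to dump a Python-level traceback on a fatal signal such as SIGSEGV. `build_launch` is a hypothetical helper name.

```python
import os
import shlex

MODEL = "CohereLabs/cohere-transcribe-03-2026"


def build_launch(settings):
    """Return (argv, env) for starting the server.

    Assumption: each VLLM_* setting corresponds to the `vllm serve`
    flag of the same name; the original command was not posted.
    """
    argv = [
        "vllm", "serve", MODEL,
        "--max-model-len", str(settings["VLLM_MAX_MODEL_LEN"]),
        "--gpu-memory-utilization", str(settings["VLLM_GPU_MEMORY_UTILIZATION"]),
        "--dtype", str(settings["VLLM_DTYPE"]),
    ]
    # Copy the current environment and enable faulthandler; child
    # processes started by the server inherit this variable.
    env = dict(os.environ)
    env["PYTHONFAULTHANDLER"] = "1"
    return argv, env


settings = {
    "VLLM_MAX_MODEL_LEN": 1024,
    "VLLM_GPU_MEMORY_UTILIZATION": 0.80,
    "VLLM_DTYPE": "auto",
}
argv, env = build_launch(settings)
print(shlex.join(argv))
# To actually start the server:
#   import subprocess; subprocess.run(argv, env=env, check=True)
```

Printing the command rather than running it makes it easy to paste the exact invocation into the thread.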

Error Description:

The EngineCore process crashes with a segmentation fault during /v1/audio/transcriptions API calls. The crash occurs deep in the Python evaluation stack with no usable traceback information.

Error Messages:

WARNING: Defaulting to language='en'. If you wish to transcribe audio in a different language, pass the `language` field in the TranscriptionRequest.

!!!!!!! Segfault encountered !!!!!!!
  File "<unknown>", line 0, in _PyEval_EvalFrameDefault
  [... stack frames omitted ...]
  File "<unknown>", line 0, in _start

ERROR: Engine core proc EngineCore died unexpectedly, shutting down client.
ERROR: AsyncLLM output_handler failed.
vllm.v1.engine.exceptions.EngineDeadError: EngineCore encountered an issue.
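
To check whether the crash depends on a particular audio file, a self-contained request like the following could be used. This is only a sketch: it generates a one-second synthetic tone with the standard library instead of using a real recording, and it assumes the server listens on `http://localhost:8000` (the address is not given in the post). The helper names `make_test_wav` and `build_request` are hypothetical. The form fields `file`, `model`, and `language` follow the OpenAI-style transcription request.

```python
import io
import math
import struct
import urllib.request
import uuid
import wave

URL = "http://localhost:8000/v1/audio/transcriptions"  # assumed host and port
MODEL = "CohereLabs/cohere-transcribe-03-2026"


def make_test_wav(seconds=1.0, rate=16000, freq=440.0):
    """Return the bytes of a mono 16-bit PCM WAV file containing a sine tone."""
    n = int(seconds * rate)
    frames = b"".join(
        struct.pack("<h", int(0.3 * 32767 * math.sin(2 * math.pi * freq * i / rate)))
        for i in range(n)
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(frames)
    return buf.getvalue()


def build_request(url, audio, model, language="en"):
    """Build a multipart/form-data POST carrying the audio and form fields."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in (("model", model), ("language", language)):
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode()
        )
    parts.append(
        (f"--{boundary}\r\n"
         'Content-Disposition: form-data; name="file"; filename="test.wav"\r\n'
         "Content-Type: audio/wav\r\n\r\n").encode() + audio + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return urllib.request.Request(url, data=b"".join(parts), headers=headers, method="POST")


audio = make_test_wav()
req = build_request(URL, audio, MODEL)
# Send it once the server is up:
#   with urllib.request.urlopen(req) as resp:
#       print(resp.read().decode())
```

If the synthetic tone also triggers the segfault, the problem is unlikely to be specific to one input file.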

ekagra-ranjan

Cohere Labs org 21 days ago

Hello,

Can you share the exact command you used to start the server, the audio file, and the command you used to send the request, so we can reproduce this?
Does it happen for all audio files, or only for the specific one you tried?

Fuzoristic

20 days ago

nice one
