streaming

#5
by Okietrained - opened

After hosting with vLLM, how do I use it with an OpenAI-API-compatible streaming endpoint?

Good question, I am also looking into it.

@Okietrained you can add `--hf-overrides '{"architectures":["Qwen3ASRRealtimeGeneration"]}'` to `vllm serve`. This adds a 'realtime' endpoint, and you can then use it the same way as mistralai/Voxtral-Mini-4B-Realtime-2602.
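A minimal sketch of the full serve command, assuming the model ID placeholder below is replaced with the actual checkpoint you are hosting (the `--hf-overrides` flag is taken from the answer above; everything else is a standard `vllm serve` invocation):

```shell
# Serve the model with the architecture override that enables the
# 'realtime' endpoint. <your-model-id> is a placeholder -- substitute
# the Hugging Face model ID or local path you are actually serving.
vllm serve <your-model-id> \
  --hf-overrides '{"architectures":["Qwen3ASRRealtimeGeneration"]}'
```

Once the server is up, clients can connect to the realtime endpoint the same way they would for mistralai/Voxtral-Mini-4B-Realtime-2602.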
