streaming
#5
by Okietrained - opened
After hosting with vLLM, how do I use it with an OpenAI API-compatible streaming endpoint?
Good question, I am also looking into it.
@Okietrained you can add `--hf-overrides '{"architectures":["Qwen3ASRRealtimeGeneration"]}'` to `vllm serve`. This will add a 'realtime' endpoint, and you can then use it the same way as mistralai/Voxtral-Mini-4B-Realtime-2602.
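A minimal sketch of the serving side, based on the flag above. The model name, port, and request payload here are placeholders, not confirmed by this thread, and whether the realtime endpoint accepts a plain HTTP streaming request (rather than a WebSocket session as in Voxtral's realtime usage) depends on the vLLM version:

```shell
# Launch vLLM with the architecture override from this thread.
# <model-id> is a placeholder for the actual model repo.
vllm serve <model-id> \
  --hf-overrides '{"architectures":["Qwen3ASRRealtimeGeneration"]}' \
  --port 8000

# For the standard OpenAI-compatible streaming path (non-realtime),
# a chunked SSE response can be requested like this; -N disables
# curl's output buffering so chunks print as they arrive.
curl -N http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "<model-id>",
        "messages": [{"role": "user", "content": "Hello"}],
        "stream": true
      }'
```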