Auto language detection?
Dear Cohere team,
I have tested this model and am really impressed with it. The WER is lower, and it can transcribe different languages and accents better than Whisper.
One thing I am wondering about: does this model support automatic language detection? I believe requiring a fixed language ID when integrating this model would significantly reduce its flexibility.
Hi, we didn't specifically train for automatic language detection, but we have noted your request for a future release.
While it doesn't work out of the box, you could hack it in. One method is to pass a partial input prompt and let the model generate the language token, then use that predicted token to build the final prompt and run the model again. The drawback is higher latency due to the extra model call.
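For anyone wanting to try this, the two-pass workaround could be sketched roughly as below. This is only an illustration, not the model's actual API: `predict_next_token` and `transcribe` are hypothetical callables you would wire up to your own inference code, and the language token strings and prompt layout are placeholders, so check the model's real prompt format before using it.

```python
from typing import Any, Callable, Sequence, Tuple

# Placeholder token names; the real model's language tokens may look different.
LANGUAGE_TOKENS: Tuple[str, ...] = ("<|en|>", "<|fr|>", "<|de|>")


def detect_then_transcribe(
    audio: Any,
    predict_next_token: Callable[[Any, str], str],  # hypothetical: one decoding step
    transcribe: Callable[[Any, str], str],          # hypothetical: full decoding
    partial_prompt: str,
    language_tokens: Sequence[str] = LANGUAGE_TOKENS,
    fallback: str = "<|en|>",
) -> Tuple[str, str]:
    """Return (language_token, transcript) using two model calls."""
    # Pass 1: give the model the prompt up to the language slot and
    # let it generate the next token itself.
    predicted = predict_next_token(audio, partial_prompt)

    # The model was not trained for this, so the token may not be a
    # language token at all. Fall back to a default in that case.
    language = predicted if predicted in language_tokens else fallback

    # Pass 2: build the final prompt with the chosen language and
    # run the normal transcription.
    final_prompt = partial_prompt + language
    return language, transcribe(audio, final_prompt)
```

Since the first pass only needs a single generated token, it should be much cheaper than the full transcription, but it still re-encodes the audio unless you cache the encoder output between the two calls.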