Language Mixing Problems
It gives much stronger performance than OpenAI GPT 5.2, but it has the same language-mixing issue as DeepSeek. If you could release base and instruct models with better multilingual capabilities, it would really help the community. I wanted to have a conversation in English, but the model keeps mixing Chinese characters into its responses.
Thanks for the feedback — we’re glad to hear that M3 is performing well for your use case.
Regarding the language mixing issue, could you please share a concrete test case (prompt + system message, if any)? In particular, it would be helpful to know:
- Whether a Chinese system prompt or role instruction was used.
- Whether the user message contains any Chinese content or implicit hints for Chinese output.
One important note: the demo space Baichuan-M3-Inquiry currently uses a Chinese system prompt by default, which can strongly bias the model toward Chinese or mixed-language responses. If you need purely English output, we recommend explicitly setting an English system prompt (e.g., “You are a helpful assistant that always responds in English.”).
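The recommendation above can be sketched as a small helper for anyone calling the model through an OpenAI-style chat interface (a minimal sketch, not official Baichuan code; the function name is ours and the messages-list shape assumes an OpenAI-compatible chat format):

```python
def build_messages(user_prompt: str) -> list[dict]:
    """Prepend an explicit English system prompt so the model is not
    biased toward Chinese by a default Chinese system prompt."""
    return [
        {"role": "system",
         "content": "You are a helpful assistant that always responds in English."},
        {"role": "user", "content": user_prompt},
    ]

# Example: these messages would then be passed to your chat endpoint.
messages = build_messages("Summarize the symptoms of iron-deficiency anemia.")
```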
If you can provide a minimal reproducible example, we’re happy to further investigate and improve the multilingual behavior in future releases.
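For anyone assembling such a reproducible example, a quick script can flag responses that mix in Chinese characters (a minimal sketch; the helper name is ours, and the regex covers the basic CJK Unified Ideographs block only):

```python
import re

# Matches characters in the CJK Unified Ideographs block (U+4E00..U+9FFF).
CJK_RE = re.compile(r"[\u4e00-\u9fff]")

def contains_chinese(text: str) -> bool:
    """Return True if the model response contains any Chinese characters."""
    return bool(CJK_RE.search(text))
```

Running this over a batch of model outputs gives a simple count of how often English prompts produce mixed-language responses.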
It works now, thanks a lot! By the way, what is the knowledge cutoff of this model?
Thanks for the question!
Baichuan-M3 is built on top of Qwen3, so it inherits Qwen3’s pretraining knowledge cutoff of October 2024. This cutoff is determined by the time range of the large-scale pretraining corpus, which we did not change.
Thanks! One more request: can you release the datasets this model was trained on? It would really help the community — I’m looking for high-quality medical data.
Thanks for your interest! We really appreciate the enthusiasm from the community.
At the moment, we’re unfortunately not able to release the full training datasets used for Baichuan-M3. The post-training data involves a mixture of licensed sources, curated private corpora, and human-created annotations, which makes it legally and ethically difficult to open-source directly.
That said, we fully agree that high-quality medical datasets are crucial for advancing the community. We are actively exploring:
- Releasing evaluation benchmarks and task-specific test sets, such as SCAN-bench and Hallucination Eval.
We hope these will still be useful for researchers and practitioners building medical AI systems. We’ll keep the community updated as soon as any of these become available.
