💡Model Description
Official model repository for our ACL 2026 Main Conference paper "Language on Demand, Knowledge at Core: Composing LLMs with Encoder-Decoder Translation Models for Extensible Multilinguality".
✨XBridge-base
XBridge-base is trained with stage 1 (cross-model alignment) using trilingual translation data, composing LLaMA3-8B with NLLB-200-1.3B. Training is conducted on 10 languages:
Bn, De, En, Es, Fr, Ja, Ru, Sw, Th, Zh
Despite being trained on this limited set of languages, our analysis shows that stage 1 learns a language-agnostic cross-model alignment that generalizes well beyond the seen languages.
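To make the composition concrete, here is a minimal conceptual sketch of how a multilingual encoder can be bridged into an LLM backbone through a learned projection. The bridge module, its placement, and all training details below are illustrative assumptions for exposition, not the paper's released implementation:

```python
# Conceptual sketch of the stage 1 composition (illustrative only: the bridge
# module, its placement, and training details are assumptions for exposition,
# not the paper's released implementation).
import torch.nn as nn
from transformers import AutoModel, AutoModelForCausalLM, AutoTokenizer

class CrossModelBridge(nn.Module):
    """Hypothetical projection from NLLB encoder states to LLM embeddings."""
    def __init__(self, enc_dim: int, llm_dim: int):
        super().__init__()
        self.proj = nn.Linear(enc_dim, llm_dim)

    def forward(self, encoder_states):
        return self.proj(encoder_states)

# Multilingual encoder (NLLB-200-1.3B) and LLM backbone (LLaMA3-8B).
nllb_encoder = AutoModel.from_pretrained("facebook/nllb-200-1.3B").encoder
llm = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")
bridge = CrossModelBridge(enc_dim=nllb_encoder.config.d_model,   # 1024
                          llm_dim=llm.config.hidden_size)        # 4096

# Encode a non-English input with NLLB, then hand the projected states to
# the LLM as input embeddings.
tok = AutoTokenizer.from_pretrained("facebook/nllb-200-1.3B", src_lang="deu_Latn")
batch = tok("Ein Beispielsatz.", return_tensors="pt")
enc_states = nllb_encoder(**batch).last_hidden_state   # (1, seq_len, 1024)
soft_prompt = bridge(enc_states)                        # (1, seq_len, 4096)
out = llm(inputs_embeds=soft_prompt)
```

Because a bridge of this kind maps encoder states into the LLM's embedding space rather than modeling any particular language, an alignment learned this way is plausibly language-agnostic, which is the intuition behind the generalization result above.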
✨XBridge-SFT
XBridge-SFT further extends XBridge-base by training stage 2 (encoder-side adaptation) and stage 3 (decoder-side adaptation) for instruction-following tasks. Notably, we scale directly to 50 languages in these stages, a design motivated by our finding that the stage 1 alignment generalizes across languages. We train on the multilingual instruction-following dataset Bactrian-X and expand to the following 40 additional languages:
Af, Ar, Az, Cs, El, Et, Fa, Fi, Gl, Gu, He, Hi, Hr, Id, It, Ka, Kk, Km, Lt, Lv, Mk, Ml, Mn, Mr, My, Ne, Nl, Pl, Ps, Pt, Ro, Sl, Sv, Ta, Te, Tr, Uk, Ur, Vi, Xh
Empirically, we find that this direct scaling strategy achieves strong performance, demonstrating the robustness and generalization ability of the stage 1 alignment.
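For readers who want to reproduce the data side of stages 2 and 3, the sketch below loads one language split of Bactrian-X and flattens it into prompt-response pairs. The dataset id and field names follow the public `MBZUAI/Bactrian-X` release on the Hugging Face Hub; the prompt formatting itself is an assumption, not the template used in the paper:

```python
# Illustrative loading of Bactrian-X for stages 2 and 3. The dataset id and
# the instruction/input/output fields follow the public MBZUAI release; the
# prompt template below is a hypothetical placeholder.
from datasets import load_dataset

ds = load_dataset("MBZUAI/Bactrian-X", "de", split="train")  # one config per language

def to_pair(example):
    """Format one record as an instruction-response training pair."""
    prompt = example["instruction"]
    if example.get("input"):
        prompt = f"{prompt}\n{example['input']}"
    return {"prompt": prompt, "response": example["output"]}

train = ds.map(to_pair)
print(train[0]["prompt"], "->", train[0]["response"][:80])
```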
See our paper for more details, and try our Gradio demo in the GitHub repository!
📚Citation
If you find this model or our work useful, please cite:
@misc{bu2026languagedemandknowledgecore,
  title={Language on Demand, Knowledge at Core: Composing LLMs with Encoder-Decoder Translation Models for Extensible Multilinguality},
  author={Mengyu Bu and Yang Feng},
  year={2026},
  eprint={2603.17512},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2603.17512},
}
📮Contact
For questions, please contact: bumengyu23z@ict.ac.cn