💡Model Description

Official model repository for our ACL 2026 Main Conference paper "Language on Demand, Knowledge at Core: Composing LLMs with Encoder-Decoder Translation Models for Extensible Multilinguality".

✨XBridge-base

XBridge-base composes LLaMA3-8B with NLLB-200-1.3B and is trained with stage 1 (cross-model alignment) on trilingual translation data. Stage 1 training covers 10 languages:

Bn, De, En, Es, Fr, Ja, Ru, Sw, Th, Zh

Although stage 1 is trained on a limited set of languages, our analysis shows that it learns a language-agnostic cross-model alignment that generalizes well beyond the seen languages.

✨XBridge-SFT

XBridge-SFT extends XBridge-base with stage 2 (encoder-side adaptation) and stage 3 (decoder-side adaptation) for instruction-following tasks. Motivated by our finding that the stage 1 cross-model alignment generalizes beyond its training languages, we scale directly to 50 languages in these stages. We train on the multilingual instruction-following dataset Bactrian-X, expanding coverage with the following 40 additional languages:

Af, Ar, Az, Cs, El, Et, Fa, Fi, Gl, Gu, He, Hi, Hr, Id, It, Ka, Kk, Km, Lt, Lv, Mk, Ml, Mn, Mr, My, Ne, Nl, Pl, Ps, Pt, Ro, Sl, Sv, Ta, Te, Tr, Uk, Ur, Vi, Xh

Empirically, this direct scaling strategy achieves strong performance, which demonstrates the robustness and generalization ability of the stage 1 alignment.
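As a quick sanity check on the language coverage described above, the following minimal sketch copies the two code lists from this card and confirms that they are disjoint and add up to 50. The names `STAGE1_LANGS`, `ADDED_LANGS`, `supported_languages`, and `seen_in_stage1` are hypothetical helpers for illustration only; they are not part of the XBridge codebase.

```python
# Language codes copied verbatim from the two lists in this model card.
# All names below are illustrative assumptions, not XBridge APIs.
STAGE1_LANGS = "Bn, De, En, Es, Fr, Ja, Ru, Sw, Th, Zh".split(", ")
ADDED_LANGS = (
    "Af, Ar, Az, Cs, El, Et, Fa, Fi, Gl, Gu, He, Hi, Hr, Id, It, Ka, Kk, Km, "
    "Lt, Lv, Mk, Ml, Mn, Mr, My, Ne, Nl, Pl, Ps, Pt, Ro, Sl, Sv, Ta, Te, Tr, "
    "Uk, Ur, Vi, Xh"
).split(", ")


def supported_languages():
    """Return the sorted union of stage 1 and stage 2/3 language codes."""
    return sorted(set(STAGE1_LANGS) | set(ADDED_LANGS))


def seen_in_stage1(code):
    """Return True if the code was part of the stage 1 alignment data."""
    return code.capitalize() in STAGE1_LANGS


if __name__ == "__main__":
    # The two lists should not overlap.
    assert not set(STAGE1_LANGS) & set(ADDED_LANGS)
    print(len(STAGE1_LANGS), len(ADDED_LANGS), len(supported_languages()))
    # -> 10 40 50
```

A helper like `seen_in_stage1` could be used to separate evaluation results for languages covered by the stage 1 alignment data from those introduced only in stages 2 and 3.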

See our paper for more details, and try our Gradio demo in the GitHub repository!

📚Citation

If you find this model or our work useful, please cite:

@misc{bu2026languagedemandknowledgecore,
      title={Language on Demand, Knowledge at Core: Composing LLMs with Encoder-Decoder Translation Models for Extensible Multilinguality}, 
      author={Mengyu Bu and Yang Feng},
      year={2026},
      eprint={2603.17512},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2603.17512}, 
}

📮Contact

For questions, please contact: bumengyu23z@ict.ac.cn
