# 💡Model Description

Official model repository for our **ACL 2026 Main Conference** paper "*Language on Demand, Knowledge at Core*: Composing LLMs with Encoder-Decoder Translation Models for Extensible Multilinguality".

## ✨XBridge-base

[`XBridge-base`](https://huggingface.co/ICTNLP/XBridge-base) composes [`LLaMA3-8B`](https://huggingface.co/meta-llama/Meta-Llama-3-8B) with [`NLLB-200-1.3B`](https://huggingface.co/facebook/nllb-200-1.3B) and is trained with stage 1 (cross-model alignment) on trilingual translation data. Stage 1 training covers 10 languages:

> Bn, De, En, Es, Fr, Ja, Ru, Sw, Th, Zh

Although stage 1 is trained on a limited set of languages, our analysis shows that **stage 1 learns a language-agnostic cross-model alignment** that generalizes well beyond the seen languages.
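NLLB-200 identifies languages with FLORES-200 codes rather than the two-letter abbreviations used on this card. The sketch below maps the ten stage 1 languages to such codes. The mapping is our assumption (for example, it reads `Zh` as Simplified Chinese and `Sw` as Swahili), and `STAGE1_TO_FLORES` and `to_flores` are hypothetical names that are not part of the XBridge codebase; check the GitHub repository for the identifiers the released code actually expects.

```python
# Assumed mapping from this card's abbreviations to FLORES-200 codes,
# the code style used by NLLB-200. Illustrative only, not taken from XBridge.
STAGE1_TO_FLORES = {
    "Bn": "ben_Beng",  # Bengali
    "De": "deu_Latn",  # German
    "En": "eng_Latn",  # English
    "Es": "spa_Latn",  # Spanish
    "Fr": "fra_Latn",  # French
    "Ja": "jpn_Jpan",  # Japanese
    "Ru": "rus_Cyrl",  # Russian
    "Sw": "swh_Latn",  # Swahili
    "Th": "tha_Thai",  # Thai
    "Zh": "zho_Hans",  # Chinese (Simplified script assumed)
}


def to_flores(code: str) -> str:
    """Return the FLORES-200 code for a card abbreviation (case-insensitive)."""
    key = code.strip().capitalize()
    try:
        return STAGE1_TO_FLORES[key]
    except KeyError:
        raise ValueError(f"{code!r} is not one of the 10 stage 1 languages") from None


print(to_flores("zh"))  # zho_Hans
```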

## ✨XBridge-SFT

[`XBridge-SFT`](https://huggingface.co/ICTNLP/XBridge-SFT) extends `XBridge-base` with stage 2 (encoder-side adaptation) and stage 3 (decoder-side adaptation), both trained for instruction following. Notably, we scale directly to 50 languages in these stages. This design is motivated by our finding that the stage 1 cross-model alignment generalizes beyond the languages seen in training. We train on the multilingual instruction-following dataset [`Bactrian-X`](https://huggingface.co/datasets/MBZUAI/Bactrian-X), adding the following 40 languages to the 10 used in stage 1:

> Af, Ar, Az, Cs, El, Et, Fa, Fi, Gl, Gu, He, Hi, Hr, Id, It, Ka, Kk, Km, Lt, Lv, Mk, Ml, Mn, Mr, My, Ne, Nl, Pl, Ps, Pt, Ro, Sl, Sv, Ta, Te, Tr, Uk, Ur, Vi, Xh

Empirically, we find that this direct scaling strategy achieves strong performance, demonstrating the robustness and generalization ability of the stage 1 alignment.
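As a quick sanity check on the language coverage described above, the following sketch combines the 10 stage 1 languages with the 40 languages added in stages 2 and 3 and confirms that the two sets are disjoint and total 50. The variable and function names are ours and purely illustrative; the codes are copied from the lists on this card.

```python
# Language codes copied from this model card.
STAGE1_LANGS = "Bn, De, En, Es, Fr, Ja, Ru, Sw, Th, Zh".split(", ")
ADDED_LANGS = (
    "Af, Ar, Az, Cs, El, Et, Fa, Fi, Gl, Gu, He, Hi, Hr, Id, It, Ka, Kk, Km, "
    "Lt, Lv, Mk, Ml, Mn, Mr, My, Ne, Nl, Pl, Ps, Pt, Ro, Sl, Sv, Ta, Te, Tr, "
    "Uk, Ur, Vi, Xh"
).split(", ")


def coverage(stage1: list[str], added: list[str]) -> list[str]:
    """Return the sorted union of both lists, rejecting any overlap."""
    overlap = set(stage1) & set(added)
    if overlap:
        raise ValueError(f"languages listed twice: {sorted(overlap)}")
    return sorted(set(stage1) | set(added))


ALL_LANGS = coverage(STAGE1_LANGS, ADDED_LANGS)
print(len(STAGE1_LANGS), len(ADDED_LANGS), len(ALL_LANGS))  # 10 40 50
```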

See our [paper](https://arxiv.org/abs/2603.17512) for more details, and try our Gradio demo in the [GitHub repository](https://github.com/ictnlp/XBridge)!

# 📚Citation

If you find this model or our work useful, please cite:

```tex
@misc{bu2026languagedemandknowledgecore,
      title={Language on Demand, Knowledge at Core: Composing LLMs with Encoder-Decoder Translation Models for Extensible Multilinguality},
      author={Mengyu Bu and Yang Feng},
      year={2026},
      eprint={2603.17512},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2603.17512},
}
```

# 📮Contact

For questions, please contact: `bumengyu23z@ict.ac.cn`