No Language Left Behind: Scaling Human-Centered Machine Translation
Fine-tuned version of facebook/nllb-200-distilled-600M for English → Kobani Kurdish neural machine translation (eng_Latn → kob_Latn).
This model adapts the multilingual NLLB-200 architecture (NLLB Team et al., 2022) to the Kobani dialect of Kurmanji Kurdish, a severely under-resourced language variety with extremely limited parallel data and almost no representation in existing MT systems.
- Source language: English (eng_Latn)
- Target language: Kobani Kurdish (kob_Latn), added as a custom token (token ID: 267756)

Training was performed using the Hugging Face Trainer API.
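Since kob_Latn is not part of the stock NLLB-200 vocabulary, the language code has to be registered with the tokenizer before fine-tuning. A minimal sketch of that step (the exact token ID depends on the tokenizer state, so it is not hard-coded here):

```python
from transformers import AutoTokenizer

# Load the base NLLB tokenizer and register the custom target-language code.
tokenizer = AutoTokenizer.from_pretrained("facebook/nllb-200-distilled-600M")
num_added = tokenizer.add_tokens(["kob_Latn"], special_tokens=True)

# The new code now resolves to its own token ID.
kob_id = tokenizer.convert_tokens_to_ids("kob_Latn")
print(num_added, kob_id)
```

After adding the token, the model's embedding matrix must be resized to match, e.g. `model.resize_token_embeddings(len(tokenizer))`, before training starts.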
The model was fine-tuned on parallel sentence pairs derived from aligned corpora with dialect-specific lexical and morphosyntactic perturbations.
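A minimal inference sketch, assuming the checkpoint loads with the standard `transformers` seq2seq API; as with other NLLB checkpoints, `forced_bos_token_id` selects the target language:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "RamanAhmad/nllb-distilled-600m-eng-to-kob_Latn"
tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("Good morning, how are you?", return_tensors="pt")
generated = model.generate(
    **inputs,
    # Force generation to start in the custom target language.
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("kob_Latn"),
    max_new_tokens=64,
)
translation = tokenizer.batch_decode(generated, skip_special_tokens=True)[0]
print(translation)
```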
Intended use: research and experimentation on English → Kobani Kurdish (eng_Latn → kob_Latn) machine translation, and as a starting point for further fine-tuning on additional Kurmanji dialect data.
Known limitations: the training data is extremely limited and partly synthetic (dialect-specific perturbations of aligned corpora), so translations may be inaccurate or inconsistent, and the model is not suitable for high-stakes use.
If you use this model in your research, please cite:
@misc{ahmad2026kobani-nllb-600m,
author = {Raman Ahmad},
title = {nllb-distilled-600m-eng-to-kob_Latn},
year = {2026},
publisher = {Hugging Face},
journal = {Hugging Face Model Hub},
howpublished = {\url{https://huggingface.co/RamanAhmad/nllb-distilled-600m-eng-to-kob_Latn}}
}
Please also cite the original NLLB paper:
@article{cost2022nllb,
title = {No Language Left Behind: Scaling Human-Centered Machine Translation},
author = {NLLB Team and others},
journal = {arXiv preprint arXiv:2207.04672},
year = {2022},
doi = {10.48550/arXiv.2207.04672}
}