NLLB-Distilled-600M: Kobani Kurdish → English
Fine-tuned version of facebook/nllb-200-distilled-600M for Kobani Kurdish → English neural machine translation (kob_Latn → eng_Latn).
This model adapts the multilingual NLLB-200 architecture (NLLB Team et al., 2022) to support translation from the Kobani dialect of Kurmanji Kurdish into English. Kobani Kurdish is a severely under-resourced language variety with extremely limited parallel data and almost no representation in existing MT systems.
Model Details
- Base model: facebook/nllb-200-distilled-600M
- Architecture: Encoder-decoder Transformer (12 encoder layers, 12 decoder layers, ~600M parameters)
- New language token: kob_Latn (custom token ID: 267756)
- Forced BOS token during generation: 267756
- Training objective: Supervised sequence-to-sequence fine-tuning on parallel sentence pairs
- Direction: Kobani Kurdish (kob_Latn) → English (eng_Latn)
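A minimal inference sketch using the Hugging Face transformers API. The Hub ID is taken from the citation below; the forced BOS token ID is the one stated in this card, and the snippet assumes the custom kob_Latn code was registered with the tokenizer during fine-tuning:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

MODEL_ID = "RamanAhmad/nllb-distilled-600m-kob_Latn-to-eng"
FORCED_BOS_ID = 267756  # forced BOS token ID as stated in this card

def translate(text: str, max_length: int = 128) -> str:
    """Translate one Kobani Kurdish sentence into English."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    # Tag the input with the custom source-language code
    tokenizer.src_lang = "kob_Latn"
    inputs = tokenizer(text, return_tensors="pt")
    # Force the target-language token at the start of decoding
    generated = model.generate(
        **inputs,
        forced_bos_token_id=FORCED_BOS_ID,
        max_length=max_length,
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]
```

For batch translation, pass a list of sentences to the tokenizer with padding=True and decode all outputs with batch_decode.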
Training Hyperparameters
Training was performed using the Hugging Face Trainer API with the following configuration:
- Number of epochs: 10
- Per-device train batch size: 4
- Gradient accumulation steps: 8
- Effective batch size: 32 (4 × 8)
- Learning rate: 3.0 × 10⁻⁵
- Optimizer: AdamW (β₁=0.9, β₂=0.999, ε=1e-8)
- Warmup steps: 500
- Weight decay: 0.01
- Evaluation steps: 2000
- Save steps: 2000
- Maximum saved checkpoints: 3
- Early stopping: Patience = 3 (based on validation loss)
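The configuration above maps onto the Trainer API roughly as follows. This is a sketch, not the exact training script: the output path is a placeholder, and argument names follow the current Seq2SeqTrainingArguments API.

```python
from transformers import Seq2SeqTrainingArguments, EarlyStoppingCallback

training_args = Seq2SeqTrainingArguments(
    output_dir="./nllb-kob-eng",        # placeholder path
    num_train_epochs=10,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,      # effective batch size: 4 * 8 = 32
    learning_rate=3e-5,
    warmup_steps=500,
    weight_decay=0.01,
    eval_strategy="steps",
    eval_steps=2000,
    save_steps=2000,
    save_total_limit=3,
    load_best_model_at_end=True,        # required for early stopping
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

# Early stopping on validation loss with a patience of 3 evaluations,
# passed to the Trainer via its callbacks argument:
early_stopping = EarlyStoppingCallback(early_stopping_patience=3)
```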
Training Data
The model was fine-tuned on parallel sentence pairs derived from aligned corpora with dialect-specific lexical and morphosyntactic perturbations.
Intended Use & Limitations
Intended use:
- Research in low-resource and dialectal machine translation
- Assistive translation support for Kobani Kurdish speakers
- Bootstrapping additional NLP resources for Kurmanji varieties
Known limitations:
- Performance heavily depends on domain similarity to training data
- Potential exposure bias and hallucination in out-of-domain text
- Zero-shot capability for Kobani remains limited without fine-tuning
- Early evaluations indicated possible data leakage; results should be interpreted cautiously until confirmed on fully isolated test sets
Citation
If you use this model in your research, please cite:
@misc{ahmad2026kobani-nllb-600m-reverse,
author = {Raman Ahmad},
title = {nllb-distilled-600m-kob_Latn-to-eng},
year = {2026},
publisher = {Hugging Face},
journal = {Hugging Face Model Hub},
howpublished = {\url{https://huggingface.co/RamanAhmad/nllb-distilled-600m-kob_Latn-to-eng}}
}
Please also cite the original NLLB paper:
@article{nllbteam2022nllb,
title = {No Language Left Behind: Scaling Human-Centered Machine Translation},
author = {NLLB Team and others},
journal = {arXiv preprint arXiv:2207.04672},
year = {2022},
doi = {10.48550/arXiv.2207.04672}
}