# NLLB-200 600M - Kalenjin to Swahili (LoRA Adapter)

This is a LoRA adapter for facebook/nllb-200-distilled-600M, fine-tuned to translate Kalenjin (KLN) into Swahili (SWA).
## Evaluation Results (KLN -> SWA)

- BLEU Score: 57.24
- chrF Score: 70.53
## Technical Details & Training

- Hardware: Trained locally (42 epochs) on a 5060 8GB GPU for 13 hours.
- LoRA Config: `r=64`, `alpha=128`, targeting `["q_proj", "v_proj", "k_proj", "out_proj", "fc1", "fc2"]`.
- Token Strategy: Kalenjin uses "token hijacking": it is routed through the `luo_Latn` token space to avoid the catastrophic forgetting that comes with initializing a raw new token.
## Usage
```python
import torch
from peft import PeftModel
from transformers import AutoModelForSeq2SeqLM, NllbTokenizerFast

# Load base model
model_id = "facebook/nllb-200-distilled-600M"
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Load adapter
adapter_id = "mutaician/nllb-kalenjin-swahili-v1"
model = PeftModel.from_pretrained(model, adapter_id)

# Load tokenizer
tokenizer = NllbTokenizerFast.from_pretrained(adapter_id)
tokenizer.src_lang = "luo_Latn"  # Important: Kalenjin is routed via the Luo token

text = "Iyamunee"
inputs = tokenizer(text, return_tensors="pt")
target_lang_id = tokenizer.convert_tokens_to_ids("swa_Latn")

with torch.no_grad():
    generated_tokens = model.generate(
        **inputs,
        forced_bos_token_id=target_lang_id,
        num_beams=5,
        early_stopping=True,
        max_length=256,
    )

print(tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)[0])
```