# Khasi-English Translation Model (Stage B - Checkpoint 12000)

A fine-tune of facebook/nllb-200-distilled-600M (615M parameters) for English-Khasi machine translation.

## Training Details

| Detail | Value |
|---|---|
| Base model | facebook/nllb-200-distilled-600M |
| Task | Machine translation (English <-> Khasi) |
| Stage | Stage B - full silver training |
| Checkpoint | checkpoint-12000 (epoch 2.77 / 3.0) |
| Training data | 138k parallel sentence pairs |
| GPUs | 2x NVIDIA A100-SXM4-40GB |
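The figures above let us infer the effective batch size. This is a back-of-the-envelope estimate only; the actual training configuration (per-device batch size, gradient accumulation) is not published here:

```python
# Inferring the effective batch size from the reported figures.
# Assumption: checkpoint-12000 corresponds to epoch 2.77 over 138k pairs,
# with one optimizer step per effective batch. Treat the result as an estimate.
total_pairs = 138_000
optimizer_steps = 12_000
epochs_completed = 2.77

steps_per_epoch = optimizer_steps / epochs_completed   # ~4332 steps per pass
effective_batch = total_pairs / steps_per_epoch        # ~31.9 examples per step

print(round(effective_batch))  # prints 32
```

An effective batch of 32 would be consistent with, for example, 16 examples per GPU across the two A100s, though that split is an assumption.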

## Metrics (at checkpoint-12000)

| Metric | Score |
|---|---|
| BLEU | 53.75 |
| chrF | 67.30 |
| Eval loss | 0.6585 |
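chrF scores translations by character n-gram overlap rather than whole-word matches, so near-miss word forms still earn partial credit. The sketch below illustrates the idea with the standard defaults (n-grams up to 6, beta = 2); the reported score was presumably produced by a standard scorer such as sacreBLEU, whose whitespace and averaging details this simplified version does not reproduce exactly:

```python
from collections import Counter

def chrf(hypothesis: str, reference: str, max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified chrF: F-beta over character n-gram precision/recall, averaged over n."""
    hyp = hypothesis.replace(" ", "")
    ref = reference.replace(" ", "")
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp_ngrams = Counter(hyp[i:i + n] for i in range(len(hyp) - n + 1))
        ref_ngrams = Counter(ref[i:i + n] for i in range(len(ref) - n + 1))
        overlap = sum((hyp_ngrams & ref_ngrams).values())  # clipped n-gram matches
        if hyp_ngrams:
            precisions.append(overlap / sum(hyp_ngrams.values()))
        if ref_ngrams:
            recalls.append(overlap / sum(ref_ngrams.values()))
    p = sum(precisions) / len(precisions) if precisions else 0.0
    r = sum(recalls) / len(recalls) if recalls else 0.0
    if p + r == 0:
        return 0.0
    return 100 * (1 + beta ** 2) * p * r / (beta ** 2 * p + r)

print(chrf("khublei", "khublei"))  # identical strings score 100.0
```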

## Usage

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model = AutoModelForSeq2SeqLM.from_pretrained("techno-tuners/khasi-nllb-stage-b-checkpoint-12000")
tokenizer = AutoTokenizer.from_pretrained("techno-tuners/khasi-nllb-stage-b-checkpoint-12000")

# Source language uses the NLLB/FLORES-200 code for English
tokenizer.src_lang = "eng_Latn"

inputs = tokenizer("Hello, how are you?", return_tensors="pt")
# Force the decoder to start in Khasi; swap "eng_Latn"/"kha_Latn" for Khasi -> English
outputs = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("kha_Latn"),
    max_new_tokens=256,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

**Note:** This is an intermediate training checkpoint, not the final production model.
