NLLB-200-distilled-600M (mantisorg fork)

Fork of facebook/nllb-200-distilled-600M for evaluation in the Mantis news translation pipeline.

Eval results (50 financial/political news headlines)

Language      BLEU   chrF
Farsi (fa)    36.4   70.0
Hebrew (he)   42.6   74.2
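
For readers unfamiliar with chrF, the following is a simplified sketch of a character n-gram F-score. It is not the script used to produce the numbers above, and it will not match a reference implementation such as sacrebleu exactly. The function names `char_ngrams` and `simple_chrf` are hypothetical, and the defaults (n-grams up to length 6, beta of 2, whitespace ignored) are assumptions based on common chrF settings.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count character n-grams of length n, ignoring spaces (an assumption)."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis, reference, max_n=6, beta=2.0):
    """Simplified chrF: average the per-order F-beta scores, scaled to 0..100."""
    scores = []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            # One side is too short to contain n-grams of this order; skip it.
            continue
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precision = overlap / sum(hyp.values())
        recall = overlap / sum(ref.values())
        if precision + recall == 0:
            scores.append(0.0)
            continue
        b2 = beta ** 2
        scores.append((1 + b2) * precision * recall / (b2 * precision + recall))
    return 100.0 * sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    reference = "The Central Bank of Iran has raised interest rates."
    hypothesis = "Iran's central bank raised the interest rate."
    print(round(simple_chrf(hypothesis, reference), 1))
```

With beta set to 2, recall is weighted more heavily than precision, so a hypothesis that omits reference content is penalised more than one that adds extra words.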

CTranslate2 compatibility

WARNING: Converting this model to CTranslate2 produces degenerate output (repeated tokens). Use it only through Hugging Face transformers.

The ct2-transformers-converter tool does not handle NLLB's language tag mechanism correctly. Both int8 and float32 conversions produce unusable output, so quantization is not the cause.
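
One way to flag the repetitive output described above is to measure how many distinct word n-grams a translation contains. This is a minimal sketch, not part of this repository: the function name `looks_degenerate`, the whitespace tokenization, and the 0.5 threshold are all assumptions chosen for illustration.

```python
def looks_degenerate(text, n=3, min_distinct_ratio=0.5):
    """Return True when fewer than min_distinct_ratio of the word n-grams are unique.

    Whitespace tokenization and the default threshold are illustrative
    assumptions; tune them on real outputs before relying on this check.
    """
    tokens = text.split()
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        # Text shorter than n tokens cannot be judged; treat it as fine.
        return False
    return len(set(ngrams)) / len(ngrams) < min_distinct_ratio


if __name__ == "__main__":
    print(looks_degenerate("rates rates rates rates rates rates"))
    print(looks_degenerate("The Central Bank of Iran has raised interest rates."))
```

A check like this could be run over the output of any converted model before trusting its BLEU or chrF scores, since heavily repeated output can still share n-grams with the reference.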

Usage (transformers)

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# src_lang sets the source language tag (pes_Arab = Western Persian / Farsi)
tokenizer = AutoTokenizer.from_pretrained("facebook/nllb-200-distilled-600M", src_lang="pes_Arab")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/nllb-200-distilled-600M")

inputs = tokenizer("بانک مرکزی ایران نرخ بهره را افزایش داد", return_tensors="pt")
# forced_bos_token_id makes the decoder start with the target language tag (English)
output = model.generate(**inputs, forced_bos_token_id=tokenizer.convert_tokens_to_ids("eng_Latn"))
print(tokenizer.decode(output[0], skip_special_tokens=True))
# "The Central Bank of Iran has raised interest rates."