NLLB-200-Distilled-600M - Latin to German Translation
This model is a fine-tuned version of the Meta NLLB-200-Distilled-600M model, specifically trained for Machine Translation (MT) from Latin (la) to German (de).
Model Details
| Metainfo | Value |
|---|---|
| Base Model | facebook/nllb-200-distilled-600M |
| Pipeline Tag | translation |
| Library | transformers |
| Source Language | Latin (la) |
| Target Language | German (de) |
| License | CC BY-NC 4.0 |
Intended Use
This model is primarily designed for direct translation from Latin into German. It is suitable for academic, historical, and general translation tasks.
Limitations and Bias
- The model was trained on a specific dataset, and its coverage and style influence the translation quality.
- Like all large language models, it may contain errors, hallucinations, and bias stemming from the training data. Given that Latin is a language with limited modern corpora, translating it may present specific challenges with rare words, complex sentence structures, or contextual ambiguity.
- No commercial use is permitted, as it is released under the CC BY-NC 4.0 (NonCommercial) license.
Training Code
You can find the training code on GitHub.
Training Data
A matching Latin-German dataset is also available on Hugging Face.
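Translation datasets on Hugging Face commonly store each example as a nested `translation` dict keyed by language code. As an illustrative sketch only (the helper name and field layout here are assumptions, not taken from the actual dataset), converting raw Latin-German sentence pairs into that schema might look like:

```python
def to_translation_records(pairs, src="la", tgt="de"):
    """Wrap raw (Latin, German) sentence pairs in the nested `translation`
    schema commonly used by Hugging Face translation datasets.
    Illustrative helper only; not part of this repository."""
    return [{"translation": {src: la, tgt: de}} for la, de in pairs]

pairs = [
    ("Veni, vidi, vici.", "Ich kam, sah und siegte."),
    ("Alea iacta est.", "Der Würfel ist gefallen."),
]
records = to_translation_records(pairs)
print(records[0]["translation"]["la"])  # Veni, vidi, vici.
```

Records in this shape can be loaded with the `datasets` library and fed directly into a standard seq2seq fine-tuning loop.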
📚 Citation / References
If you use this model in your research, please cite the underlying master's thesis as follows:
Master's thesis (Zenodo DOI):
Wenzel, M. (2025). Translatio ex Machina: Neuronale Maschinelle Übersetzung vom Lateinischen ins Deutsche [Zenodo]. Unpublished master's thesis, Fachhochschule Südwestfalen.
How to Use the Model
You can use this model directly with the Hugging Face transformers library.
```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "fhswf/nllb-600M-lat_Latn-to-deu_Latn"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# NLLB uses FLORES-200 language codes
SOURCE_LANGUAGE = "lat_Latn"
TARGET_LANGUAGE = "deu_Latn"

# Latin input text
latin_text = "Veni, vidi, vici."
tokenizer.src_lang = SOURCE_LANGUAGE
inputs = tokenizer(latin_text, return_tensors="pt")

# Generate the translation; forced_bos_token_id makes the model
# decode into the target language
translated_tokens = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids(TARGET_LANGUAGE),
    max_length=128,
    num_beams=4,
    early_stopping=True,
)

# Decode the result
translation = tokenizer.batch_decode(translated_tokens, skip_special_tokens=True)[0]
print(f"Latin: {latin_text}")
print(f"German: {translation}")
# Expected output: German: Ich kam, sah und siegte.
```
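The `num_beams=4` argument above enables beam search decoding, which keeps several partial hypotheses alive instead of greedily committing to the most likely next token. As a toy illustration of the idea (a minimal sketch, not the actual `transformers` implementation; the tiny hand-made "language model" below is invented for demonstration):

```python
import math

def toy_beam_search(next_log_probs, num_steps, num_beams=4):
    """Toy beam search: keep the `num_beams` best partial hypotheses per step.

    `next_log_probs(prefix)` stands in for the model: it maps a token
    prefix to a dict of {next_token: log_probability}.
    """
    beams = [([], 0.0)]  # (token sequence, cumulative log-probability)
    for _ in range(num_steps):
        candidates = [
            (seq + [tok], score + lp)
            for seq, score in beams
            for tok, lp in next_log_probs(seq).items()
        ]
        # Keep only the num_beams highest-scoring partial hypotheses
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:num_beams]
    return beams[0][0]

# A tiny invented "language model" where greedy decoding is suboptimal:
# picking the locally best first token ("a") leads to a worse full sequence.
def lm(prefix):
    table = {
        (): {"a": 0.6, "b": 0.4},
        ("a",): {"x": 0.55, "y": 0.45},
        ("b",): {"x": 0.95, "y": 0.05},
    }
    return {t: math.log(p) for t, p in table[tuple(prefix)].items()}

print(toy_beam_search(lm, num_steps=2, num_beams=1))  # ['a', 'x'] (greedy)
print(toy_beam_search(lm, num_steps=2, num_beams=4))  # ['b', 'x'] (better total score)
```

With `num_beams=1` the search degenerates to greedy decoding; wider beams trade compute for a better chance of finding the globally highest-scoring translation.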