---
license: cc-by-nc-4.0
language:
- la
base_model:
- answerdotai/ModernBERT-base
pipeline_tag: fill-mask
tags:
- Latin
---

LAMB (**LA**tin **M**odern**B**ERT) is a Latin encoder-only model based on the ModernBERT architecture, pre-trained on nearly **24B Latin tokens**, and ready for use with any Latin orthography.

### Features

### Usage

#### Predicting Masked Tokens

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM
import torch

model_id = "aimgo/LAMB"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

text = "et ecce tu eras [MASK] me et ego foris, et ibi te quaerebam"
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Locate the [MASK] position and take the highest-scoring vocabulary id there.
masked_index = inputs["input_ids"][0].tolist().index(tokenizer.mask_token_id)
predicted_token_id = outputs.logits[0, masked_index].argmax(dim=-1)
predicted_token = tokenizer.decode(predicted_token_id)

print("Input:", text)
print("Predicted:", predicted_token)
```

If you use this model in your work, please cite:

```
@misc{mccarthy2025LAMB,
  author       = {McCarthy, A. M.},
  title        = {{LAMB}: A Modern Masked Language Model for Latin},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/aimgo/LAMB}},
  note         = {Model}
}
```
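The mask-index lookup and argmax step used in the snippet above can be illustrated without downloading the model. The sketch below uses a toy logits tensor and hypothetical token ids (`MASK_ID = 103` and the other ids are placeholders, not LAMB's actual vocabulary):

```python
import torch

# Hypothetical token ids; 103 stands in for tokenizer.mask_token_id.
MASK_ID = 103
input_ids = torch.tensor([[101, 7592, MASK_ID, 2088, 102]])

# Find the mask position (equivalent to .tolist().index(...) in the usage example).
masked_index = (input_ids[0] == MASK_ID).nonzero(as_tuple=True)[0].item()

# Toy logits: batch of 1, 5 positions, vocab of 10; pretend the model scored them.
logits = torch.zeros(1, 5, 10)
logits[0, masked_index, 7] = 5.0  # make token id 7 the top prediction at the mask

predicted_token_id = logits[0, masked_index].argmax(dim=-1).item()
print(masked_index, predicted_token_id)  # → 2 7
```

With a real checkpoint, `logits` comes from `model(**inputs).logits` and `predicted_token_id` is passed to `tokenizer.decode` as shown in the usage example.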