translategemma-4b-it-nb-nn
A fine-tuned Gemma 3 4B Instruct model for translating Norwegian Bokmål (nb) to Norwegian Nynorsk (nn), intended for deployment testing by the NB-ASR beta program.
Uploaded: 07-04-2026
The immediate purpose of this release is to support:
- reproducible beta evaluation,
- loading and inference validation in realistic environments,
- and packaging of a reviewed checkpoint for Hugging Face distribution.
Confidential beta release: this model card and the associated weights are intended for approved evaluators and collaborators. Treat the checkpoint as beta material rather than a public production release.
Model Description
This model was fine-tuned from google/gemma-3-4b-it on the NbAiLab/merged_npk_ndla_parallel_paragraphs dataset.
Intended Use
- Primary use: Translating Norwegian Bokmål text to Norwegian Nynorsk
- Language pair: nb → nn
Training Data
The model was trained on NbAiLab/merged_npk_ndla_parallel_paragraphs, a merged corpus of parallel Bokmål–Nynorsk paragraphs from NPK and NDLA.
Training Details
| Parameter | Value |
|---|---|
| Base model | google/gemma-3-4b-it |
| Epochs | 3 |
| Global steps | 46,728 |
| Precision | bfloat16 |
| Optimizer | AdamW (β1=0.9, β2=0.999, ε=1e-8) |
| Weight decay | — |
| Warmup ratio | 0.1 |
| Eval strategy | Every 2,000 steps |
| Dataloader workers | 4 |
| Train samples/sec | 103.05 |
| Train runtime | ~8.1 hours |
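For evaluators who want to approximate this setup, the table can be written down as keyword arguments named after fields of `transformers.TrainingArguments`. This is only a sketch: the name `training_kwargs` is hypothetical, the mapping from table rows to field names is an assumption, and the card does not state a learning rate, batch size, or weight decay, so those are left unset.

```python
# Hypothetical mapping of the Training Details table onto field names of
# transformers.TrainingArguments. Values come from the table; the choice of
# field for each row is an assumption, not the authors' published config.
training_kwargs = {
    "num_train_epochs": 3,          # Epochs
    "bf16": True,                   # Precision: bfloat16
    "adam_beta1": 0.9,              # AdamW β1
    "adam_beta2": 0.999,            # AdamW β2
    "adam_epsilon": 1e-8,           # AdamW ε
    "warmup_ratio": 0.1,            # Warmup ratio
    "eval_strategy": "steps",       # named "evaluation_strategy" in older transformers releases
    "eval_steps": 2000,             # Eval strategy: every 2,000 steps
    "dataloader_num_workers": 4,    # Dataloader workers
}

# Learning rate, batch size, and weight decay are not given in the card,
# so they are deliberately omitted rather than guessed.
for name, value in training_kwargs.items():
    print(f"{name} = {value}")
```

With `transformers` installed, these could be passed as `TrainingArguments(output_dir="...", **training_kwargs)`, where the output directory is the caller's choice.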
Evaluation Results
The model was evaluated on two test sets at the end of training (epoch 3):
NbAiLab Test Set (in-domain)
| Metric | Score |
|---|---|
| BLEU | 89.02 |
| chrF | 95.37 |
| Loss | 0.0750 |
Tatoeba nb→nn (out-of-domain)
| Metric | Score |
|---|---|
| BLEU | 72.20 |
| chrF | 85.68 |
| Loss | 0.4106 |
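To support reproducible beta evaluation, corpus-level BLEU and chrF can be computed from model outputs along the following lines. This is a minimal sketch that assumes the third-party `sacrebleu` package with its default settings; the card does not say which implementation or tokenization produced the scores above, so results may not match exactly. The function name `score_translations` is hypothetical.

```python
from typing import Dict, Sequence


def score_translations(hypotheses: Sequence[str], references: Sequence[str]) -> Dict[str, float]:
    """Return corpus-level BLEU and chrF for one reference per hypothesis."""
    if len(hypotheses) != len(references):
        raise ValueError("hypotheses and references must have the same length")
    if not hypotheses:
        raise ValueError("at least one hypothesis/reference pair is required")

    # Imported here so the input checks above work without the package.
    # Assumption: sacrebleu is installed (pip install sacrebleu).
    import sacrebleu

    hyps = list(hypotheses)
    refs = [list(references)]  # sacrebleu expects a list of reference streams
    bleu = sacrebleu.corpus_bleu(hyps, refs)
    chrf = sacrebleu.corpus_chrf(hyps, refs)
    return {"bleu": bleu.score, "chrf": chrf.score}
```

Here `hypotheses` would be the model's Nynorsk outputs and `references` the gold Nynorsk translations, in the same order.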
Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NbAiLab/translategemma-4b-it-nb-nn"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="bfloat16", device_map="auto")


def translate_nb_to_nn(text: str) -> str:
    prompt = (
        "<start_of_turn>user\n"
        "You are a professional Norwegian (no) to Norwegian Nynorsk (nn) translator. "
        "Your goal is to accurately convey the meaning and nuances of the original Norwegian text while "
        "adhering to Norwegian Nynorsk grammar, vocabulary, and cultural sensitivities. Produce only the "
        "Norwegian Nynorsk translation, without any additional explanations or commentary. Please translate "
        "the following Norwegian text into Norwegian Nynorsk:\n\n\n"
        f"{text}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=512)
    # Decode only the newly generated tokens (skip the prompt)
    generated = outputs[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(generated, skip_special_tokens=True).strip()


text = "Dette er en setning på bokmål som skal oversettes til nynorsk."
print(translate_nb_to_nn(text))
```
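Because the training corpus consists of parallel paragraphs and the example caps generation at 512 new tokens, longer documents are probably best translated one paragraph at a time. The helper below is a sketch under that assumption; `translate_paragraphs` is a hypothetical name, and it accepts any string-to-string function so that it can be tried without loading the model.

```python
import re
from typing import Callable


def translate_paragraphs(text: str, translate: Callable[[str], str]) -> str:
    """Translate `text` paragraph by paragraph and rejoin with blank lines."""
    # Paragraphs are taken to be separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    # Each paragraph is translated independently, so no context is shared.
    return "\n\n".join(translate(p) for p in paragraphs)
```

With the model loaded as in the usage example, this would be called as `translate_paragraphs(document, translate_nb_to_nn)`.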
License
This model is subject to the Gemma Terms of Use.