OLMo-2-7B FineTranslations + WIKI-FACT GRPO (Attention + MLP)

This model is based on allenai/OLMo-2-1124-7B.

It was first continually pretrained on FineTranslations and then GRPO-trained on WIKI-FACT, with the attention and MLP layers as the trainable targets.

Training recipe

  1. Base model: allenai/OLMo-2-1124-7B
  2. Continued pretraining on FineTranslations
  3. GRPO training on WIKI-FACT
  4. Trainable target modules: attention + MLP layers (see the sketch below)
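
The following is a hedged sketch of steps 3–4: GRPO training with LoRA adapters restricted to the attention and MLP projections. It assumes Hugging Face TRL and PEFT; the dataset path, reward function, and hyperparameters are illustrative placeholders, not the actual training configuration used for this model.

```python
# Illustrative sketch only: reward function, dataset path, and hyperparameters
# are placeholders, not the authors' actual training setup.
from datasets import load_dataset
from peft import LoraConfig
from trl import GRPOConfig, GRPOTrainer

# LoRA adapters on the attention + MLP projections
# (OLMo-2 uses Llama-style module names in transformers).
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention
        "gate_proj", "up_proj", "down_proj",     # MLP
    ],
    task_type="CAUSAL_LM",
)

def factuality_reward(completions, **kwargs):
    """Placeholder reward: score each completion for factual correctness."""
    return [0.0 for _ in completions]

trainer = GRPOTrainer(
    # In practice this would be the checkpoint after continued pretraining
    # on FineTranslations, not the raw base model.
    model="allenai/OLMo-2-1124-7B",
    reward_funcs=factuality_reward,
    args=GRPOConfig(output_dir="olmo2-7b-wikifact-grpo", num_generations=8),
    # The GRPO train set needs a "prompt" column; the file name is a placeholder.
    train_dataset=load_dataset("json", data_files="wiki_fact_prompts.jsonl", split="train"),
    peft_config=peft_config,
)
trainer.train()
```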

Notes

  • This repository contains the merged checkpoint (a loading example follows these notes).
  • Intended for research on multilingual factuality and cross-lingual consistency.
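
Because the adapters are already merged, the model loads like any other causal LM in transformers. The repository id below is a placeholder for this model's actual Hub name.

```python
# Quick-start sketch; replace the repo id with this repository's Hub name.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "your-org/olmo2-7b-finetranslations-wikifact-grpo"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # checkpoint is stored in BF16
    device_map="auto",
)

prompt = "Who wrote the novel 'Kallocain'?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```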