OLMo-2-7B FineTranslation + WIKI-FACT GRPO (Attention + MLP)
This model is based on allenai/OLMo-2-1124-7B.
It was first continually pretrained on FineTranslations and then GRPO-trained on WIKI-FACT, with the attention and MLP layers as the trainable targets.
Training recipe
- Base model: allenai/OLMo-2-1124-7B
- Continued pretraining on FineTranslations
- GRPO training on WIKI-FACT
- Trainable target modules: attention + MLP layers (see the configuration sketch after this list)
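
The exact training script is not included in this card. The following is a minimal sketch of how the recipe above could look with TRL's GRPOTrainer, assuming LoRA adapters were attached to the attention and MLP projections and later merged (the card only states the target modules and that the checkpoint is merged). The reward function, dataset file, and hyperparameters are illustrative placeholders, not the values used for this model.

```python
# Sketch only: reward function, dataset path, and hyperparameters are placeholders.
from datasets import load_dataset
from peft import LoraConfig
from trl import GRPOConfig, GRPOTrainer

# LoRA on attention + MLP projections (OLMo-2 / Llama-style module names).
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",   # attention
        "gate_proj", "up_proj", "down_proj",      # MLP
    ],
    task_type="CAUSAL_LM",
)

def factuality_reward(prompts, completions, **kwargs):
    # Placeholder: the actual WIKI-FACT reward used for this model is not reproduced here.
    return [0.0 for _ in completions]

# Hypothetical prompt file; GRPOTrainer expects a "prompt" column.
train_dataset = load_dataset("json", data_files="wiki_fact_prompts.jsonl", split="train")

trainer = GRPOTrainer(
    model="allenai/OLMo-2-1124-7B",  # after continued pretraining on FineTranslations
    reward_funcs=factuality_reward,
    args=GRPOConfig(
        output_dir="olmo2-7b-wikifact-grpo",
        per_device_train_batch_size=4,
        num_generations=4,
    ),
    train_dataset=train_dataset,
    peft_config=peft_config,
)
trainer.train()
```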
Notes
- This repository contains the merged checkpoint; a minimal loading example is shown after this list.
- Intended for research on multilingual factuality and cross-lingual consistency.
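
Because the adapters are already merged, the model loads like any other causal LM with transformers. A minimal sketch, with the repository id below as a placeholder for this model's actual Hub id:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "your-org/olmo2-7b-finetranslation-wikifact-grpo"  # placeholder Hub id
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "When was the Eiffel Tower completed?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```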