# Flawed Fictions GRPO (olmo)

## Training Details
| Setting | Value |
| --- | --- |
| Base model | `allenai/Olmo-3-7B-Instruct` |
| Task | Continuity error detection (`\boxed{Yes}` / `\boxed{No}`) |
| W&B group | `grpo_flawed_fictions_olmo` |
| W&B runs | `tops83e4`, `f1qhcich` |
| Training script | `scripts/grpo_4gpu_olmo_train.sh` |
## Checkpoint Revisions

- Branch head (latest): `main`
- Per-checkpoint tags: `main-step-<N>`
## Usage

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "agurung/flawed-fictions-olmo-3-7b",
    revision="main",  # or a per-checkpoint tag such as main-step-<N>
    device_map="auto",
    torch_dtype="auto",
)
```
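Because the model is trained to answer with `\boxed{Yes}` or `\boxed{No}`, downstream code needs to extract that verdict from the generated text. A minimal sketch of such a parser (the helper name and the tie-breaking choice of taking the *last* boxed answer are illustrative assumptions, not part of the training setup):

```python
import re

def parse_boxed_verdict(text: str):
    """Extract the final \\boxed{Yes}/\\boxed{No} verdict from model output.

    Returns True for Yes, False for No, and None if no boxed verdict
    is present. (Helper name and behavior are illustrative assumptions.)
    """
    matches = re.findall(r"\\boxed\{(Yes|No)\}", text)
    if not matches:
        return None
    # Use the last occurrence, in case the model restates its answer.
    return matches[-1] == "Yes"

# Example on a hypothetical completion:
completion = r"The timeline contradicts chapter 2, so \boxed{Yes}."
print(parse_boxed_verdict(completion))  # → True
```

Keying on the boxed span rather than free-form "yes"/"no" avoids false positives when the reasoning text itself mentions those words.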
## Model tree for agurung/flawed-fictions-olmo-3-7b

- Base model: `allenai/Olmo-3-1025-7B`
  - Finetuned: `allenai/Olmo-3-7B-Instruct-SFT`
    - Finetuned: `allenai/Olmo-3-7B-Instruct-DPO`
      - Finetuned: `allenai/Olmo-3-7B-Instruct`