olmo3-7b-grpo-weighted-mul-creativity-step6
Olmo3-7B trained with GRPO (Weighted Mul) on the Creativity dataset. Checkpoint: step 6.
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the step-6 checkpoint and its tokenizer from the Hugging Face Hub.
model_id = "Echoandland/olmo3-7b-grpo-weighted-mul-creativity-step6"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
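Once the model and tokenizer are loaded, text can be sampled with the standard `generate` method from `transformers`. The helper below is a minimal sketch, not part of this repository: the name `generate_text` and the sampling settings (`temperature`, `top_p`, `max_new_tokens`) are illustrative assumptions, and it passes a plain-text prompt rather than assuming the checkpoint ships a chat template.

```python
def generate_text(model, tokenizer, prompt, max_new_tokens=128):
    """Sample a continuation of `prompt` and return only the newly generated text.

    Hypothetical helper: the sampling values below are illustrative defaults,
    not settings recommended by the model authors.
    """
    # Tokenize the prompt and move the tensors to the model's device.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # Sample a continuation with nucleus sampling.
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=0.8,
        top_p=0.95,
    )

    # The output contains the prompt tokens first; keep only what follows them.
    prompt_length = inputs["input_ids"].shape[1]
    new_tokens = output_ids[0][prompt_length:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

With `model` and `tokenizer` loaded as shown above, a call such as `generate_text(model, tokenizer, "Write a short story about a lighthouse.")` would return the sampled continuation as a string. Loading a 7B model in full precision needs substantial memory, so passing a reduced-precision dtype or a device map to `from_pretrained` may be necessary depending on the hardware.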