final-iteration/training/train_grpo_smoke.ipynb

Commit History

Default repo clone branch to main for training notebooks and HF script.
ad48770

anuragredbus committed on

Set TASK_HORIZON to 15 days and align graders, UI, and training prompts.
99717c2

anuragredbus committed on

add train_grpo_smoke notebook; quote pip versions in train_grpo
b55c1ff

anuragredbus committed on