Default repo clone branch to main for training notebooks and HF script. ad48770 anuragredbus committed 12 days ago
Set TASK_HORIZON to 15 days and align graders, UI, and training prompts. 99717c2 anuragredbus committed 12 days ago
add train_grpo_smoke notebook; quote pip versions in train_grpo b55c1ff anuragredbus committed 13 days ago