feat(training): Phase C6 - ablations, training curves, readme finalization e46f00b Prasham.Jain Claude Sonnet 4.6 committed 13 days ago
feat(training): Phase C5 - evaluation harness, baselines, plots, readme table 93e68bc Prasham.Jain Claude Sonnet 4.6 committed 13 days ago
feat(training): Phase C4 - GRPO training, SFT warmstart, rollout, custom trainer 3e29c8b Prasham.Jain Claude Sonnet 4.6 committed 13 days ago
feat(training): Phase C3 - SFT trajectory generator, env clients, mock env 5ae5581 Prasham.Jain Claude Sonnet 4.6 committed 13 days ago