anuragredbus
train_grpo: add TEST_ONLY mode to skip training and run eval+plots only
7db31d9