anuragredbus
chore: align train_grpo.ipynb with smoke/syntax patterns for Colab
0587f05