flatmate_rl / train_flatmate_rl_grpo_step2.ipynb
kushalExplores
Add step-2 GRPO notebook and hidden-flex fix
dbb1ce2 verified