notebooks: add V4.1 GRPO notebook (parser fix, 600 steps, LR 5e-6, constant_with_warmup) d7a090d verified rtferraz commited on 25 days ago