view article Article DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge Feb 7, 2025 • 287
view article Article A Guide to Reinforcement Learning Post-Training for LLMs: PPO, DPO, GRPO, and Beyond Jan 19 • 13