Zhenghao Xu

zhenghaoxu

AI & ML interests

None yet

Recent Activity

upvoted an article about 1 month ago

Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries

Article • Mar 10 • 124

commented on a paper about 2 months ago

VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training

Paper • 2602.10693 • Published Feb 11 • 220

updated 2 datasets about 2 months ago

zhenghaoxu/aime-beyond

Viewer • Updated Feb 22 • 100 • 3

zhenghaoxu/aime-amc23

Viewer • Updated Feb 22 • 40 • 52

published a dataset about 2 months ago

zhenghaoxu/aime-amc23

Viewer • Updated Feb 22 • 40 • 52

updated 4 datasets about 2 months ago

updated a dataset 2 months ago

zhenghaoxu/dapo-math-17k

Viewer • Updated Feb 13 • 17.4k • 48

published 6 datasets 2 months ago

zhenghaoxu/dapo-math-17k

Viewer • Updated Feb 13 • 17.4k • 48

zhenghaoxu/aime-beyond

Viewer • Updated Feb 22 • 100 • 3

zhenghaoxu/aime-2026

Viewer • Updated Feb 22 • 30 • 46

zhenghaoxu/aime-2025

Viewer • Updated Feb 22 • 30 • 48

zhenghaoxu/aime-2024

Viewer • Updated Feb 22 • 30 • 46

zhenghaoxu/math-aime-eval

Viewer • Updated Feb 22 • 230 • 7

upvoted 2 papers 2 months ago

Approximation of Log-Partition Function in Policy Mirror Descent Induces Implicit Regularization for LLM Post-Training

Paper • 2602.05933 • Published Feb 5 • 6

Good SFT Optimizes for SFT, Better SFT Prepares for Reinforcement Learning

Paper • 2602.01058 • Published Feb 1 • 43

liked 2 models 4 months ago

inclusionAI/LLaDA2.0-flash

Text Generation • 103B • Updated Dec 19, 2025 • 872 • 68

inclusionAI/LLaDA2.0-mini

Text Generation • 16B • Updated 2 days ago • 81.5k • 64
