Yangyi Chen

YangyiYY

https://yangyi-chen.github.io/

AI & ML interests

Multimodal, Large Language Models

Recent Activity

new activity 10 days ago

MiniMaxAI/MiniMax-M2.5: Add SWE-Bench Pro evaluation result

Organizations

None yet

upvoted a collection 13 days ago

NVIDIA Nemotron v3

Collection

Open, Production-ready Enterprise Models • 15 items • Updated 8 days ago • 267

upvoted a collection 24 days ago

Nemotron-Cascade 2

Collection

Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation • 4 items • Updated 8 days ago • 48

upvoted a paper 25 days ago

Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation

Paper • 2603.19220 • Published 26 days ago • 66

upvoted a paper 3 months ago

GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

Paper • 2601.05242 • Published Jan 8 • 230

upvoted a paper 4 months ago

Nemotron-Cascade: Scaling Cascaded Reinforcement Learning for General-Purpose Reasoning Models

Paper • 2512.13607 • Published Dec 15, 2025 • 38

upvoted a collection 4 months ago

Nemotron-Cascade

Collection

Scaling Cascaded Reinforcement Learning for General-Purpose Reasoning Models • 14 items • Updated 8 days ago • 54

upvoted a paper 4 months ago

ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration

Paper • 2511.21689 • Published Nov 26, 2025 • 126

upvoted a paper 9 months ago

Perception-Aware Policy Optimization for Multimodal Reasoning

Paper • 2507.06448 • Published Jul 8, 2025 • 48

upvoted 2 papers 11 months ago

ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

Paper • 2505.24864 • Published May 30, 2025 • 146

RM-R1: Reward Modeling as Reasoning

Paper • 2505.02387 • Published May 5, 2025 • 81

upvoted a paper 12 months ago

CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training

Paper • 2504.13161 • Published Apr 17, 2025 • 97

upvoted an article about 1 year ago

Putting RL back in RLHF

Article • Jun 12, 2024 • 111

upvoted a paper about 1 year ago

Mobile-Agent-E: Self-Evolving Mobile Assistant for Complex Tasks

Paper • 2501.11733 • Published Jan 20, 2025 • 28

upvoted a paper over 1 year ago

OpenOmni: Large Language Models Pivot Zero-shot Omnimodal Alignment across Language with Real-time Self-Aware Emotional Speech Synthesis

Paper • 2501.04561 • Published Jan 8, 2025 • 16
