agent RL - a ReneeA1 Collection

ReneeA1 's Collections

agent RL

updated Dec 10, 2025

Tool-integrated Reinforcement Learning for Repo Deep Search

Paper • 2508.03012 • Published Aug 5, 2025 • 20
Agent Lightning: Train ANY AI Agents with Reinforcement Learning

Paper • 2508.03680 • Published Aug 5, 2025 • 140
Harnessing Uncertainty: Entropy-Modulated Policy Gradients for Long-Horizon LLM Agents

Paper • 2509.09265 • Published Sep 11, 2025 • 47
A Survey of Reinforcement Learning for Large Reasoning Models

Paper • 2509.08827 • Published Sep 10, 2025 • 193
WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents

Paper • 2509.06501 • Published Sep 8, 2025 • 82
Reinforcement Learning Foundations for Deep Research Systems: A Survey

Paper • 2509.06733 • Published Sep 8, 2025 • 32
Beyond Correctness: Harmonizing Process and Outcome Rewards through RL Training

Paper • 2509.03403 • Published Sep 3, 2025 • 23
DCPO: Dynamic Clipping Policy Optimization

Paper • 2509.02333 • Published Sep 2, 2025 • 22
PVPO: Pre-Estimated Value-Based Policy Optimization for Agentic Reasoning

Paper • 2508.21104 • Published Aug 28, 2025 • 37
DeepCode: Open Agentic Coding

Paper • 2512.07921 • Published Dec 8, 2025 • 33