- Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level
  Paper • 2411.03562 • Published • 69
- Training Language Models for Social Deduction with Multi-Agent Reinforcement Learning
  Paper • 2502.06060 • Published • 38
- MLGym: A New Framework and Benchmark for Advancing AI Research Agents
  Paper • 2502.14499 • Published • 195
- SurveyX: Academic Survey Automation via Large Language Models
  Paper • 2502.14776 • Published • 100
Collections
Collections including paper arxiv:2604.25917
- Recursive Multi-Agent Systems
  Paper • 2604.25917 • Published • 258
- Programming with Data: Test-Driven Data Engineering for Self-Improving LLMs from Raw Corpora
  Paper • 2604.24819 • Published • 86
- GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning
  Paper • 2604.02721 • Published • 627
- ClawBench: Can AI Agents Complete Everyday Online Tasks?
  Paper • 2604.08523 • Published • 261
- Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond
  Paper • 2604.22748 • Published • 224
- OpenGame: Open Agentic Coding for Games
  Paper • 2604.18394 • Published • 78
- Large Language Model Agent: A Survey on Methodology, Applications and Challenges
  Paper • 2503.21460 • Published • 83
- Large Language Model based Multi-Agents: A Survey of Progress and Challenges
  Paper • 2402.01680 • Published • 2