---
license: apache-2.0
language:
- en
tags:
- large-language-model
- multi-agent-systems
- reinforcement-learning
- agentic-ai
- code
- math
---

# MetaAgent-X: Breaking the Ceiling of Automatic Multi-Agent Systems via End-to-End Reinforcement Learning

[Paper](https://arxiv.org/abs/2605.14212)

[Codebase](https://github.com/pettingllms-ai/PettingLLMs)

[Project Page](https://mercury7353.github.io/MetaAgent-X-Page/)

## Overview

**MetaAgent-X** is an end-to-end reinforcement learning framework for autonomous multi-agent systems.

Unlike conventional automatic multi-agent system (MAS) methods that rely on frozen models, hand-crafted prompts, or search-based workflows, MetaAgent-X trains one shared model to both **design** a MAS and **execute** it. The model learns to generate task-adaptive agent roles, collaboration structures, and execution strategies through reinforcement learning.

MetaAgent-X demonstrates strong cross-domain adaptation and achieves state-of-the-art performance on both **code** and **math** benchmarks.

## Key Features

- **One model for both design and execution**: the same model acts as both the MAS designer and the task executor.
- **End-to-end reinforcement learning**: the model is optimized directly from downstream task outcomes.
- **Autonomous multi-agent system generation**: the model learns to construct and execute agent swarms for complex reasoning tasks.
- **Cross-domain generalization**: strong performance on both coding and mathematical reasoning benchmarks.

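
The design-then-execute idea in the features above can be illustrated with a toy rollout. This is a minimal sketch under stated assumptions, not the MetaAgent-X training code: `Policy`, `Turn`, `rollout`, and `toy_policy` are hypothetical names, the policy is a hard-coded function standing in for a language model, and giving every turn the same outcome reward is one simple choice of credit assignment that the source does not confirm.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stand-in for the shared model: maps a prompt to generated text.
Policy = Callable[[str], str]


@dataclass
class Turn:
    stage: str           # "design" or "execute"
    prompt: str
    output: str
    reward: float = 0.0  # filled in once the task outcome is known


def rollout(policy: Policy, task: str, check: Callable[[str], bool]) -> List[Turn]:
    """Run one design-then-execute episode with a single shared policy."""
    turns: List[Turn] = []

    # Stage 1: the policy acts as the designer and proposes agent roles.
    design_prompt = f"Design agent roles for: {task}"
    design = policy(design_prompt)
    turns.append(Turn("design", design_prompt, design))
    roles = [r.strip() for r in design.split(",") if r.strip()]

    # Stage 2: the same policy plays each role, passing its output along.
    context = task
    for role in roles:
        exec_prompt = f"[{role}] {context}"
        context = policy(exec_prompt)
        turns.append(Turn("execute", exec_prompt, context))

    # Outcome reward (assumed binary): shared by design and execution turns.
    reward = 1.0 if check(context) else 0.0
    for turn in turns:
        turn.reward = reward
    return turns


def toy_policy(prompt: str) -> str:
    """Hard-coded toy behaviour; a real policy would be a language model."""
    if prompt.startswith("Design"):
        return "planner, solver"
    if prompt.startswith("[planner]"):
        return "plan: add the two numbers"
    return "4"


episode = rollout(toy_policy, "What is 2 + 2?", check=lambda ans: ans == "4")
for turn in episode:
    print(turn.stage, repr(turn.output), turn.reward)
```

Because the design turn and the execution turns come from the same policy and carry the same outcome signal, a policy-gradient style update on such an episode would adjust both behaviours together, which is the sense in which one model is trained for both roles.
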
## Results

The following table reports the performance of **MetaAgent-X<sub>RL</sub>** on each benchmark; the absolute gains over the single-agent baseline are not included here.

| Domain | Benchmark | MetaAgent-X<sub>RL</sub> |
|---|---|---:|
| Code | LiveCodeBench | **41.00** |
| Code | APPS | **38.00** |
| Code | CodeContests | **17.00** |
| Math | AIME24 | **40.00** |
| Math | AIME25 | **33.33** |
| Math | OlympiadBench | **61.00** |
| Overall | Average | **38.33** |

## Citation

```bibtex
@misc{zhang2026metaagentxbreakingceiling,
  title={MetaAgent-X: Breaking the Ceiling of Automatic Multi-Agent Systems via End-to-End Reinforcement Learning},
  author={Yaolun Zhang and Yujie Zhao and Nan Wang and Yiran Wu and Jiayu Chang and Yizhao Chen and Qingyun Wu and Jishen Zhao and Huazheng Wang},
  year={2026},
  eprint={2605.14212},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2605.14212},
}
```