	---
	license: apache-2.0
	language:
	- en
	tags:
	- large-language-model
	- multi-agent-systems
	- reinforcement-learning
	- agentic-ai
	- code
	- math
	---

	# MetaAgent-X: Breaking the Ceiling of Automatic Multi-Agent Systems via End-to-End Reinforcement Learning

	[Paper 📑](https://arxiv.org/abs/2605.14212)

	[Codebase 🚗](https://github.com/pettingllms-ai/PettingLLMs)

	[Project Page 🏆](https://mercury7353.github.io/MetaAgent-X-Page/)

	## Overview

	MetaAgent-X is an end-to-end reinforcement learning framework for autonomous multi-agent systems.

	Unlike conventional automatic MAS methods that rely on frozen models, hand-crafted prompts, or search-based workflows, MetaAgent-X trains one shared model to both design a multi-agent system and execute it. The model learns to generate task-adaptive agent roles, collaboration structures, and execution strategies through reinforcement learning.
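The design-then-execute idea can be illustrated with a minimal sketch. Everything here is hypothetical and not taken from the MetaAgent-X codebase: the `Model` type, the `design_system` and `execute_system` helpers, the prompt strings, and the fixed sequential hand-off between agents are assumptions chosen for illustration. A deterministic `toy_model` stands in for the shared language model so the example runs without any weights.

```python
from typing import Callable, List

# Hypothetical stand-in for the shared language model: maps a prompt to text.
Model = Callable[[str], str]


def design_system(model: Model, task: str) -> List[str]:
    """Design phase: ask the model to propose agent roles, one per line."""
    reply = model(f"DESIGN: list agent roles for the task: {task}")
    return [line.strip() for line in reply.splitlines() if line.strip()]


def execute_system(model: Model, task: str, roles: List[str]) -> str:
    """Execution phase: the same model plays each role in turn.

    A simple sequential hand-off is assumed here; a learned system
    could use any collaboration structure.
    """
    context = task
    for role in roles:
        context = model(f"EXECUTE as {role}: {context}")
    return context


def toy_model(prompt: str) -> str:
    """Deterministic toy model so the sketch runs without real weights."""
    if prompt.startswith("DESIGN"):
        return "planner\ncoder\nreviewer"
    # Prompt shape: "EXECUTE as <role>: <context>"
    header, body = prompt.split(": ", 1)
    role = header[len("EXECUTE as "):]
    return f"{body} [{role}]"


roles = design_system(toy_model, "sum two numbers")
result = execute_system(toy_model, "sum two numbers", roles)
print(roles)
print(result)
```

The point of the sketch is that `design_system` and `execute_system` both call the same `model`, so a single set of weights is responsible for the system design and for every agent's output.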

	MetaAgent-X adapts across domains and achieves state-of-the-art performance on code benchmarks (LiveCodeBench, APPS, CodeContests) and math benchmarks (AIME24, AIME25, OlympiadBench).

	## Key Features

	- One model for both design and execution: the same model acts as both the MAS designer and the task executor.
	- End-to-end reinforcement learning: the model is optimized directly from downstream task outcomes.
	- Autonomous multi-agent system generation: the model learns to construct and execute agent swarms for complex reasoning tasks.
	- Cross-domain generalization: strong performance on both coding and mathematical reasoning benchmarks.
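As a rough illustration of optimizing "directly from downstream task outcomes", the sketch below shares one outcome reward across the design and execution steps of a rollout, then centers rewards across several rollouts of the same task. This is a generic outcome-reward pattern, not the training algorithm from the paper; the `Step` class, the binary pass/fail reward, and the mean-centered advantage are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Step:
    phase: str  # "design" or "execute" (hypothetical labels)
    text: str   # what the model generated at this step


def outcome_reward(passed: bool) -> float:
    """Binary reward from the final task outcome (an assumption)."""
    return 1.0 if passed else 0.0


def credit_steps(steps: List[Step], passed: bool) -> List[Tuple[str, float]]:
    """Give every step, design or execution, the same outcome reward."""
    reward = outcome_reward(passed)
    return [(step.phase, reward) for step in steps]


def centered_advantages(rewards: List[float]) -> List[float]:
    """Subtract the mean reward across rollouts of the same task."""
    mean = sum(rewards) / len(rewards)
    return [r - mean for r in rewards]


rollout = [Step("design", "planner, coder"), Step("execute", "def add(a, b): ...")]
print(credit_steps(rollout, passed=True))
print(centered_advantages([1.0, 0.0, 0.0, 1.0]))
```

Because the design step receives the same signal as the execution steps, a design that leads to failed executions is discouraged even though it is never scored on its own.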

	## Results

	The following table reports the performance of MetaAgent-X<sub>RL</sub>.
	Numbers in parentheses denote absolute gains over the single-agent baseline.

	| Domain | Benchmark | MetaAgent-X<sub>RL</sub> |
	|---|---:|---:|
	| Code | LiveCodeBench | 41.00 |
	| Code | APPS | 38.00 |
	| Code | CodeContests | 17.00 |
	| Math | AIME24 | 40.00 |
	| Math | AIME25 | 33.33 |
	| Math | OlympiadBench | 61.00 |
	| Overall | Average | 38.33 |
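For readers who want to recompute the aggregate, the snippet below takes the unweighted mean of the six per-benchmark scores as listed. The aggregation method is an assumption, since the card does not state how the average was computed. With the rounded values shown, the unweighted mean comes out near 38.39 rather than 38.33, so the listed figure may have been derived from unrounded scores or a different aggregation.

```python
# Per-benchmark scores copied from the results table.
scores = {
    "LiveCodeBench": 41.00,
    "APPS": 38.00,
    "CodeContests": 17.00,
    "AIME24": 40.00,
    "AIME25": 33.33,
    "OlympiadBench": 61.00,
}

# Unweighted mean over all six benchmarks (assumed aggregation).
mean_score = sum(scores.values()) / len(scores)
print(round(mean_score, 2))
```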

	## Citation
	```
	@misc{zhang2026metaagentxbreakingceiling,
	title={MetaAgent-X: Breaking the Ceiling of Automatic Multi-Agent Systems via End-to-End Reinforcement Learning},
	author={Yaolun Zhang and Yujie Zhao and Nan Wang and Yiran Wu and Jiayu Chang and Yizhao Chen and Qingyun Wu and Jishen Zhao and Huazheng Wang},
	year={2026},
	eprint={2605.14212},
	archivePrefix={arXiv},
	primaryClass={cs.AI},
	url={https://arxiv.org/abs/2605.14212},
	}
	```