Update README.md
README.md
CHANGED
|
@@ -1,4 +1,52 @@
|
|
| 1 |
-
|
| 2 |
-
|
| 3 |
-
|
| 4 |
-
|
| 1 |
+
---
|
| 2 |
+
license: apache-2.0
|
| 3 |
+
language:
|
| 4 |
+
- en
|
| 5 |
+
tags:
|
| 6 |
+
- large-language-model
|
| 7 |
+
- multi-agent-systems
|
| 8 |
+
- reinforcement-learning
|
| 9 |
+
- agentic-ai
|
| 10 |
+
- code
|
| 11 |
+
- math
|
| 12 |
+
---
|
| 13 |
+
|
| 14 |
+
# MetaAgent-X: Breaking the Ceiling of Automatic Multi-Agent Systems via End-to-End Reinforcement Learning
|
| 15 |
+
|
| 16 |
+
Paper: Coming Soon
|
| 17 |
+
|
| 18 |
+
[Codebase 🚗](https://github.com/pettingllms-ai/PettingLLMs)
|
| 19 |
+
|
| 20 |
+
## Overview
|
| 21 |
+
|
| 22 |
+
**MetaAgent-X** is an end-to-end reinforcement learning framework for autonomous multi-agent systems.
|
| 23 |
+
|
| 24 |
+
Unlike conventional automatic multi-agent system (MAS) methods that rely on frozen models, hand-crafted prompts, or search-based workflows, MetaAgent-X trains one shared model to both **design** a multi-agent system and **execute** it. Through reinforcement learning, the model learns to generate task-adaptive agent roles, collaboration structures, and execution strategies.
|
| 25 |
+
|
| 26 |
+
MetaAgent-X demonstrates strong cross-domain adaptation and achieves state-of-the-art performance across both **code** and **math** benchmarks.
|
| 27 |
+
|
| 28 |
+
## Key Features
|
| 29 |
+
|
| 30 |
+
- **One model for both design and execution**: the same model acts as both the MAS designer and the task executor.
|
| 31 |
+
- **End-to-end reinforcement learning**: the model is optimized directly from downstream task outcomes.
|
| 32 |
+
- **Autonomous multi-agent system generation**: the model learns to construct and execute agent swarms for complex reasoning tasks.
|
| 33 |
+
- **Cross-domain generalization**: strong performance on both coding and mathematical reasoning benchmarks.
|
| 34 |
+
|
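The design-then-execute idea described above can be illustrated with a minimal sketch. This is not the PettingLLMs API or the authors' implementation: `AgentSpec`, `design_mas`, `execute_mas`, `outcome_reward`, `rollout`, and `toy_model` are hypothetical names, the collaboration structure is assumed to be a simple sequential chain, and the reinforcement learning update itself is omitted. The sketch only shows one shared model being called for both phases, with a single reward computed from the final task outcome.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical stand-in for the shared model: any callable from prompt to text.
Model = Callable[[str], str]


@dataclass
class AgentSpec:
    role: str
    instruction: str


def design_mas(model: Model, task: str) -> List[AgentSpec]:
    """Design phase: the model proposes agents, one 'role: instruction' per line."""
    plan = model(f"DESIGN\n{task}")
    specs = []
    for line in plan.splitlines():
        role, sep, instruction = line.partition(":")
        if sep:  # skip lines that do not follow the assumed format
            specs.append(AgentSpec(role.strip(), instruction.strip()))
    return specs


def execute_mas(model: Model, task: str, specs: List[AgentSpec]) -> str:
    """Execution phase: the same model plays each role in turn (assumed chain)."""
    state = task
    for spec in specs:
        state = model(f"EXECUTE\n{spec.role}\n{spec.instruction}\n{state}")
    return state


def outcome_reward(answer: str, expected: str) -> float:
    """Outcome-only reward: 1.0 if the final answer matches, else 0.0."""
    return 1.0 if answer.strip() == expected.strip() else 0.0


def rollout(model: Model, task: str, expected: str) -> Tuple[List[AgentSpec], str, float]:
    """One episode: design, execute, then score the downstream outcome."""
    specs = design_mas(model, task)
    answer = execute_mas(model, task, specs)
    return specs, answer, outcome_reward(answer, expected)


def toy_model(prompt: str) -> str:
    """Deterministic stub so the sketch runs without a real LLM."""
    if prompt.startswith("DESIGN"):
        return "solver: compute the sum\nchecker: verify the result"
    lines = prompt.splitlines()
    role, state = lines[1], lines[-1]
    if role == "solver":
        return str(sum(int(tok) for tok in state.split("+")))
    return state  # the checker passes the answer through unchanged


specs, answer, reward = rollout(toy_model, "2+3", "5")
print([s.role for s in specs], answer, reward)
```

In an actual training setup, the reward from `rollout` would be fed back to update the shared model for both its design and execution outputs; how that credit is assigned is not specified in this README.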
| 35 |
+
## Results
|
| 36 |
+
|
| 37 |
+
The following table reports the performance of **MetaAgent-X<sub>RL</sub>**.
|
| 38 |
+
Numbers in parentheses denote absolute gains over the single-agent baseline.
|
| 39 |
+
|
| 40 |
+
| Domain | Benchmark | MetaAgent-X<sub>RL</sub> |
|
| 41 |
+
|---|---|---:|
|
| 42 |
+
| Code | LiveCodeBench | **41.00** |
|
| 43 |
+
| Code | APPS | **38.00** |
|
| 44 |
+
| Code | CodeContests | **17.00** |
|
| 45 |
+
| Math | AIME24 | **40.00** |
|
| 46 |
+
| Math | AIME25 | **33.33** |
|
| 47 |
+
| Math | OlympiadBench | **61.00** |
|
| 48 |
+
| Overall | Average | **38.33** |
|
| 49 |
+
|
| 50 |
+
## Citation
|
| 51 |
+
|
| 52 |
+
Coming soon.
|