arxiv:2605.08520

FlashEvolve: Accelerating Agent Self-Evolution with Asynchronous Stage Orchestration

Published on May 8

Submitted by Zhen Wang on May 12

University of California at San Diego


Authors: Zhen Wang

Abstract

AI-generated summary: FlashEvolve enhances LLM-based evolution frameworks by implementing asynchronous execution and artifact version tracking to reduce computational bottlenecks while maintaining evolutionary quality.

LLM-based evolution has emerged as a promising way to improve agents by refining non-parametric artifacts, but its wall-clock cost remains a major bottleneck. We identify that this cost comes from synchronized stage execution and imbalance inside each LLM-heavy stage. We present FlashEvolve, an efficient framework that replaces synchronized execution with asynchronous workers and queues, allowing different stages and steps to overlap. To handle data staleness introduced by asynchrony, FlashEvolve tracks artifact versions and applies different policies to update, discard, or patch stale artifacts. Unlike weight-space staleness in asynchronous RL, language-space staleness is inspectable and repairable: a stale artifact is not just delayed work, but readable evidence that the LLM can reflect on, revise, and turn into useful evolution signal. FlashEvolve further improves throughput and token efficiency with speculative stage completion and adaptive workflow control. On GEPA workloads, FlashEvolve improves proposal throughput by 3.5× on local vLLM and 4.9× on API serving over synchronous GEPA. The same design also applies to ACE and Meta-Harness.
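The following is a minimal sketch of the idea described in the abstract, not the FlashEvolve implementation. It shows proposer workers that run asynchronously, hand artifacts to a committer through a queue, and have each artifact checked against the current version before it is accepted. Every name here (Policy, Artifact, choose_policy, resolve, proposer, committer, run_demo), the lag thresholds, and the reading of "update" as "accept as-is" are assumptions made for illustration; the patch step is a string placeholder standing in for an LLM revision call.

```python
import asyncio
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Policy(Enum):
    UPDATE = "update"    # assumed meaning: accept the artifact as-is
    PATCH = "patch"      # assumed meaning: revise the stale artifact, then accept
    DISCARD = "discard"  # assumed meaning: drop the stale artifact


@dataclass
class Artifact:
    text: str
    base_version: int  # version the proposal was derived from


def choose_policy(lag: int, max_patch_lag: int = 2) -> Policy:
    """Hypothetical rule: fresh work is accepted, mildly stale work is
    patched, and very stale work is dropped."""
    if lag <= 0:
        return Policy.UPDATE
    if lag <= max_patch_lag:
        return Policy.PATCH
    return Policy.DISCARD


def resolve(artifact: Artifact, current_version: int) -> Optional[Artifact]:
    """Apply the staleness policy; return the artifact to commit, or None."""
    policy = choose_policy(current_version - artifact.base_version)
    if policy is Policy.DISCARD:
        return None
    if policy is Policy.PATCH:
        # Placeholder for an LLM call that reflects on and revises stale text.
        patched = f"{artifact.text} [patched to v{current_version}]"
        return Artifact(patched, current_version)
    return artifact


async def proposer(name, state, out_q, delay, rounds):
    """Worker that proposes artifacts without waiting for other stages."""
    for i in range(rounds):
        base = state["version"]      # snapshot the version at start
        await asyncio.sleep(delay)   # stand-in for a slow LLM call
        await out_q.put(Artifact(f"{name}-{i}", base))


async def committer(state, in_q, log):
    """Worker that commits artifacts, handling stale ones by policy."""
    while True:
        artifact = await in_q.get()
        if artifact is None:         # sentinel: no more proposals
            break
        kept = resolve(artifact, state["version"])
        if kept is None:
            state["discarded"] += 1
        else:
            state["version"] += 1    # each commit advances the version
            log.append(kept.text)


async def run_demo():
    state = {"version": 0, "discarded": 0}
    queue = asyncio.Queue()
    log = []
    commit_task = asyncio.create_task(committer(state, queue, log))
    # A fast and a slow proposer overlap; the slow one tends to go stale.
    await asyncio.gather(
        proposer("fast", state, queue, 0.001, 6),
        proposer("slow", state, queue, 0.02, 2),
    )
    await queue.put(None)
    await commit_task
    return state, log
```

Calling asyncio.run(run_demo()) returns the final state and the list of committed artifact texts. Because the slow proposer snapshots its version before a long wait, its output usually arrives several versions behind and is either patched or discarded, which is the situation the staleness policies are meant to handle.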


Community

zhenwang9102 (paper author and submitter)

FlashEvolve makes LLM agent evolution (GEPA, ACE, Meta-Harness) 3.5–4.9× faster by replacing synchronous stage execution with asynchronous workers and queues. Its key insight is that unlike opaque weight-space staleness in async RL, language-space staleness is inspectable and repairable, so the LLM itself can reflectively patch stale prompts and code into useful evolution signal rather than discard them.


