arxiv:2605.19330

MOCHA: Multi-Objective Chebyshev Annealing for Agent Skill Optimization

Published on May 19 · Submitted by Mehrab Tanjim on May 21

Abstract

LLM agents organize behavior through skills: structured natural-language specifications governing how an agent reasons, retrieves, and responds. Unlike monolithic prompts, skills are multi-field artifacts subject to hard platform constraints: description fields are truncated for routing, instruction bodies are compacted via progressive disclosure, and co-resident skills compete for limited context windows. These constraints make skill optimization inherently multi-objective: a skill must simultaneously maximize task performance and satisfy platform limits. Yet existing prompt optimizers either ignore these trade-offs or collapse them into a weighted sum, missing Pareto-optimal variants in non-convex objective regions. We introduce MOCHA (Multi-Objective Chebyshev Annealing), which replaces single-objective selection with Chebyshev scalarization, covering the full Pareto front including non-convex regions, combined with exponential annealing that transitions from exploration to exploitation. In our experiments across six diverse agent skills, where all methods share the same multi-objective mutation operator and baselines receive identical per-objective textual feedback, existing optimizers fail to improve the seed skill on 4 of 6 tasks: 1000 rollouts yield zero progress. MOCHA breaks through on every task, achieving 7.5% relative improvement in mean correctness over the strongest baseline (up to 14.9% on FEVER and 10.4% on TheoremQA) while discovering twice as many Pareto-optimal skill variants.
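To make the weighted-sum limitation concrete, here is a minimal sketch, not taken from the paper. It assumes two objectives that are both maximized, an ideal point of (1, 1), and three made-up candidate skills; the names `candidates`, `IDEAL`, and `sweep` are hypothetical. Candidate B is Pareto-optimal but sits in a non-convex region, so no weighted sum ever selects it, while the textbook weighted Chebyshev distance does.

```python
# Illustrative only: three hypothetical skill variants scored on two
# objectives to maximize, e.g. (correctness, constraint satisfaction).
candidates = {
    "A": (1.0, 0.0),
    "B": (0.4, 0.4),  # Pareto-optimal, but inside a non-convex region
    "C": (0.0, 1.0),
}
IDEAL = (1.0, 1.0)  # assumed ideal (utopia) point


def weighted_sum(scores, weights):
    # Linear scalarization: higher is better.
    return sum(w * s for w, s in zip(weights, scores))


def chebyshev(scores, weights, ideal=IDEAL):
    # Weighted Chebyshev distance to the ideal point: lower is better.
    return max(w * (z - s) for w, s, z in zip(weights, scores, ideal))


def best_by_weighted_sum(weights):
    return max(candidates, key=lambda k: weighted_sum(candidates[k], weights))


def best_by_chebyshev(weights):
    return min(candidates, key=lambda k: chebyshev(candidates[k], weights))


def sweep(select, steps=11):
    # Collect every candidate selected for some weight vector (w, 1 - w).
    winners = set()
    for i in range(steps):
        w = i / (steps - 1)
        winners.add(select((w, 1 - w)))
    return winners


print(sorted(sweep(best_by_weighted_sum)))  # B never wins under a weighted sum
print(sorted(sweep(best_by_chebyshev)))     # B wins for balanced weights
```

In this toy setup the weighted sum only ever picks the two extreme candidates, whereas sweeping the Chebyshev weights reaches all three. How MOCHA chooses its weight vectors and reference point is not described in the abstract, so those details here are assumptions.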

Community

Skill optimization is inherently multi-objective: a skill must maximize task correctness and satisfy hard platform limits (truncated descriptions, compacted instruction bodies, finite shared context). Prior prompt optimizers either ignore these trade-offs or collapse them into a single scalar, missing Pareto-optimal variants in non-convex regions. MOCHA replaces single-objective selection with Chebyshev scalarization — provably covering the full Pareto front — combined with exponential annealing that transitions from exploration to exploitation as the rollout budget is consumed. Across six diverse skills, MOCHA beats the strongest baseline by 7.5% on average (up to +14.9%) and finds 2× more Pareto-optimal variants, while existing optimizers plateau at the seed on 4 of 6 tasks.
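One way to picture an exploration-to-exploitation transition tied to the rollout budget is an exponentially decaying temperature that controls how greedily candidates are selected. The sketch below is an assumption-laden illustration, not the paper's schedule: the source does not say which quantity MOCHA anneals, and `temperature`, `selection_probs`, `t_start`, `t_end`, and the cost values are all hypothetical.

```python
import math


def temperature(used, budget, t_start=1.0, t_end=0.01):
    # Exponential (geometric) decay from t_start to t_end as rollouts are used.
    frac = min(max(used / budget, 0.0), 1.0)
    return t_start * (t_end / t_start) ** frac


def selection_probs(costs, temp):
    # Boltzmann weights over scalarized costs (lower cost is better).
    # High temperature gives near-uniform choice; low temperature is greedy.
    best = min(costs)
    weights = [math.exp(-(c - best) / temp) for c in costs]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical scalarized costs for three candidate skills.
costs = [0.30, 0.40, 0.50]
budget = 1000  # illustrative rollout budget

early = selection_probs(costs, temperature(0, budget))
late = selection_probs(costs, temperature(budget, budget))
print([round(p, 3) for p in early])  # close to uniform: exploration
print([round(p, 3) for p in late])   # mass on the best candidate: exploitation
```

With these assumed settings, the best candidate is chosen a little over a third of the time at the start and almost always once the budget is spent.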
[Figure: teaser_non_convex]

[Figure: fig_evolution]


