ACO v11: Agent Cost Optimizer

A universal control layer that reduces autonomous agent cost while preserving task quality. Trained on real execution data from SPROUT (31K rows, 13 models) and SWE-Router (500 tasks, 8 models).

What It Does

ACO sits in front of any agent harness and makes cost-aware decisions:

  • Which model to use (tiny → frontier → specialist)
  • Whether to escalate based on output confidence
  • How much context to include
  • Whether to call tools
  • Whether to verify outputs
  • When to stop failing runs
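
The escalation decision above can be sketched as a minimal cascade: try a cheap tier first and escalate only when output confidence is low. This is an illustrative sketch, not ACO's actual API; the tier names, per-call prices, and the 0.70 threshold are assumptions for the example.

```python
# Hypothetical sketch of confidence-based escalation. Tier names, prices,
# and the threshold are illustrative, not ACO's real configuration.
TIERS = [
    ("tiny", 0.01),      # cheap first attempt
    ("frontier", 0.32),  # escalate here when confidence is low
]

def route_with_escalation(run_model, threshold=0.70):
    """Run tiers in order; stop at the first answer whose confidence clears the threshold."""
    total_cost = 0.0
    for name, cost in TIERS:
        answer, confidence = run_model(name)
        total_cost += cost
        if confidence >= threshold:
            return name, answer, total_cost
    # Every tier was uncertain; return the last (strongest) attempt anyway.
    return name, answer, total_cost
```

When the cheap tier is confident, the run pays only the cheap tier's price; the frontier price is incurred only on escalation.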

v11 Results (Real SWE-bench, 500 tasks × 8 models)

Policy             Success   Cost/Task   Cost Reduction
Oracle             87.0%     $0.06       80.3%
v11 + feedback     74.8%     $0.20       36.9%
v11 cascade        67.4%     $0.12       62.5%
Always frontier    78.2%     $0.32       baseline
v8 (synthetic)     65.8%     $0.35       -11.6%

v9 Results (Synthetic, 3K traces)

Policy             Success   Cost Reduction
v9 + feedback      90.0%     2.1%
v8 router          83.7%     8.5%
Always frontier    90.0%     baseline

Key Finding

Training data matters more than architecture. v8 trained on synthetic data increased cost by 11.6%. v10 trained on 500 real outcomes saved 23.3%. v11 with 31K SPROUT rows saves 36.9%. Same XGBoost architecture throughout.
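
The cost-reduction percentages are measured against the always-frontier baseline. The rounded per-task costs in the table reproduce the v11 cascade row exactly (1 − 0.12/0.32 = 62.5%); other rows differ slightly because the table costs are rounded.

```python
def cost_reduction(policy_cost, baseline_cost):
    """Percent saved relative to the always-frontier baseline."""
    return 100.0 * (1.0 - policy_cost / baseline_cost)

# v11 cascade vs. always-frontier ($0.12 vs. $0.32 per task)
print(round(cost_reduction(0.12, 0.32), 1))  # 62.5
```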

Quick Start

from aco.router_v10 import V10Router
from aco.per_step_router import PerStepRouter

# Task-level routing
v10 = V10Router(model_path="router_models/router_bundle_v11.pkl", success_threshold=0.70)
d = v10.route_cascade("Fix the auth bug in production")
print(f"Tier: {d.tier}, Model: {d.model}, Cost: ${d.cost_estimate:.2f}")

# Per-step routing
ps = PerStepRouter(max_budget=2.0)
d = ps.route_step("Search for the bug", step_num=1, task_risk="medium")
print(f"Step: {d.step_type.value}, Tier: {d.adjusted_tier}")

11 Modules

  1. Cost Telemetry Collector
  2. Task Cost Classifier
  3. Model Cascade Router (v11 XGBoost)
  4. Execution-Feedback Router (entropy cascade)
  5. Context Budgeter
  6. Cache-Aware Prompt Layout
  7. Tool-Use Cost Gate
  8. Verifier Budgeter
  9. Retry/Recovery Optimizer
  10. Meta-Tool Miner
  11. Doom Detector
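
As one example of how a module might work, the Doom Detector (module 11) corresponds to the "when to stop failing runs" decision. The sketch below is a hypothetical heuristic, not ACO's implementation: it stops a run when a sliding window of recent steps shows no measurable progress. The window size and progress signal are assumptions for illustration.

```python
# Hypothetical doom-detector sketch: halt a run when the last N steps
# show zero progress. Window size and progress signal are illustrative.
from collections import deque

class DoomDetector:
    def __init__(self, window=5):
        self.scores = deque(maxlen=window)

    def observe(self, progress_score):
        """Record a per-step progress signal (e.g. tests newly passing)."""
        self.scores.append(progress_score)

    def should_stop(self):
        """Doomed once the window is full and no step in it made progress."""
        return len(self.scores) == self.scores.maxlen and max(self.scores) == 0.0
```

A single productive step resets the verdict, since the window then contains a nonzero score.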

License

MIT

Generated by ML Intern

This model repository was generated by ML Intern, an agent for machine learning research and development on the Hugging Face Hub.

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = 'narcolepticchicken/agent-cost-optimizer'
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

For non-causal architectures, replace AutoModelForCausalLM with the appropriate AutoModel class.
