ACO v11: Agent Cost Optimizer
A universal control layer that reduces the cost of autonomous agents while preserving task quality. Trained on real execution data from SPROUT (31K rows, 13 models) and SWE-Router (500 tasks, 8 models).
What It Does
ACO sits in front of any agent harness and makes cost-aware decisions:
- Which model to use (tiny → frontier → specialist)
- Whether to escalate based on output confidence
- How much context to include
- Whether to call tools
- Whether to verify outputs
- When to stop failing runs
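
The first two decisions above can be pictured as a cheapest-first cascade: pick the lowest-cost tier whose predicted success probability clears a threshold, and fall back to the most capable tier otherwise. The sketch below is illustrative only. The tier names, costs, and the `pick_tier` helper are hypothetical and are not ACO's API; the 0.70 default simply echoes the `success_threshold` value used in the Quick Start.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    cost_per_task: float  # hypothetical dollar cost, for illustration only


# Hypothetical tiers, ordered from cheapest to most expensive.
TIERS = [Tier("tiny", 0.01), Tier("mid", 0.08), Tier("frontier", 0.30)]


def pick_tier(success_prob: dict[str, float], threshold: float = 0.70) -> Tier:
    """Return the cheapest tier whose predicted success clears the threshold."""
    for tier in TIERS:
        if success_prob.get(tier.name, 0.0) >= threshold:
            return tier
    # No tier is confident enough: fall back to the most capable one.
    return TIERS[-1]


choice = pick_tier({"tiny": 0.41, "mid": 0.76, "frontier": 0.93})
print(choice.name, choice.cost_per_task)
```

In a real router the probabilities would come from a trained predictor rather than a hand-written dictionary.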
v11 Results (Real SWE-bench, 500 tasks × 8 models)
| Policy | Success | Cost/Task | Cost Reduction |
|---|---|---|---|
| Oracle | 87.0% | $0.06 | 80.3% |
| v11 + feedback | 74.8% | $0.20 | 36.9% |
| v11 cascade | 67.4% | $0.12 | 62.5% |
| Always frontier | 78.2% | $0.32 | baseline |
| v8 (synthetic) | 65.8% | $0.35 | -11.6% |
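
As a reading aid for the cost-reduction column, the sketch below assumes it is the relative saving against the always-frontier cost per task. That definition is an assumption; the card does not state it. Recomputing from the rounded dollar figures gives values close to the table (62.5% for v11 cascade), though not always identical, presumably because the published percentages were derived from unrounded costs.

```python
def cost_reduction(cost_per_task: float, baseline_cost: float) -> float:
    """Fractional saving relative to the baseline (negative means more expensive)."""
    if baseline_cost <= 0:
        raise ValueError("baseline_cost must be positive")
    return 1.0 - cost_per_task / baseline_cost


BASELINE = 0.32  # "Always frontier" cost per task, as rounded in the table

# Costs per task as rounded in the table above.
for policy, cost in [("v11 cascade", 0.12), ("v11 + feedback", 0.20), ("v8 (synthetic)", 0.35)]:
    print(f"{policy}: {cost_reduction(cost, BASELINE):.1%}")
```

A negative result, as for v8, means the policy spent more per task than always calling the frontier model.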
v9 Results (Synthetic, 3K traces)
| Policy | Success | Cost Reduction |
|---|---|---|
| v9 + feedback | 90.0% | 2.1% |
| v8 router | 83.7% | 8.5% |
| Always frontier | 90.0% | baseline |
Key Finding
Training data matters more than architecture. All three versions use the same XGBoost architecture, yet v8, trained on synthetic data, increased cost by 11.6%; v10, trained on 500 real outcomes, saved 23.3%; and v11, with 31K SPROUT rows, saves 36.9%.
Quick Start
from aco.router_v10 import V10Router
from aco.per_step_router import PerStepRouter
# Task-level routing
v10 = V10Router(model_path="router_models/router_bundle_v11.pkl", success_threshold=0.70)
d = v10.route_cascade("Fix the auth bug in production")
print(f"Tier: {d.tier}, Model: {d.model}, Cost: ${d.cost_estimate:.2f}")
# Per-step routing
ps = PerStepRouter(max_budget=2.0)
d = ps.route_step("Search for the bug", step_num=1, task_risk="medium")
print(f"Step: {d.step_type.value}, Tier: {d.adjusted_tier}")
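
The `max_budget=2.0` argument above suggests a per-task spending cap. How `PerStepRouter` enforces it is not documented here, so the following is only a minimal sketch of one way such a cap could work. `StepBudget` and its methods are hypothetical names, not part of the `aco` package.

```python
class StepBudget:
    """Hypothetical helper: tracks spend against a fixed per-task budget."""

    def __init__(self, max_budget: float) -> None:
        self.max_budget = max_budget
        self.spent = 0.0

    def can_afford(self, estimated_cost: float) -> bool:
        """True if the next step fits within the remaining budget."""
        return self.spent + estimated_cost <= self.max_budget

    def charge(self, actual_cost: float) -> None:
        """Record the cost of a completed step."""
        self.spent += actual_cost


budget = StepBudget(max_budget=2.0)
for step_cost in [0.75, 0.75, 0.75]:
    if not budget.can_afford(step_cost):
        print("budget exhausted, stopping")
        break
    budget.charge(step_cost)
print(f"spent: ${budget.spent:.2f}")
```

Checking affordability before each step, rather than after, keeps the total from overshooting the cap.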
11 Modules
- Cost Telemetry Collector
- Task Cost Classifier
- Model Cascade Router (v11 XGBoost)
- Execution-Feedback Router (entropy cascade)
- Context Budgeter
- Cache-Aware Prompt Layout
- Tool-Use Cost Gate
- Verifier Budgeter
- Retry/Recovery Optimizer
- Meta-Tool Miner
- Doom Detector
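
The "entropy cascade" named for the Execution-Feedback Router is not specified further in this card. One plausible reading, sketched here under that assumption, is to measure the Shannon entropy of the cheaper model's output distribution and escalate when it is high. The function names and the 1.0-nat threshold are hypothetical.

```python
import math


def shannon_entropy(probs: list[float]) -> float:
    """Entropy in nats of a discrete distribution; zero-probability terms are skipped."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def should_escalate(probs: list[float], max_entropy: float = 1.0) -> bool:
    """Escalate to a stronger model when the output distribution is too uncertain."""
    return shannon_entropy(probs) > max_entropy


confident = [0.97, 0.01, 0.01, 0.01]  # mass concentrated on one option
uncertain = [0.25, 0.25, 0.25, 0.25]  # uniform over four options

print(should_escalate(confident), should_escalate(uncertain))
```

The threshold would need tuning against real outcomes; a uniform distribution over four options has entropy ln 4, which is why it trips the 1.0-nat limit here.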
Links
- Model: narcolepticchicken/agent-cost-optimizer
- Dataset: narcolepticchicken/agent-cost-traces
- Dashboard: narcolepticchicken/aco-dashboard
- Blog Post: docs/technical_blog.md
- Final Report: docs/final_report.md
License
MIT
Generated by ML Intern
This model repository was generated by ML Intern, an agent for machine learning research and development on the Hugging Face Hub.
- Try ML Intern: https://smolagents-ml-intern.hf.space
- Source code: https://github.com/huggingface/ml-intern
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
model_id = 'narcolepticchicken/agent-cost-optimizer'
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
For non-causal architectures, replace AutoModelForCausalLM with the appropriate AutoModel class.