ACO: Agent Cost Optimizer
A universal control layer that reduces autonomous agent cost while preserving task quality.
Quick Results (SWE-bench, 500 coding tasks, 8 real models)
| Policy | Success Rate | Cost/Task | Cost Reduction |
|---|---|---|---|
| Oracle | 87.0% | $0.062 | 80.3% |
| v10+feedback | 84.8% | $0.201 | 36.4% |
| v10 direct | 76.6% | $0.188 | 40.7% |
| Always frontier | 78.2% | $0.317 | baseline |
| Always cheap | 63.2% | $0.014 | 95.5% |
Key finding: v10+feedback strictly dominates always-frontier, with lower cost ($0.201 vs. $0.317 per task) and higher success (84.8% vs. 78.2%). This is not a cost-quality tradeoff.
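The dominance claim can be checked directly from the table. The snippet below is a minimal sketch, not code from the repository: `PolicyResult`, `strictly_dominates`, and `undominated` are hypothetical names, and excluding Oracle assumes it is an upper bound rather than a deployable policy.

```python
from typing import NamedTuple


class PolicyResult(NamedTuple):
    name: str
    success: float  # fraction of tasks solved
    cost: float     # mean USD per task


# Figures copied from the Quick Results table.
RESULTS = [
    PolicyResult("Oracle", 0.870, 0.062),
    PolicyResult("v10+feedback", 0.848, 0.201),
    PolicyResult("v10 direct", 0.766, 0.188),
    PolicyResult("Always frontier", 0.782, 0.317),
    PolicyResult("Always cheap", 0.632, 0.014),
]


def strictly_dominates(a, b):
    """True if policy a is both cheaper and more successful than policy b."""
    return a.success > b.success and a.cost < b.cost


def undominated(results):
    """Policies that no other policy in the list strictly dominates."""
    return [r for r in results
            if not any(strictly_dominates(o, r) for o in results)]


by_name = {r.name: r for r in RESULTS}
# Assumption: Oracle is an upper bound, so it is left out of the comparison.
deployable = [r for r in RESULTS if r.name != "Oracle"]

print(strictly_dominates(by_name["v10+feedback"], by_name["Always frontier"]))
print([r.name for r in undominated(deployable)])
```

Under these definitions, always-frontier is the only non-oracle policy that drops out. v10 direct is cheaper than always-frontier but has lower success, so neither dominates the other.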
BERT Router Results
DistilBERT was fine-tuned on SPROUT for binary classification. The binary classifier fails for tier routing: it ignores tier prefixes and predicts P(success) ≈ 89.5% for all tiers, routing everything to the cheapest model.
A 5-class retraining is in progress (job 69fd8cccaff1cd33e8f30714).
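To see why a tier-blind P(success) collapses routing, consider a threshold rule that picks the cheapest tier whose predicted success clears a bar. This is an illustrative sketch only: the tier names, the 0.8 threshold, and the `route` function are assumptions, not the repository's routing rule, and the "healthy" predictions are made-up values for contrast.

```python
# Hypothetical tier names, ordered cheapest first.
TIERS = ["cheap", "mid", "frontier"]


def route(p_success_by_tier, threshold=0.8):
    """Pick the cheapest tier whose predicted P(success) clears the threshold."""
    for tier in TIERS:
        if p_success_by_tier[tier] >= threshold:
            return tier
    return TIERS[-1]  # nothing clears the bar: fall back to the strongest tier


# Collapsed classifier: the same prediction regardless of tier prefix.
collapsed = {tier: 0.895 for tier in TIERS}

# Made-up predictions from a classifier that does distinguish tiers.
healthy = {"cheap": 0.41, "mid": 0.72, "frontier": 0.93}

print(route(collapsed))
print(route(healthy))
```

With identical predictions for every tier, the first (cheapest) tier always wins whenever the threshold is at or below 0.895, which matches the failure described above.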
11 Modules
- Cost Telemetry Collector: aco/telemetry.py
- Task Cost Classifier: aco/classifier.py
- Model Cascade Router (XGBoost + isotonic): aco/router_v10.py
- Execution-Feedback Router (entropy cascade): aco/execution_feedback.py
- Context Budgeter: aco/context_budgeter.py
- Cache-Aware Prompt Layout: aco/cache_layout.py
- Tool-Use Cost Gate: aco/tool_gate.py
- Verifier Budgeter: aco/verifier_budgeter.py
- Retry/Recovery Optimizer: aco/retry_optimizer.py
- Meta-Tool Miner: aco/meta_tool_miner.py
- Doom Detector: aco/doom_detector.py
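As a rough illustration of what an entropy cascade with execution feedback could look like, the sketch below tries tiers from cheapest to strongest and stops at the first attempt that both passes execution and is confident. It is not the code in aco/execution_feedback.py: the `run_cascade` signature, the `(passed, probs)` return shape, the 0.5-bit entropy cutoff, and the stub runners are all assumptions made for the example.

```python
import math


def entropy_bits(probs):
    """Shannon entropy (in bits) of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def run_cascade(task, tiers, max_entropy=0.5):
    """Try tiers from cheapest to strongest; stop at the first confident pass.

    tiers: list of (name, cost_usd, run), where run(task) returns
    (passed, probs). All of these names are hypothetical.
    Returns (name of the last tier tried, total USD spent).
    """
    if not tiers:
        raise ValueError("need at least one tier")
    spent = 0.0
    for name, cost_usd, run in tiers:
        passed, probs = run(task)
        spent += cost_usd  # every attempt is paid for, including failed ones
        if passed and entropy_bits(probs) <= max_entropy:
            break
    return name, spent


# Stub runners standing in for real model calls.
def cheap_run(task):
    return False, [0.5, 0.5]   # execution failed, maximally uncertain


def frontier_run(task):
    return True, [0.95, 0.05]  # execution passed, confident


tiers = [("cheap", 0.014, cheap_run), ("frontier", 0.317, frontier_run)]
name, spent = run_cascade("fix failing test", tiers)
print(name, round(spent, 3))
```

An escalated task pays for both attempts, so a cascade like this only saves money when most tasks stop at an early tier.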
New Modules (this session)
- Conformal Calibration: aco/conformal.py (RouteNLP-style distribution-free escalation guarantees)
- Pareto Frontier: aco/pareto.py (RouterBench NDCH + RouteLLM CPT/APGR metrics)
- Integration Test: tests/test_integration.py (full pipeline test)
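A distribution-free escalation guarantee can be sketched with generic split conformal calibration. The example below is not the implementation in aco/conformal.py. It assumes a held-out set of tasks on which the cheap model failed, each with the router's predicted P(success), and the function names are hypothetical. Under exchangeability, a new task that the cheap model would fail is escalated with probability at least 1 - alpha.

```python
import math


def conformal_threshold(failure_scores, alpha=0.1):
    """Split-conformal upper quantile of P(success) over known failures.

    failure_scores: predicted P(success) for calibration tasks that the
    cheap model actually failed (an assumed calibration set).
    """
    n = len(failure_scores)
    k = math.ceil((n + 1) * (1 - alpha))  # rank of the conformal quantile
    if k > n:
        # Too few calibration points for this alpha: escalate everything.
        return float("inf")
    return sorted(failure_scores)[k - 1]


def should_escalate(p_success, threshold):
    """Escalate whenever the score looks like a calibration failure."""
    return p_success <= threshold


# Made-up calibration scores for illustration.
scores = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50]
threshold = conformal_threshold(scores, alpha=0.2)

print(threshold)
print(should_escalate(0.30, threshold), should_escalate(0.80, threshold))
```

The guarantee covers only the miss rate on tasks that need escalation. It says nothing about how many solvable tasks get escalated unnecessarily, which is what determines cost.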
Key Takeaway
Training on real execution data matters more than architecture. v8, trained on synthetic data, increased cost by 11.6%; v10, trained on 500 real SWE-Router outcomes, cut cost by 36.4%. Both use XGBoost with the same features.
Documentation
- Final Report
- Pareto Frontier Report
- Conformal Calibration Report
- BERT Eval Report
- Literature Review
- Deployment Guide
- Technical Blog
- Roadmap
Links
- Model: narcolepticchicken/agent-cost-optimizer
- Dataset: narcolepticchicken/agent-cost-traces
- Dashboard: narcolepticchicken/aco-dashboard