# ACO: Agent Cost Optimizer A universal control layer that reduces autonomous agent cost while preserving task quality. ## Quick Results (SWE-bench, 500 coding tasks, 8 real models) | Policy | Success | Cost/Task | CostRed | |--------|---------|-----------|---------| | Oracle | 87.0% | $0.062 | 80.3% | | **v10+feedback** | **84.8%** | **$0.201** | **36.4%** | | v10 direct | 76.6% | $0.188 | 40.7% | | Always frontier | 78.2% | $0.317 | baseline | | Always cheap | 63.2% | $0.014 | 95.5% | **Key finding: v10+feedback strictly dominates always-frontier** — lower cost AND higher quality. This is not a cost-quality tradeoff. ## BERT Router Results DistilBERT was fine-tuned on SPROUT for binary classification. The binary classifier fails for tier routing — it ignores tier prefixes and predicts P(success) ≈ 89.5% for all tiers, routing everything to the cheapest model. A 5-class retraining is in progress (job `69fd8cccaff1cd33e8f30714`). ## 11 Modules 1. Cost Telemetry Collector — `aco/telemetry.py` 2. Task Cost Classifier — `aco/classifier.py` 3. Model Cascade Router (XGBoost + isotonic) — `aco/router_v10.py` 4. Execution-Feedback Router (entropy cascade) — `aco/execution_feedback.py` 5. Context Budgeter — `aco/context_budgeter.py` 6. Cache-Aware Prompt Layout — `aco/cache_layout.py` 7. Tool-Use Cost Gate — `aco/tool_gate.py` 8. Verifier Budgeter — `aco/verifier_budgeter.py` 9. Retry/Recovery Optimizer — `aco/retry_optimizer.py` 10. Meta-Tool Miner — `aco/meta_tool_miner.py` 11. Doom Detector — `aco/doom_detector.py` ## New Modules (this session) - **Conformal Calibration** — `aco/conformal.py` — RouteNLP-style distribution-free escalation guarantees - **Pareto Frontier** — `aco/pareto.py` — RouterBench NDCH + RouteLLM CPT/APGR metrics - **Integration Test** — `tests/test_integration.py` — Full pipeline test ## Key Takeaway Training on real execution data matters more than architecture. v8 trained on synthetic data *increased* cost by 11.6%. v10 trained on 500 real SWE-Router outcomes *saved* 36.4%. Same XGBoost, same features. ## Documentation - [Final Report](docs/final_report_v2.md) - [Pareto Frontier Report](docs/pareto_frontier_report.md) - [Conformal Calibration Report](docs/conformal_report.md) - [BERT Eval Report](docs/bert_eval_report.md) - [Literature Review](docs/literature_review.md) - [Deployment Guide](docs/deployment_guide.md) - [Technical Blog](docs/technical_blog.md) - [Roadmap](docs/ROADMAP.md) ## Links - **Model**: [narcolepticchicken/agent-cost-optimizer](https://huggingface.co/narcolepticchicken/agent-cost-optimizer) - **Dataset**: [narcolepticchicken/agent-cost-traces](https://huggingface.co/datasets/narcolepticchicken/agent-cost-traces) - **Dashboard**: [narcolepticchicken/aco-dashboard](https://huggingface.co/spaces/narcolepticchicken/aco-dashboard)