metadata
tags:
- ml-intern
ACO: Agent Cost Optimizer
A universal control layer that bolts onto any agent harness to reduce total cost while preserving task quality.
Quick Start
pip install -e .
aco route "Debug this critical production bug"
aco budget "Research transformer advances"
aco gate web_search --task-type research
aco verify --risk high --confidence 0.7
aco stats
aco version
Results
On 2,000 synthetic traces across 9 task types:
| Router | Success rate | Avg. cost | Cost reduction vs. always_frontier |
|---|---|---|---|
| always_frontier | 91.0% | $1.04 | baseline |
| heuristic | 84.5% | $0.92 | 11.6% |
| ACO v8 | 79.6% | $0.78 | 25.3% |
| always_cheap | 29.8% | $0.07 | 93.1% |
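Key: ACO v8 cuts average cost by 25.3% relative to always_frontier, while its success rate drops from 91.0% to 79.6%. The verifier budgeter alone skips 88% of verification calls (238 of 2,000 traces are verified instead of all 2,000).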
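The headline figures can be rechecked from the numbers reported above. The snippet below is only a worked-arithmetic sketch: the variable names are illustrative and nothing here is part of the `aco` package.

```python
# Figures copied from the Results table and the "Key" line above.
total_traces = 2000
verified_traces = 238          # traces the verifier budgeter still verifies
success_frontier = 91.0        # always_frontier success rate, in percent
success_aco = 79.6             # ACO v8 success rate, in percent

# Share of verification calls that are skipped.
skipped_share = 1 - verified_traces / total_traces
print(f"verifications skipped: {skipped_share:.1%}")   # 88.1%

# Success-rate gap paid for the cost reduction, in percentage points.
success_gap = success_frontier - success_aco
print(f"success gap: {success_gap:.1f} points")         # 11.4 points
```

Reading the two numbers together gives the trade-off in the table: a lower average cost in exchange for roughly 11 points of success rate against always_frontier.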
The 10 Modules
- Cost Telemetry Collector - Normalized JSON trace schema
- Task Cost Classifier - Predicts task type, difficulty, risk
- Model Cascade Router - Dynamic difficulty + ML confirmation + safety floors
- Context Budgeter - Adaptive context allocation by task type
- Cache-Aware Prompt Layout - Prefix-cache reuse optimization
- Tool-Use Cost Gate - Skip/batch/cache tool calls
- Verifier Budgeter - Selective verification (high-risk only)
- Retry/Recovery Optimizer - Failure-specific recovery actions
- Meta-Tool Miner - Compress repeated workflows
- Doom Detector - Early termination for failing runs
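As an illustration of one module, the Verifier Budgeter's "high-risk only" policy might look roughly like the sketch below. This is an assumption-laden example, not the package's code: the function name `should_verify`, the risk labels, the `skip_above` threshold, and the way confidence interacts with risk are all hypothetical. Only the `--risk` and `--confidence` inputs come from the Quick Start.

```python
RISK_LEVELS = ("low", "medium", "high")  # hypothetical label set


def should_verify(risk: str, confidence: float, skip_above: float = 0.95) -> bool:
    """Decide whether to spend a verification call on a result.

    Assumed policy: only high-risk tasks are ever verified, and even those
    are skipped when the model's confidence reaches `skip_above`.
    """
    if risk not in RISK_LEVELS:
        raise ValueError(f"unknown risk level: {risk!r}")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    return risk == "high" and confidence < skip_above


# Mirrors the inputs of `aco verify --risk high --confidence 0.7`.
print(should_verify("high", 0.7))   # True
print(should_verify("low", 0.2))    # False: low-risk tasks are never verified
```

Under a policy of this shape, the number of verification calls scales with the share of high-risk tasks rather than with total traffic.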
Router Architecture (v8)
1. Dynamic difficulty = base(task_type) + adjust(request_keywords)
2. base_tier = min(difficulty + 1, 5)
3. base_tier = max(base_tier, TASK_FLOOR[task_type])
4. If P(success@base_tier) < 0.30 → ESCALATE (safety net)
5. If P(success@tier-1) >= 0.90 → DOWNGRADE (cost saver)
6. Clamp the final tier: never below the task floor, never above tier 5
Per-task safety floors prevent unsafe cheap-model routing on critical tasks.
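The six routing steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the actual v8 router: the tables `BASE_DIFFICULTY`, `TASK_FLOOR`, and `KEYWORD_ADJUST` hold made-up values, the success predictor is passed in as a plain callable, escalation is assumed to move up exactly one tier, and escalation and downgrade are assumed to be mutually exclusive.

```python
from typing import Callable

MAX_TIER = 5
ESCALATE_BELOW = 0.30   # step 4 threshold, from the list above
DOWNGRADE_AT = 0.90     # step 5 threshold, from the list above

# Hypothetical tables for illustration; the real values are not given here.
BASE_DIFFICULTY = {"chat": 1, "research": 2, "debugging": 3}
TASK_FLOOR = {"chat": 1, "research": 2, "debugging": 3}
KEYWORD_ADJUST = {"critical": 1, "production": 1, "simple": -1}


def dynamic_difficulty(task_type: str, request: str) -> int:
    """Step 1: base difficulty for the task type plus keyword adjustments."""
    words = request.lower().split()
    adjust = sum(delta for kw, delta in KEYWORD_ADJUST.items() if kw in words)
    return max(0, BASE_DIFFICULTY[task_type] + adjust)


def route(task_type: str, request: str, p_success: Callable[[int], float]) -> int:
    """Return a model tier in [floor, MAX_TIER] for the request."""
    floor = TASK_FLOOR[task_type]
    tier = min(dynamic_difficulty(task_type, request) + 1, MAX_TIER)  # step 2
    tier = max(tier, floor)                                           # step 3
    if p_success(tier) < ESCALATE_BELOW:
        tier += 1                                  # step 4: assumed one-tier bump
    elif tier - 1 >= floor and p_success(tier - 1) >= DOWNGRADE_AT:
        tier -= 1                                  # step 5: cheaper tier is enough
    return max(floor, min(tier, MAX_TIER))         # step 6: clamp


# A stand-in predictor that is equally unsure at every tier.
print(route("debugging", "Debug this critical production bug", lambda t: 0.5))  # 5
```

Checking the floor before the downgrade means a confident prediction at a cheaper tier can never push a critical task type below its floor, which is the behavior the safety-floor note describes.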
License
MIT
Generated by ML Intern
This model repository was generated by ML Intern, an agent for machine learning research and development on the Hugging Face Hub.
- Try ML Intern: https://smolagents-ml-intern.hf.space
- Source code: https://github.com/huggingface/ml-intern
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
model_id = 'narcolepticchicken/agent-cost-optimizer'
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
For non-causal architectures, replace AutoModelForCausalLM with the appropriate AutoModel class.