---
tags:
- ml-intern
---

# ACO: Agent Cost Optimizer

A universal control layer that bolts onto any agent harness to reduce total cost while preserving task quality.

## Quick Start

```bash
pip install -e .
aco route "Debug this critical production bug"
aco budget "Research transformer advances"
aco gate web_search --task-type research
aco verify --risk high --confidence 0.7
aco stats
aco version
```

## Results

On 2,000 synthetic traces across 9 task types:

| Router | Success | Avg Cost | Cost Reduction |
|--------|---------|----------|----------------|
| always_frontier | 91.0% | $1.04 | baseline |
| heuristic | 84.5% | $0.92 | 11.6% |
| **ACO v8** | **79.6%** | **$0.78** | **25.3%** |
| always_cheap | 29.8% | $0.07 | 93.1% |

Key: ACO achieves a 25.3% cost reduction relative to always_frontier, with success dropping from 91.0% to 79.6%. The verifier budgeter alone eliminates 88% of unnecessary verifications (238 of 2,000 traces verified instead of all 2,000).

## The 10 Modules

1. **Cost Telemetry Collector** - Normalized JSON trace schema
2. **Task Cost Classifier** - Predicts task type, difficulty, risk
3. **Model Cascade Router** - Dynamic difficulty + ML confirmation + safety floors
4. **Context Budgeter** - Adaptive context allocation by task type
5. **Cache-Aware Prompt Layout** - Prefix-cache reuse optimization
6. **Tool-Use Cost Gate** - Skip/batch/cache tool calls
7. **Verifier Budgeter** - Selective verification (high-risk only)
8. **Retry/Recovery Optimizer** - Failure-specific recovery actions
9. **Meta-Tool Miner** - Compress repeated workflows
10. **Doom Detector** - Early termination for failing runs

## Router Architecture (v8)

```
1. Dynamic difficulty = base(task_type) + adjust(request_keywords)
2. base_tier = min(difficulty + 1, 5)
3. base_tier = max(base_tier, TASK_FLOOR[task_type])
4. If P(success@base_tier) < 0.30 → ESCALATE (safety net)
5. If P(success@tier-1) >= 0.90 → DOWNGRADE (cost saver)
6. Never below floor, never above 5
```

Per-task safety floors prevent unsafe cheap-model routing on critical tasks.
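
The six routing steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the ACO implementation: the `BASE_DIFFICULTY`, `KEYWORD_ADJUST`, and `TASK_FLOOR` values, the task-type names, and the `p_success` callback are all hypothetical stand-ins. The sketch also assumes that ESCALATE and DOWNGRADE each move exactly one tier and are mutually exclusive, which the steps do not specify.

```python
from typing import Callable, Dict

# Hypothetical tables for illustration only; not taken from the ACO source.
BASE_DIFFICULTY: Dict[str, int] = {"research": 2, "debugging": 3, "chitchat": 1}
KEYWORD_ADJUST: Dict[str, int] = {"critical": 1, "production": 1, "simple": -1}
TASK_FLOOR: Dict[str, int] = {"research": 2, "debugging": 3, "chitchat": 1}
MAX_TIER = 5


def dynamic_difficulty(task_type: str, request: str) -> int:
    """Step 1: base difficulty for the task type plus keyword adjustments."""
    base = BASE_DIFFICULTY.get(task_type, 2)
    words = request.lower().split()
    adjust = sum(delta for kw, delta in KEYWORD_ADJUST.items() if kw in words)
    return base + adjust


def route(task_type: str, request: str, p_success: Callable[[int], float]) -> int:
    """Return a model tier in [floor, MAX_TIER].

    p_success(tier) stands in for the ML confirmation model: it should
    return the estimated probability that the given tier completes the task.
    """
    floor = TASK_FLOOR.get(task_type, 1)
    difficulty = dynamic_difficulty(task_type, request)

    tier = min(difficulty + 1, MAX_TIER)  # step 2
    tier = max(tier, floor)               # step 3

    if p_success(tier) < 0.30:
        tier += 1                         # step 4: escalate (assumed +1 tier)
    elif tier > 1 and p_success(tier - 1) >= 0.90:
        tier -= 1                         # step 5: downgrade (assumed -1 tier)

    return max(floor, min(tier, MAX_TIER))  # step 6: clamp to [floor, 5]
```

Calling `route("debugging", "simple fix", lambda tier: 0.95)` shows the floor at work: the downgrade rule fires, but the final clamp restores the tier to the `debugging` floor, so a confident cheap-model prediction cannot push a critical task below its minimum.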

## License

MIT

## Generated by ML Intern

This model repository was generated by [ML Intern](https://github.com/huggingface/ml-intern), an agent for machine learning research and development on the Hugging Face Hub.

- Try ML Intern: https://smolagents-ml-intern.hf.space
- Source code: https://github.com/huggingface/ml-intern

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = 'narcolepticchicken/agent-cost-optimizer'
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
```

For non-causal architectures, replace `AutoModelForCausalLM` with the appropriate `AutoModel` class.