narcolepticchicken committed on
Commit
cb22ae6
·
verified ·
1 Parent(s): 4c6ae13

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +107 -50
README.md CHANGED
@@ -1,82 +1,139 @@
1
  ---
 
 
2
  tags:
3
- - ml-intern
 
 
 
4
  ---
5
- # ACO: Agent Cost Optimizer
6
 
7
- A universal control layer that bolts onto any agent harness to reduce total cost while preserving task quality.
8
 
9
- ## Quick Start
10
 
11
- ```bash
12
- pip install -e .
13
- aco route "Debug this critical production bug"
14
- aco budget "Research transformer advances"
15
- aco gate web_search --task-type research
16
- aco verify --risk high --confidence 0.7
17
- aco stats
18
- aco version
19
- ```
20
 
21
- ## Results
22
 
23
- On 2,000 synthetic traces across 9 task types:
24
 
25
  | Router | Success | AvgCost | CostRed |
26
  |--------|---------|---------|---------|
27
  | always_frontier | 91.0% | $1.04 | baseline |
28
  | heuristic | 84.5% | $0.92 | 11.6% |
29
  | **ACO v8** | **79.6%** | **$0.78** | **25.3%** |
30
- | always_cheap | 29.8% | $0.07 | 93.1% |
31
 
32
- Key: ACO achieves 25% cost reduction. The verifier budgeter alone eliminates 88% of unnecessary verifications (238/2000 vs 2000/2000).
33
 
34
- ## The 10 Modules
35
 
36
- 1. **Cost Telemetry Collector** - Normalized JSON trace schema
37
- 2. **Task Cost Classifier** - Predicts task type, difficulty, risk
38
- 3. **Model Cascade Router** - Dynamic difficulty + ML confirmation + safety floors
39
- 4. **Context Budgeter** - Adaptive context allocation by task type
40
- 5. **Cache-Aware Prompt Layout** - Prefix-cache reuse optimization
41
- 6. **Tool-Use Cost Gate** - Skip/batch/cache tool calls
42
- 7. **Verifier Budgeter** - Selective verification (high-risk only)
43
- 8. **Retry/Recovery Optimizer** - Failure-specific recovery actions
44
- 9. **Meta-Tool Miner** - Compress repeated workflows
45
- 10. **Doom Detector** - Early termination for failing runs
46
 
47
- ## Router Architecture (v8)
 
 
48
 
 
 
 
 
 
49
  ```
50
- 1. Dynamic difficulty = base(task_type) + adjust(request_keywords)
51
- 2. base_tier = min(difficulty + 1, 5)
52
- 3. base_tier = max(base_tier, TASK_FLOOR[task_type])
53
- 4. If P(success@base_tier) < 0.30 → ESCALATE (safety net)
54
- 5. If P(success@tier-1) >= 0.90 → DOWNGRADE (cost saver)
55
- 6. Never below floor, never above 5
 
 
 
 
56
  ```
57
 
58
- Per-task safety floors prevent unsafe cheap-model routing on critical tasks.
59
 
60
- ## License
 
 
 
 
 
 
61
 
62
- MIT
63
 
64
- <!-- ml-intern-provenance -->
65
- ## Generated by ML Intern
66
 
67
- This model repository was generated by [ML Intern](https://github.com/huggingface/ml-intern), an agent for machine learning research and development on the Hugging Face Hub.
68
 
69
- - Try ML Intern: https://smolagents-ml-intern.hf.space
70
- - Source code: https://github.com/huggingface/ml-intern
71
 
72
- ## Usage
73
 
74
- ```python
75
- from transformers import AutoModelForCausalLM, AutoTokenizer
 
 
 
 
 
 
76
 
77
- model_id = 'narcolepticchicken/agent-cost-optimizer'
78
- tokenizer = AutoTokenizer.from_pretrained(model_id)
79
- model = AutoModelForCausalLM.from_pretrained(model_id)
80
  ```
81
 
82
- For non-causal architectures, replace `AutoModelForCausalLM` with the appropriate `AutoModel` class.
 
1
  ---
2
+ license: mit
3
+ library_name: xgboost
4
  tags:
5
+ - agent-cost-optimizer
6
+ - model-router
7
+ - cost-aware-inference
8
+ - cascade-routing
9
  ---
 
10
 
11
+ # Agent Cost Optimizer (ACO)
12
 
13
+ A universal control layer that reduces the cost of autonomous agent runs while preserving task quality.
14
 
15
+ ## What It Does
16
 
17
+ ACO sits in front of any agent harness and makes cost-aware decisions:
18
+ - Which model to use (tiny → frontier → specialist)
19
+ - How much context to include
20
+ - Whether to call tools
21
+ - Whether to verify outputs
22
+ - When to stop failing runs
23
+ - How to recover from errors
24
 
25
+ ## Architecture
26
+
27
+ 10 modules working together:
28
+
29
+ 1. **Cost Telemetry Collector** - Structured trace schema
30
+ 2. **Task Cost Classifier** - Predicts type, difficulty, risk
31
+ 3. **Model Cascade Router** - Dynamic difficulty + ML confirmation
32
+ 4. **Context Budgeter** - Adaptive context allocation
33
+ 5. **Cache-Aware Prompt Layout** - Prefix-cache optimization
34
+ 6. **Tool-Use Cost Gate** - Skip/batch/cache tool calls
35
+ 7. **Verifier Budgeter** - Selective verification
36
+ 8. **Retry/Recovery Optimizer** - Failure-specific actions
37
+ 9. **Meta-Tool Miner** - Repeated workflow compression
38
+ 10. **Doom Detector** - Early termination
39
+
40
+ ## Results (2K traces, 9 task types)
41
 
42
  | Router | Success | AvgCost | CostRed |
43
  |--------|---------|---------|---------|
44
  | always_frontier | 91.0% | $1.04 | baseline |
45
  | heuristic | 84.5% | $0.92 | 11.6% |
46
  | **ACO v8** | **79.6%** | **$0.78** | **25.3%** |
 
47
 
48
+ Key: 88% reduction in unnecessary verifications. Context budgeting saves 20-40% tokens on simple tasks.
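The headline percentages can be recomputed from the published figures. The sketch below uses the average costs from the table and the verification count quoted in the earlier README revision (238 of 2,000 traces verified). The helper name `cost_reduction` is illustrative and not part of the `aco` package. Because the table's average costs are rounded to cents, the recomputed cost reductions come out near 11.5% and 25.0% rather than the reported 11.6% and 25.3%, which were presumably derived from unrounded averages.

```python
# Figures copied from the Results table and the earlier README text.
baseline_cost = 1.04  # always_frontier average cost in dollars
avg_cost = {"heuristic": 0.92, "ACO v8": 0.78}


def cost_reduction(cost: float, baseline: float = baseline_cost) -> float:
    """Fractional saving relative to the always_frontier baseline."""
    return 1.0 - cost / baseline


# Verifier budgeter: 238 verifications run instead of one per trace.
verifications_run = 238
total_traces = 2000
verification_reduction = 1.0 - verifications_run / total_traces

for name, cost in avg_cost.items():
    print(f"{name}: {cost_reduction(cost):.1%} cheaper than always_frontier")
print(f"verifications avoided: {verification_reduction:.1%}")
```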
49
 
50
+ ## Quick Start
51
 
52
+ ```python
53
+ from aco.optimizer import ACOOptimizer
54
+ from aco.config import ACOConfig
55
+
56
+ opt = ACOOptimizer(ACOConfig(router_model_path="router_models/router_bundle_v8.pkl"))
 
 
 
 
 
57
 
58
+ # Route a request
59
+ result = opt.start_run("Debug this critical production bug")
60
+ print(result["routing"]) # tier, model_id, confidence, cost_estimate
61
 
62
+ # Check context budget
63
+ print(result["context_budget"]) # total_tokens, keep_exact, omit, summarize
64
+
65
+ # End the run
66
+ trace = opt.end_run(success=True)
67
  ```
68
+
69
+ ## CLI
70
+
71
+ ```bash
72
+ aco route "Fix a typo in the README" # → tier 2 (cheap)
73
+ aco route "Debug critical prod bug NOW" # → tier 5 (specialist)
74
+ aco budget "Research transformer advances"
75
+ aco gate web_search --task-type research
76
+ aco verify --risk high --confidence 0.7
77
+ aco version
78
  ```
79
 
80
+ ## Router v8: Dynamic Difficulty + ML
81
 
82
+ The router uses:
83
+ 1. Dynamic difficulty estimation from request keywords
84
+ 2. Per-tier XGBoost success predictors
85
+ 3. Isotonic regression calibration
86
+ 4. Safety floors per task type (legal→4, coding→3, etc.)
87
+ 5. Safety net escalation (P(success) < 0.30)
88
+ 6. Cost saver downgrade (P(success@cheaper) ≥ 0.90)
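The decision rules listed above can be sketched as a small function. This is a minimal illustration, not the code in `aco/router.py`. The `legal` and `coding` floors and the 0.30 and 0.90 thresholds come from this README, and the `min(difficulty + 1, 5)` starting tier comes from its earlier revision. The function name `choose_tier`, the `p_success` callable (standing in for the calibrated per-tier XGBoost predictors), the default floor of 1, and the one-tier escalation step are all assumptions.

```python
from typing import Callable, Dict

MAX_TIER = 5
# Floors quoted in the README; other task types fall back to an assumed default.
TASK_FLOOR: Dict[str, int] = {"legal": 4, "coding": 3}
DEFAULT_FLOOR = 1  # assumption, not stated in the README

ESCALATE_BELOW = 0.30   # safety net threshold
DOWNGRADE_ABOVE = 0.90  # cost saver threshold


def choose_tier(task_type: str, difficulty: int,
                p_success: Callable[[int], float]) -> int:
    """Pick a model tier from difficulty, floors, and predicted success.

    p_success(tier) is a hypothetical stand-in for the calibrated
    per-tier success predictor.
    """
    floor = TASK_FLOOR.get(task_type, DEFAULT_FLOOR)
    tier = max(min(difficulty + 1, MAX_TIER), floor)

    # Safety net: escalate one tier when predicted success is too low.
    if p_success(tier) < ESCALATE_BELOW and tier < MAX_TIER:
        return tier + 1

    # Cost saver: step down one tier if the cheaper tier is very likely
    # to succeed and the floor allows it.
    if tier - 1 >= floor and p_success(tier - 1) >= DOWNGRADE_ABOVE:
        return tier - 1

    return tier


# A confident predictor cannot push a coding task below its floor of 3.
print(choose_tier("coding", 1, lambda tier: 0.95))
```

Checking the floor before the downgrade keeps critical task types on stronger models even when the predictor is optimistic about a cheaper tier.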
89
 
90
+ ## Trained Models
91
 
92
+ - `router_bundle_v8.pkl` - Production v8 (XGBoost per-tier + calibrators)
93
+ - `router_bundle_v6.pkl` - v6 hybrid baseline
94
 
95
+ ## Files
96
 
97
+ ```
98
+ aco/ - Python package
99
+   optimizer.py - Main orchestrator
100
+   router.py - Model cascade router
101
+   classifier.py - Task cost classifier
102
+   context_budgeter.py - Context allocation
103
+   cache_layout.py - Prefix-cache optimization
104
+   tool_gate.py - Tool-use cost gate
105
+   verifier_budgeter.py - Selective verification
106
+   retry_optimizer.py - Failure recovery
107
+   meta_tool_miner.py - Workflow compression
108
+   doom_detector.py - Early termination
109
+   config.py - Configuration
110
+   trace_schema.py - Normalized trace schema
111
+   cli.py - CLI interface
112
+ router_models/ - Trained XGBoost models
113
+ training/ - Training scripts (v1-v8)
114
+ eval/ - Benchmark results
115
+ ```
116
 
117
+ ## Limitations
118
 
119
+ - Router trained on synthetic data (needs real agent traces)
120
+ - No execution-feedback features yet (highest-impact next step)
121
+ - No real agent benchmarks (SWE-bench, BFCL) yet
122
+ - Quality gap vs always-frontier (79.6% vs 91.0%)
123
+
124
+ ## Citation
125
+
126
+ If you use ACO, please cite:
127
 
 
 
 
128
  ```
129
+ @software{aco2025,
130
+ title={Agent Cost Optimizer: Universal Control Layer for Autonomous Agents},
131
+ author={narcolepticchicken},
132
+ year={2025},
133
+ url={https://huggingface.co/narcolepticchicken/agent-cost-optimizer}
134
+ }
135
+ ```
136
+
137
+ ## License
138
 
139
+ MIT