narcolepticchicken/agent-cost-optimizer

narcolepticchicken committed about 20 hours ago

Commit

cb22ae6

verified ·

1 Parent(s): 4c6ae13

Upload README.md with huggingface_hub

Files changed (1)

README.md +107 -50

@@ -1,82 +1,139 @@
 ---
 tags:
-- ml-intern
 ---
-# ACO: Agent Cost Optimizer
-A universal control layer that bolts onto any agent harness to reduce total cost while preserving task quality.
-## Quick Start
-```bash
-pip install -e .
-aco route "Debug this critical production bug"
-aco budget "Research transformer advances"
-aco gate web_search --task-type research
-aco verify --risk high --confidence 0.7
-aco stats
-aco version
-```
-## Results
-On 2,000 synthetic traces across 9 task types:
 | Router | Success | AvgCost | CostRed |
 |--------|---------|---------|---------|
 | always_frontier | 91.0% | $1.04 | baseline |
 | heuristic | 84.5% | $0.92 | 11.6% |
 | **ACO v8** | **79.6%** | **$0.78** | **25.3%** |
-| always_cheap | 29.8% | $0.07 | 93.1% |
-Key: ACO achieves 25% cost reduction. The verifier budgeter alone eliminates 88% of unnecessary verifications (238/2000 vs 2000/2000).
-## The 10 Modules
-1. **Cost Telemetry Collector** - Normalized JSON trace schema
-2. **Task Cost Classifier** - Predicts task type, difficulty, risk
-3. **Model Cascade Router** - Dynamic difficulty + ML confirmation + safety floors
-4. **Context Budgeter** - Adaptive context allocation by task type
-5. **Cache-Aware Prompt Layout** - Prefix-cache reuse optimization
-6. **Tool-Use Cost Gate** - Skip/batch/cache tool calls
-7. **Verifier Budgeter** - Selective verification (high-risk only)
-8. **Retry/Recovery Optimizer** - Failure-specific recovery actions
-9. **Meta-Tool Miner** - Compress repeated workflows
-10. **Doom Detector** - Early termination for failing runs
-## Router Architecture (v8)
 ```
-1. Dynamic difficulty = base(task_type) + adjust(request_keywords)
-2. base_tier = min(difficulty + 1, 5)
-3. base_tier = max(base_tier, TASK_FLOOR[task_type])
-4. If P(success@base_tier) < 0.30 → ESCALATE (safety net)
-5. If P(success@tier-1) >= 0.90 → DOWNGRADE (cost saver)
-6. Never below floor, never above 5
 ```
-Per-task safety floors prevent unsafe cheap-model routing on critical tasks.
-## License
-MIT
-<!-- ml-intern-provenance -->
-## Generated by ML Intern
-This model repository was generated by [ML Intern](https://github.com/huggingface/ml-intern), an agent for machine learning research and development on the Hugging Face Hub.
-- Try ML Intern: https://smolagents-ml-intern.hf.space
-- Source code: https://github.com/huggingface/ml-intern
-## Usage
-```python
-from transformers import AutoModelForCausalLM, AutoTokenizer
-model_id = 'narcolepticchicken/agent-cost-optimizer'
-tokenizer = AutoTokenizer.from_pretrained(model_id)
-model = AutoModelForCausalLM.from_pretrained(model_id)
 ```
-For non-causal architectures, replace `AutoModelForCausalLM` with the appropriate `AutoModel` class.

 ---
+license: mit
+library_name: xgboost
 tags:
+  - agent-cost-optimizer
+  - model-router
+  - cost-aware-inference
+  - cascade-routing
 ---
+# Agent Cost Optimizer (ACO)
+A universal control layer that reduces the cost of autonomous agent runs while preserving task quality.
+## What It Does
+ACO sits in front of any agent harness and makes cost-aware decisions:
+- Which model to use (tiny → frontier → specialist)
+- How much context to include
+- Whether to call tools
+- Whether to verify outputs
+- When to stop failing runs
+- How to recover from errors
+## Architecture
+10 modules working together:
+1. **Cost Telemetry Collector** - Structured trace schema
+2. **Task Cost Classifier** - Predicts type, difficulty, risk
+3. **Model Cascade Router** - Dynamic difficulty + ML confirmation
+4. **Context Budgeter** - Adaptive context allocation
+5. **Cache-Aware Prompt Layout** - Prefix-cache optimization
+6. **Tool-Use Cost Gate** - Skip/batch/cache tool calls
+7. **Verifier Budgeter** - Selective verification
+8. **Retry/Recovery Optimizer** - Failure-specific actions
+9. **Meta-Tool Miner** - Repeated workflow compression
+10. **Doom Detector** - Early termination
+## Results (2K synthetic traces, 9 task types)
 | Router | Success | AvgCost | CostRed |
 |--------|---------|---------|---------|
 | always_frontier | 91.0% | $1.04 | baseline |
 | heuristic | 84.5% | $0.92 | 11.6% |
 | **ACO v8** | **79.6%** | **$0.78** | **25.3%** |
+Key: the verifier budgeter cuts unnecessary verifications by 88%, and context budgeting saves 20-40% of tokens on simple tasks.
+## Quick Start
+```python
+from aco.optimizer import ACOOptimizer
+from aco.config import ACOConfig
+opt = ACOOptimizer(ACOConfig(router_model_path="router_models/router_bundle_v8.pkl"))
+# Route a request
+result = opt.start_run("Debug this critical production bug")
+print(result["routing"])  # tier, model_id, confidence, cost_estimate
+# Check context budget
+print(result["context_budget"])  # total_tokens, keep_exact, omit, summarize
+# End the run
+trace = opt.end_run(success=True)
 ```
+## CLI
+```bash
+aco route "Fix a typo in the README"     # → tier 2 (cheap)
+aco route "Debug critical prod bug NOW"  # → tier 5 (specialist)
+aco budget "Research transformer advances"
+aco gate web_search --task-type research
+aco verify --risk high --confidence 0.7
+aco version
 ```
+## Router v8: Dynamic Difficulty + ML
+The router uses:
+1. Dynamic difficulty estimation from request keywords
+2. Per-tier XGBoost success predictors
+3. Isotonic regression calibration
+4. Safety floors per task type (legal→4, coding→3, etc.)
+5. Safety net escalation (P(success) < 0.30)
+6. Cost saver downgrade (P(success@cheaper) ≥ 0.90)
+## Trained Models
+- `router_bundle_v8.pkl` - Production v8 (XGBoost per-tier + calibrators)
+- `router_bundle_v6.pkl` - v6 hybrid baseline
+## Files
+```
+aco/                     - Python package
+  optimizer.py           - Main orchestrator
+  router.py              - Model cascade router
+  classifier.py          - Task cost classifier
+  context_budgeter.py    - Context allocation
+  cache_layout.py        - Prefix-cache optimization
+  tool_gate.py           - Tool-use cost gate
+  verifier_budgeter.py   - Selective verification
+  retry_optimizer.py     - Failure recovery
+  meta_tool_miner.py     - Workflow compression
+  doom_detector.py       - Early termination
+  config.py              - Configuration
+  trace_schema.py        - Normalized trace schema
+  cli.py                 - CLI interface
+router_models/           - Trained XGBoost models
+training/                - Training scripts (v1-v8)
+eval/                    - Benchmark results
+```
+## Limitations
+- Router trained on synthetic data (needs real agent traces)
+- No execution-feedback features yet (highest-impact next step)
+- No real agent benchmarks (SWE-bench, BFCL) yet
+- Quality gap vs always-frontier (79.6% vs 91.0%)
+## Citation
+If you use ACO, please cite:
 ```
+@software{aco2025,
+  title={Agent Cost Optimizer: Universal Control Layer for Autonomous Agents},
+  author={narcolepticchicken},
+  year={2025},
+  url={https://huggingface.co/narcolepticchicken/agent-cost-optimizer}
+}
+```
+## License
+MIT
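
The routing steps listed under "Router v8" (and spelled out step by step in the previous README) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code in `aco/router.py`. The thresholds 0.30 and 0.90, the tier cap of 5, and the `legal` and `coding` floors come from the README. The names `route`, `estimate_difficulty`, `BASE_DIFFICULTY`, `KEYWORD_ADJUST`, and `DEFAULT_FLOOR`, the keyword weights, the one-tier escalation step, and the `p_success` callback (standing in for the calibrated per-tier XGBoost predictors) are all hypothetical.

```python
from typing import Callable, Dict

MAX_TIER = 5
ESCALATE_BELOW = 0.30  # README: safety-net threshold
DOWNGRADE_AT = 0.90    # README: cost-saver threshold

# The legal and coding floors are quoted in the README; the default is an assumption.
TASK_FLOOR: Dict[str, int] = {"legal": 4, "coding": 3}
DEFAULT_FLOOR = 1

# Hypothetical base difficulties and keyword adjustments, for illustration only.
BASE_DIFFICULTY: Dict[str, int] = {"legal": 3, "coding": 2, "chat": 1}
KEYWORD_ADJUST: Dict[str, int] = {"critical": 1, "production": 1, "typo": -1}


def estimate_difficulty(task_type: str, request: str) -> int:
    """Difficulty = base(task_type) + adjust(request_keywords)."""
    words = request.lower().split()
    adjust = sum(delta for kw, delta in KEYWORD_ADJUST.items() if kw in words)
    return BASE_DIFFICULTY.get(task_type, 2) + adjust


def route(task_type: str, request: str, p_success: Callable[[int], float]) -> int:
    """Pick a model tier; p_success(tier) is a stand-in for the ML predictor."""
    floor = TASK_FLOOR.get(task_type, DEFAULT_FLOOR)
    tier = min(estimate_difficulty(task_type, request) + 1, MAX_TIER)
    tier = max(tier, floor)  # never route below the per-task safety floor
    if p_success(tier) < ESCALATE_BELOW:
        # Safety net: assumed here to move up exactly one tier, capped at MAX_TIER.
        tier = min(tier + 1, MAX_TIER)
    elif tier - 1 >= floor and p_success(tier - 1) >= DOWNGRADE_AT:
        # Cost saver: step down only if the cheaper tier is still above the floor.
        tier -= 1
    return tier


if __name__ == "__main__":
    # Constant predictor used purely to exercise the control flow.
    print(route("coding", "Debug this critical production bug", lambda t: 0.5))
```

With a constant predictor the floor behaviour is easy to see: a `coding` request never drops below tier 3 however confident the cheaper tier looks, while a task type without a configured floor can be downgraded to tier 1.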