tags:
  - ml-intern

ACO: Agent Cost Optimizer

A universal control layer that bolts onto any agent harness to reduce total cost while preserving task quality.

Quick Start

pip install -e .
aco route "Debug this critical production bug"
aco budget "Research transformer advances"
aco gate web_search --task-type research
aco verify --risk high --confidence 0.7
aco stats
aco version

Results

On 2,000 synthetic traces across 9 task types:

Router            Success   Avg cost   Cost reduction
always_frontier   91.0%     $1.04      baseline
heuristic         84.5%     $0.92      11.6%
ACO v8            79.6%     $0.78      25.3%
always_cheap      29.8%     $0.07      93.1%

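Key: ACO v8 cuts average cost by 25.3% relative to always_frontier, while its success rate falls from 91.0% to 79.6%. The verifier budgeter alone skips about 88% of verification calls: it runs 238 of 2,000, versus 2,000 of 2,000 when every trace is verified.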
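
As a sanity check, the headline percentages can be recomputed from the figures above. This is a minimal sketch, assuming cost reduction is defined as (baseline cost - router cost) / baseline cost with always_frontier as the baseline; that definition, and the names cost_reduction_pct and skipped_pct, are illustrative and not taken from the ACO codebase.

```python
# Figures copied from the results above; costs are the rounded dollar values.
BASELINE_COST = 1.04  # always_frontier average cost per trace
avg_cost = {"heuristic": 0.92, "ACO v8": 0.78, "always_cheap": 0.07}


def cost_reduction_pct(cost: float, baseline: float = BASELINE_COST) -> float:
    """Assumed definition: percentage saved relative to the baseline router."""
    return 100.0 * (baseline - cost) / baseline


for router, cost in avg_cost.items():
    print(f"{router}: {cost_reduction_pct(cost):.1f}% cheaper than always_frontier")

# Verifier budgeter: share of verification calls that were skipped.
verifications_run, total_traces = 238, 2000
skipped_pct = 100.0 * (total_traces - verifications_run) / total_traces
print(f"verifications skipped: {skipped_pct:.1f}%")
```

Because the listed costs are rounded to cents, the recomputed percentages differ from the reported cost-reduction column by a few tenths of a point.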

The 10 Modules

  1. Cost Telemetry Collector - Normalized JSON trace schema
  2. Task Cost Classifier - Predicts task type, difficulty, risk
  3. Model Cascade Router - Dynamic difficulty + ML confirmation + safety floors
  4. Context Budgeter - Adaptive context allocation by task type
  5. Cache-Aware Prompt Layout - Prefix-cache reuse optimization
  6. Tool-Use Cost Gate - Skip/batch/cache tool calls
  7. Verifier Budgeter - Selective verification (high-risk only)
  8. Retry/Recovery Optimizer - Failure-specific recovery actions
  9. Meta-Tool Miner - Compress repeated workflows
  10. Doom Detector - Early termination for failing runs

Router Architecture (v8)

1. Dynamic difficulty = base(task_type) + adjust(request_keywords)
2. base_tier = min(difficulty + 1, 5)
3. base_tier = max(base_tier, TASK_FLOOR[task_type])
4. If P(success@base_tier) < 0.30 → ESCALATE (safety net)
5. If P(success@tier-1) >= 0.90 → DOWNGRADE (cost saver)
6. Never below floor, never above 5

Per-task safety floors prevent unsafe cheap-model routing on critical tasks.
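
The routing steps above can be sketched in Python. This is an illustrative sketch under stated assumptions, not the ACO implementation: the TASK_FLOOR entries, the default floor of 1, the route function name, and the p_success callback are hypothetical. The sketch also assumes that escalation moves up exactly one tier and that escalation and downgrade are mutually exclusive, neither of which the steps spell out. Step 1 (deriving difficulty from task type and keywords) is left out; difficulty is passed in as an integer.

```python
from typing import Callable, Dict

MAX_TIER = 5  # step 6: never above 5

# Hypothetical floors: the README names TASK_FLOOR but does not list its values.
TASK_FLOOR: Dict[str, int] = {"research": 2, "debug_critical": 4}


def route(
    task_type: str,
    difficulty: int,
    p_success: Callable[[int], float],
    task_floor: Dict[str, int] = TASK_FLOOR,
) -> int:
    """Pick a model tier for a task.

    p_success(tier) stands in for the ML confirmation model and returns the
    estimated probability that the given tier completes the task.
    """
    floor = task_floor.get(task_type, 1)  # assumed default floor of 1

    # Steps 2-3: start one tier above the difficulty, capped at 5 and
    # raised to the per-task safety floor.
    tier = min(difficulty + 1, MAX_TIER)
    tier = max(tier, floor)

    if p_success(tier) < 0.30:
        # Step 4: safety net. Assumption: escalate by one tier only.
        tier = min(tier + 1, MAX_TIER)
    elif tier - 1 >= floor and p_success(tier - 1) >= 0.90:
        # Step 5: cost saver, but never below the floor (step 6).
        tier -= 1
    return tier


if __name__ == "__main__":
    # A confident estimator lets a difficulty-3 research task drop from tier 4 to 3.
    print(route("research", 3, lambda tier: 0.95))
    # A critical task stays at its floor even when cheaper tiers look adequate.
    print(route("debug_critical", 0, lambda tier: 0.99))
```

Passing p_success as a callback keeps the tier logic independent of whichever success predictor is plugged in, which makes the floor and cap behavior easy to test in isolation.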

License

MIT

Generated by ML Intern

This model repository was generated by ML Intern, an agent for machine learning research and development on the Hugging Face Hub.

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = 'narcolepticchicken/agent-cost-optimizer'
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

For non-causal architectures, replace AutoModelForCausalLM with the appropriate AutoModel class.