	---
	tags:
	- ml-intern
	---
	# ACO: Agent Cost Optimizer

	A universal control layer that bolts onto any agent harness to reduce total cost while preserving task quality.

	## Quick Start

	```bash
	pip install -e .
	aco route "Debug this critical production bug"
	aco budget "Research transformer advances"
	aco gate web_search --task-type research
	aco verify --risk high --confidence 0.7
	aco stats
	aco version
	```

	## Results

	On 2,000 synthetic traces across 9 task types:

	| Router | Success rate | Avg cost | Cost reduction |
	|--------|--------------|----------|----------------|
	| always_frontier | 91.0% | $1.04 | baseline |
	| heuristic | 84.5% | $0.92 | 11.6% |
	| ACO v8 | 79.6% | $0.78 | 25.3% |
	| always_cheap | 29.8% | $0.07 | 93.1% |

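	Key takeaways: ACO v8 cuts average cost by 25.3% relative to always_frontier, while success drops from 91.0% to 79.6%. The verifier budgeter alone runs verification on 238 of 2,000 traces instead of all 2,000, an 88% reduction in verifier calls.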
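
As a rough cross-check of the figures above, the sketch below recomputes the reductions, assuming cost reduction is defined as `1 - avg_cost / baseline_avg_cost` (this README does not state the formula). Because it starts from the rounded dollar amounts in the table, its output can differ from the cost-reduction column by a few tenths of a percentage point. The `cost_reduction` helper is a name introduced here for illustration.

```python
# Average cost per router, copied from the results table (rounded values).
avg_cost = {
    "always_frontier": 1.04,
    "heuristic": 0.92,
    "ACO v8": 0.78,
    "always_cheap": 0.07,
}


def cost_reduction(cost: float, baseline_cost: float) -> float:
    """Fractional saving relative to the baseline (assumed definition)."""
    return 1.0 - cost / baseline_cost


baseline = avg_cost["always_frontier"]
for name, cost in avg_cost.items():
    print(f"{name}: {cost_reduction(cost, baseline):.1%}")

# Verifier budgeter: 238 verifications instead of 2,000.
verifier_reduction = 1.0 - 238 / 2000
print(f"verifier calls avoided: {verifier_reduction:.1%}")
```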

	## The 10 Modules

	1. Cost Telemetry Collector - Normalized JSON trace schema
	2. Task Cost Classifier - Predicts task type, difficulty, risk
	3. Model Cascade Router - Dynamic difficulty + ML confirmation + safety floors
	4. Context Budgeter - Adaptive context allocation by task type
	5. Cache-Aware Prompt Layout - Prefix-cache reuse optimization
	6. Tool-Use Cost Gate - Skip/batch/cache tool calls
	7. Verifier Budgeter - Selective verification (high-risk only)
	8. Retry/Recovery Optimizer - Failure-specific recovery actions
	9. Meta-Tool Miner - Compress repeated workflows
	10. Doom Detector - Early termination for failing runs
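
To make one of these modules concrete, here is a minimal sketch of what selective verification (module 7) could look like. It is not the package's implementation: the `should_verify` function, the `min_confidence` threshold of 0.8, and the rule for combining risk with confidence are all assumptions made for illustration. The only things taken from this README are that verification is reserved for high-risk work and that the CLI accepts a risk level and a confidence value.

```python
def should_verify(risk: str, confidence: float, min_confidence: float = 0.8) -> bool:
    """Decide whether to spend budget on a verifier call.

    Assumed rule: verify only high-risk steps, and only when the agent's
    own confidence is below a (hypothetical) threshold.
    """
    return risk == "high" and confidence < min_confidence


# Hypothetical (risk, confidence) pairs standing in for agent steps.
steps = [("high", 0.7), ("low", 0.4), ("high", 0.95), ("medium", 0.6)]
verified = [step for step in steps if should_verify(*step)]
print(f"verifying {len(verified)} of {len(steps)} steps")
```

Under this rule, low- and medium-risk steps never trigger a verifier call, which is the mechanism by which a budgeter of this kind would reduce the number of verifications.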

	## Router Architecture (v8)

	```
	1. Dynamic difficulty = base(task_type) + adjust(request_keywords)
	2. base_tier = min(difficulty + 1, 5)
	3. base_tier = max(base_tier, TASK_FLOOR[task_type])
	4. If P(success@base_tier) < 0.30 → ESCALATE (safety net)
	5. If P(success@tier-1) >= 0.90 → DOWNGRADE (cost saver)
	6. Never below floor, never above 5
	```

	Per-task safety floors prevent unsafe cheap-model routing on critical tasks.
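
The routing steps above can be sketched in Python as follows. This is an illustrative reading, not the shipped router: the `BASE_DIFFICULTY`, `KEYWORD_ADJUST`, and `TASK_FLOOR` tables hold made-up values, `p_success` stands in for whatever success predictor the router uses, and two details the pseudocode leaves open are filled with assumptions (escalation moves up exactly one tier, and a tier that was escalated is not also downgraded).

```python
from typing import Callable

MAX_TIER = 5

# Hypothetical tables for illustration; the real values are not given here.
BASE_DIFFICULTY = {"research": 2, "debugging": 3, "chat": 1}
KEYWORD_ADJUST = {"critical": 1, "production": 1, "quick": -1}
TASK_FLOOR = {"research": 2, "debugging": 3, "chat": 1}


def dynamic_difficulty(task_type: str, request: str) -> int:
    """Step 1: base difficulty for the task type plus keyword adjustments."""
    words = request.lower().split()
    adjust = sum(delta for kw, delta in KEYWORD_ADJUST.items() if kw in words)
    return max(BASE_DIFFICULTY[task_type] + adjust, 0)  # clamp is an assumption


def route(task_type: str, request: str, p_success: Callable[[int], float]) -> int:
    """Return a model tier between the task floor and MAX_TIER."""
    floor = TASK_FLOOR[task_type]
    tier = min(dynamic_difficulty(task_type, request) + 1, MAX_TIER)  # step 2
    tier = max(tier, floor)                                           # step 3
    if p_success(tier) < 0.30:
        tier += 1  # step 4: escalate (assumed to be one tier)
    elif tier - 1 >= floor and p_success(tier - 1) >= 0.90:
        tier -= 1  # step 5: downgrade when the cheaper tier is very likely enough
    return max(floor, min(tier, MAX_TIER))                            # step 6


# A predictor that is optimistic at every tier lets the router downgrade,
# but never below the per-task floor.
print(route("research", "Research transformer advances", lambda tier: 0.95))
print(route("debugging", "Debug this critical production bug", lambda tier: 0.5))
```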

	## License

	MIT

	<!-- ml-intern-provenance -->
	## Generated by ML Intern

	This model repository was generated by [ML Intern](https://github.com/huggingface/ml-intern), an agent for machine learning research and development on the Hugging Face Hub.

	- Try ML Intern: https://smolagents-ml-intern.hf.space
	- Source code: https://github.com/huggingface/ml-intern

	## Usage

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer

	model_id = 'narcolepticchicken/agent-cost-optimizer'
	tokenizer = AutoTokenizer.from_pretrained(model_id)
	model = AutoModelForCausalLM.from_pretrained(model_id)
	```

	For non-causal architectures, replace `AutoModelForCausalLM` with the appropriate `AutoModel` class.