# Purpose Agent – Architecture Documentation
> For developers building on the framework, researchers understanding the theory, and anyone curious about how self-improving agents work.
---
## Table of Contents
1. [What Is Purpose Agent?](#1-what-is-purpose-agent)
2. [The Big Idea (No Jargon)](#2-the-big-idea)
3. [How It Works – Step by Step](#3-how-it-works)
4. [Architecture Map](#4-architecture-map)
5. [The Core Engine](#5-the-core-engine)
6. [The V2 Safety Kernel](#6-the-v2-safety-kernel)
7. [Research Implementations](#7-research-implementations)
8. [Breakthroughs](#8-breakthroughs)
9. [User-Facing Layers](#9-user-facing-layers)
10. [How Models Are Handled](#10-how-models-are-handled)
11. [The Research Behind It](#11-the-research)
12. [For Contributors](#12-for-contributors)
---
## 1. What Is Purpose Agent?
Purpose Agent is a Python framework that builds AI agents that **get better with experience** – without retraining the underlying AI model.
Traditional AI agents run the same way every time. Purpose Agent is different: after each task, it extracts lessons from what worked and what didn't, tests those lessons for safety, and uses them to perform better next time.
**Think of it like this:** A new employee follows the company handbook. After their first week, they have personal notes – shortcuts they discovered, mistakes they won't repeat, tips from colleagues. Those notes make them better at their job without changing who they are. Purpose Agent does this for AI.
---
## 2. The Big Idea
### For Non-Technical Readers
```
You give it a purpose → It builds a team → It does the work → It learns → Next time is better
```
**You say:** "Help me write Python code."
**It builds:** An architect (plans), a coder (writes), and a tester (reviews).
**It runs:** The coder writes fibonacci. The tester checks it. A critic scores the work.
**It learns:** "When writing recursive functions, check base cases first." This lesson is saved.
**Next time:** The coder starts by checking base cases. It's faster and more reliable.
### For Technical Readers
The framework implements a **Purpose-MDP** – a Markov Decision Process where:
- A **Purpose Function Φ(s)** evaluates every state transition on a 0-10 scale
- An **Optimizer** distills successful trajectories into reusable heuristics
- Heuristics are ranked by **Q-values** (how often they helped) and selected via **Mixture-of-Heuristics** (sparse activation, like MoE)
- An **immune system** scans every new heuristic for prompt injection, score manipulation, and other threats
- **Memory CI pipeline** quarantines, tests, and promotes heuristics before they affect agent behavior
This is **Potential-Based Reward Shaping** (Ng et al., 1999) applied to LLM agents, with formal convergence guarantees. See [PURPOSE_LEARNING.md](PURPOSE_LEARNING.md).
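In shaping terms, the per-step signal is ΔΦ = γ·Φ(s′) − Φ(s). The term itself is one line; a minimal sketch (function name illustrative – in the framework Φ is computed by an LLM critic, not a lookup):

```python
def shaped_reward(phi_before: float, phi_after: float, gamma: float = 1.0) -> float:
    """Potential-based shaping term: gamma * Phi(s') - Phi(s).

    Ng et al. (1999): adding this term to the environment reward
    leaves the optimal policy unchanged while providing dense
    per-step feedback.
    """
    return gamma * phi_after - phi_before
```

With γ = 1 this reduces to the delta computed in Step 3 below (Φ going from 0.0 to 10.0 yields +10.0).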
---
## 3. How It Works – Step by Step
Here's what happens when you run `team.run("Write a fibonacci function")`:
### Step 1: The Actor Decides
The Actor module receives:
- The **purpose** ("Write a fibonacci function")
- The **current state** (empty – no code written yet)
- Any **learned heuristics** from past runs
It generates a thought process and an action:
> "I should write a function that handles base cases fib(0)=0 and fib(1)=1, then use iteration for the general case."
> → Action: `submit_code` with the Python implementation.
### Step 2: The Environment Executes
The code is run against test cases. The environment returns a new state:
> "Tests: 4/4 ALL PASSED"
### Step 3: The Purpose Function Scores
A **separate LLM call** (not the same as the actor) evaluates the transition:
- Φ(state_before) = 0.0 (nothing done)
- Φ(state_after) = 10.0 (all tests pass)
- Delta = +10.0 (huge improvement)
- Evidence: "Tests changed from 0/4 to 4/4"
The Purpose Function has **7 anti-gaming rules** that prevent the agent from tricking itself into thinking it's doing well when it isn't.
### Step 4: The Optimizer Extracts Heuristics
After the task, the Optimizer looks at the trajectory and extracts reusable patterns:
- **Strategic:** "When writing {function_type} functions, handle edge cases first, then iterate."
- **Procedural:** "1. Read test cases. 2. Handle base cases. 3. Implement general case. 4. Submit."
- **Tool tip:** "When submitting code, check boundary conditions: 0, 1, empty, negative."
### Step 5: Safety Checks
Every new heuristic goes through the **immune system**:
- Is it a prompt injection? ("Ignore all previous instructions") → **REJECTED**
- Does it try to manipulate scores? ("Always score 10") → **REJECTED**
- Does it contain secrets? (API keys, passwords) → **REJECTED**
- Is it safe? ("Check base cases first") → **QUARANTINED** (pending replay test)
After passing replay testing → **PROMOTED** (active in future runs).
### Step 6: Next Run Benefits
When the agent runs again, the **Prompt Compiler** selects the top-K heuristics by:
- **Relevance** to the current task (embedding similarity)
- **Trust** (immune-scanned and verified)
- **Utility** (Q-value – how often it helped before)
These are injected into the prompt. The agent is now better without any model retraining.
---
## 4. Architecture Map
```
┌─────────────────────────────────────────────────────────────────────────┐
│                              PURPOSE AGENT                              │
│                                                                         │
│  ┌── USER LAYER ─────────────────────────────────────────────────────┐  │
│  │ pa.purpose("...") → Team → team.run("...")                        │  │
│  │ pa.Agent()   pa.Graph()   pa.parallel()   pa.Conversation()       │  │
│  └─────────────────────────────────────────────┬─────────────────────┘  │
│                                                │                        │
│  ┌── CORE ENGINE ──────────────────────────────┼─────────────────────┐  │
│  │                                             ▼                     │  │
│  │  Actor ──→ Environment ──→ Purpose Function (Φ)                   │  │
│  │    ▲            │                    │                            │  │
│  │    │        State s'          Φ(s) → Φ(s')                        │  │
│  │    │            │                    │                            │  │
│  │    │            ▼                    ▼                            │  │
│  │    │    Experience Replay        Optimizer                        │  │
│  │    │            │                    │                            │  │
│  │    └──── heuristics ◄────────────────┘                            │  │
│  └─────────────────────────────────────────────┬─────────────────────┘  │
│                                                │                        │
│  ┌── V2 SAFETY KERNEL ─────────────────────────┼─────────────────────┐  │
│  │                                             ▼                     │  │
│  │  Immune System ──→ Memory CI ──→ Memory Store                     │  │
│  │  (scan threats)   (quarantine)  (7 types × 5 statuses)            │  │
│  │                                                                   │  │
│  │  Prompt Compiler ── Token Budget ── Credit Assignment             │  │
│  │  Trace System ── JSONL logs ── Offline analysis                   │  │
│  │  RunMode ── EVAL_TEST blocks all writes                           │  │
│  └─────────────────────────────────────────────┬─────────────────────┘  │
│                                                │                        │
│  ┌── INFRASTRUCTURE ───────────────────────────┼─────────────────────┐  │
│  │                                             ▼                     │  │
│  │  LLM Backends: OpenRouter · Groq · OpenAI · Ollama · HF · ...     │  │
│  │  Robust Parser: TOML → JSON → field extraction → regex            │  │
│  │  Tools: Calculator · PythonExec · ReadFile · WriteFile            │  │
│  │  Streaming · Observability · Cost Tracking · Registry             │  │
│  └───────────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────────┘
```
---
## 5. The Core Engine
### Actor (`actor.py`)
The decision-maker. Given the current state and purpose, it decides what action to take.
**Key design:** The Actor doesn't evaluate itself. That's the Purpose Function's job. This separation prevents self-confirmation bias (you wouldn't let a student grade their own exam).
The Actor's prompt is **dynamically composed** from three tiers of memory:
- **Strategic:** High-level rules ("When coding, handle edge cases first")
- **Procedural:** Step-by-step procedures ("1. Read tests. 2. Handle bases. 3. Implement.")
- **Tool tips:** Action-specific advice ("When using submit_code, check boundaries")
### Purpose Function (`purpose_function.py`)
The critic. A separate LLM call that scores every state transition on a 0-10 scale.
**Seven anti-gaming rules:**
1. Evidence required – cite specific state changes
2. No credit for intentions – score actual results, not plans
3. No sycophancy – don't inflate scores to be encouraging
4. Monotonic scale – 0 = nothing done, 10 = task complete
5. Anti-gaming – flag superficial state manipulation
6. Consistency – same state gets same score (enforced by cache)
7. Confidence – uncertain evaluations get reduced weight
### Experience Replay (`experience_replay.py`)
Stores completed trajectories and retrieves relevant ones for future tasks.
**Two-phase retrieval** (from MemRL, arxiv:2601.03192):
1. **Recall:** Find trajectories similar to the current task (embedding similarity)
2. **Re-rank:** Order by Q-value utility (how useful was this memory when retrieved before?)
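A toy sketch of the two phases, assuming each stored trajectory carries an embedding and a Q-value (field names hypothetical; the real retrieval lives in `experience_replay.py`):

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_emb, trajectories, recall_k=20, final_k=5):
    """Phase 1 (recall): top-k by embedding similarity.
    Phase 2 (re-rank): order the recalled set by Q-value utility."""
    recalled = sorted(trajectories,
                      key=lambda t: cosine(query_emb, t["embedding"]),
                      reverse=True)[:recall_k]
    return sorted(recalled, key=lambda t: t["q_value"], reverse=True)[:final_k]
```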
### Optimizer (`optimizer.py`)
Extracts reusable heuristics from successful trajectories.
Uses the **CER distillation pattern** (arxiv:2506.06698): abstract away specific details with `{variable}` placeholders so heuristics generalize across tasks.
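Concretely, a distilled heuristic keeps its `{variable}` slots until use time, when they are filled against the current task. A sketch (helper name hypothetical):

```python
def instantiate(heuristic: str, **bindings) -> str:
    """Fill the {variable} placeholders of a distilled heuristic."""
    return heuristic.format(**bindings)

# A heuristic distilled from the fibonacci run in Section 3:
h = "When writing {function_type} functions, handle edge cases first, then iterate."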
### Orchestrator (`orchestrator.py`)
The main loop that ties everything together. For each step:
1. Actor decides → 2. Environment executes → 3. Critic scores → 4. Step recorded → 5. Check termination
After each task: store trajectory → optimize → sync heuristics to Actor memory.
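The step loop can be sketched as follows (component interfaces illustrative; the real loop lives in `orchestrator.py`):

```python
def run_task(actor, env, critic, purpose, max_steps=10):
    """One Purpose-MDP episode: decide, execute, score, record."""
    state, trajectory = env.reset(purpose), []
    for _ in range(max_steps):
        action = actor.decide(purpose, state)      # 1. Actor decides
        next_state, done = env.execute(action)     # 2. Environment executes
        score = critic.score(state, next_state)    # 3. Critic scores
        trajectory.append((state, action, score))  # 4. Step recorded
        state = next_state
        if done:                                   # 5. Check termination
            break
    return trajectory
```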
---
## 6. The V2 Safety Kernel
V1 let the agent learn freely. V2 adds guardrails.
### Memory System (`memory.py`)
Seven memory types, each with different trust priors:
| Type | Example | Trust |
|------|---------|-------|
| `purpose_contract` | "Build a web scraper" | High (user-defined) |
| `user_preference` | "Always cite sources" | High (human-taught) |
| `skill_card` | "When coding, test edges first" | Medium (learned) |
| `episodic_case` | "fib(0)=0 was a tricky case" | Medium (observed) |
| `failure_pattern` | "Don't use recursion for large n" | Medium (learned from failure) |
| `critic_calibration` | "Score 7 for 3/4 tests passing" | Low (meta-learned) |
| `tool_policy` | "search: only use at target location" | Medium (learned) |
Five statuses: `candidate` → `quarantined` → `promoted` (or `rejected`) → `archived`.
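The kind/status lattice maps naturally onto two enums. A sketch of one plausible encoding (the real definitions are in `memory.py`; the transition table reflects one reading of the status flow above):

```python
from enum import Enum

class MemoryKind(Enum):
    PURPOSE_CONTRACT = "purpose_contract"
    USER_PREFERENCE = "user_preference"
    SKILL_CARD = "skill_card"
    EPISODIC_CASE = "episodic_case"
    FAILURE_PATTERN = "failure_pattern"
    CRITIC_CALIBRATION = "critic_calibration"
    TOOL_POLICY = "tool_policy"

class MemoryStatus(Enum):
    CANDIDATE = "candidate"
    QUARANTINED = "quarantined"
    PROMOTED = "promoted"
    REJECTED = "rejected"
    ARCHIVED = "archived"

# Legal status transitions in the CI pipeline (illustrative).
TRANSITIONS = {
    MemoryStatus.CANDIDATE: {MemoryStatus.QUARANTINED, MemoryStatus.REJECTED},
    MemoryStatus.QUARANTINED: {MemoryStatus.PROMOTED, MemoryStatus.REJECTED},
    MemoryStatus.PROMOTED: {MemoryStatus.ARCHIVED},
}
```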
### Immune System (`immune.py`)
Scans every candidate memory for 5 threat categories:
- **Prompt injection** – "Ignore previous instructions..."
- **Score manipulation** – "Always score 10..."
- **Tool misuse** – "subprocess.call('rm -rf /')..."
- **Privacy leaks** – API keys, emails, file paths
- **Scope overreach** – memory tries to affect all agents when it should be scoped
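A toy version of a pattern-based scan over three of these categories (regexes illustrative, far simpler than the real scanners):

```python
import re

# Illustrative patterns; the real rule set in immune.py is richer.
THREATS = {
    "prompt_injection": re.compile(r"ignore (all )?previous instructions", re.I),
    "score_manipulation": re.compile(r"always score \d+", re.I),
    "privacy_leak": re.compile(r"(sk-[A-Za-z0-9]{16,}|api[_ ]?key)", re.I),
}

def immune_scan(text: str) -> list[str]:
    """Return the threat categories a candidate memory trips."""
    return [name for name, pat in THREATS.items() if pat.search(text)]
```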
### Memory CI (`memory_ci.py`)
The promotion pipeline:
```
candidate → immune_scan() → quarantined → replay_test → promote/reject
```
No memory reaches the agent's prompt without passing every gate.
### Prompt Compiler (`compiler.py`)
Selects which memories to include under a token budget. Ranked by:
`score = 0.4 × relevance + 0.3 × trust + 0.3 × utility`
Returns `included_memory_ids` for credit assignment β only memories that were in the prompt get Q-value updates after the step.
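The ranking-plus-budget step can be sketched as a greedy selection (field names hypothetical):

```python
def compile_prompt(memories, token_budget):
    """Rank memories by 0.4*relevance + 0.3*trust + 0.3*utility,
    then greedily pack the best ones that fit the token budget."""
    ranked = sorted(
        memories,
        key=lambda m: 0.4 * m["relevance"] + 0.3 * m["trust"] + 0.3 * m["utility"],
        reverse=True,
    )
    included_memory_ids, used = [], 0
    for m in ranked:
        if used + m["tokens"] <= token_budget:
            included_memory_ids.append(m["id"])
            used += m["tokens"]
    # Only these ids are eligible for Q-value updates after the step.
    return included_memory_ids
```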
### Trace System (`trace.py`)
Every run produces a JSONL trace β the raw material for debugging, evaluation, and memory extraction. Traces are append-only and immutable.
### RunMode (`v2_types.py`)
Three modes with strict enforcement:
- `LEARNING_TRAIN` – full read/write
- `LEARNING_VALIDATION` – read + staging writes
- `EVAL_TEST` – **no writes of any kind** (the only mode whose numbers you can report)
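One way to enforce this is a guard called before every memory write. A sketch (names illustrative; the real `RunMode` lives in `v2_types.py`):

```python
from enum import Enum

class RunMode(Enum):
    LEARNING_TRAIN = "learning_train"
    LEARNING_VALIDATION = "learning_validation"
    EVAL_TEST = "eval_test"

class MemoryWriteBlocked(RuntimeError):
    pass

def guard_write(mode: RunMode, staging: bool = False) -> None:
    """Raise before any memory write the current mode forbids."""
    if mode is RunMode.EVAL_TEST:
        raise MemoryWriteBlocked("EVAL_TEST blocks all memory writes")
    if mode is RunMode.LEARNING_VALIDATION and not staging:
        raise MemoryWriteBlocked("LEARNING_VALIDATION allows staging writes only")
```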
---
## 7. Research Implementations
Five papers implemented as standalone modules:
### Meta-Rewarding (`meta_rewarding.py`)
*From: arxiv:2407.19594 – Llama-3-8B: 22.9% → 39.4% on AlpacaEval*
A meta-judge evaluates the Purpose Function's own judgments. Good judgments become calibration examples in memory. The critic improves through in-context learning.
### Self-Taught Evaluators (`self_taught.py`)
*From: arxiv:2408.02666*
Generates synthetic contrast pairs (correct vs wrong evaluation) from traces. Creates an automatic curriculum: as the critic improves, the contrast pairs get harder.
### Prompt Optimizer (`prompt_optimizer.py`)
*From DSPy: arxiv:2310.03714 – +8% on GSM8K, +50% on BBH*
Instead of hand-crafting prompts, define signatures (`state, action → score, reasoning`) and let the optimizer bootstrap effective few-shot demonstrations by trial-and-error.
### LLM Compiler (`llm_compiler.py`)
*From: arxiv:2312.04511 – up to 3.7× latency speedup*
Instead of sequential tool calls (ReAct), plan ALL calls upfront as a DAG and execute independent ones in parallel.
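The execution side can be sketched with `asyncio`: each node starts as soon as its dependencies finish, so independent branches overlap. A toy executor, not the real implementation:

```python
import asyncio

async def execute_dag(tasks, deps):
    """Execute tool calls as a DAG.

    tasks: {name: async callable}; deps: {name: [prerequisite names]}.
    Each call fires as soon as its prerequisites complete, so
    independent calls run concurrently instead of sequentially.
    """
    futures = {}

    async def run(name):
        # Wait for prerequisites, then fire this tool call.
        await asyncio.gather(*(futures[d] for d in deps.get(name, [])))
        return await tasks[name]()

    # Schedule everything up front; dependencies gate actual execution.
    for name in tasks:
        futures[name] = asyncio.ensure_future(run(name))
    results = await asyncio.gather(*futures.values())
    return dict(zip(tasks, results))
```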
### Retroformer (`retroformer.py`)
*From: arxiv:2308.02151*
Structured reflection on completed traces – extracts four types of memories (skills, failures, policies, observations). Replaces raw heuristic distillation with typed, safety-scanned memory extraction.
---
## 8. Breakthroughs
Six features that go beyond existing frameworks:
### B1: Self-Improving Critic
The Purpose Function's own quality improves over time. Meta-judging after each task generates calibration examples that make future scoring more accurate.
### B2: Mixture-of-Heuristics (MoH)
Like DeepSeek's Mixture-of-Experts: out of 100+ heuristics, only K=5 are activated per step. **Shared heuristics** (always active, like "check edge cases") + **routed heuristics** (task-specific, selected by Q × similarity). Knowledge grows; compute stays flat.
### B3: Hindsight Heuristic Relabeling
From HER (arxiv:1707.01495): when a task fails, instead of discarding the trajectory, ask "what DID this accomplish?" and extract heuristics for what was achieved. Learn from failures, not just successes.
### B4: Heuristic Evolution
Periodically generalize specific heuristics into abstract patterns:
- Before: "When fibonacci fails on 0, return 0"
- After: "When {function} fails on {boundary_value}, add an explicit base case"
Creates an automatic curriculum: specific → general → abstract.
### B5: Cross-Domain Transfer
Heuristics learned on one set of coding tasks can help with different, unseen coding tasks. The `test_cross_domain_transfer()` function measures this: train on [fibonacci, factorial], test on [palindrome, fizzbuzz].
### B6: Adversarial Robustness
The `AdversarialHardener` generates 30 adversarial inputs (prompt injections, score hacks, API key leaks) and 10 benign inputs, then tests the immune system against all of them. Current results: **93% catch rate, 0% false positives.**
---
## 9. User-Facing Layers
### Easy API (`easy.py`)
The `purpose()` function analyzes your description and builds the right team:
| You say | It builds |
|---------|-----------|
| "Write Python code" | architect + coder + tester |
| "Research papers" | researcher + analyst |
| "Write blog posts" | writer + editor |
| "Analyze data" | analyst + reporter |
| "Help me" | general assistant |
### Unified Capabilities (`unified.py`)
Five competing framework philosophies in one composable layer:
| Capability | Inspired By | Usage |
|-----------|-------------|-------|
| `Agent()` | OpenAI Agents SDK | One-liner agent creation |
| `Graph()` | LangGraph | Conditional branching, cycles, fan-out |
| `parallel()` | CrewAI | Concurrent task execution |
| `Conversation()` | AutoGen | Agent-to-agent message passing |
| `KnowledgeStore` | LlamaIndex | RAG as a tool |
### Robust Parser (`robust_parser.py`)
The universal solution to "LLMs can't reliably produce JSON":
- Tries TOML first (fewer tokens than JSON)
- Falls back to JSON
- Falls back to field extraction by regex
- Never crashes. Always returns something usable.
---
## 10. How Models Are Handled
### resolve_backend()
One function routes to any provider:
```python
resolve_backend("openrouter:meta-llama/llama-3.3-70b-instruct")
resolve_backend("groq:llama-3.3-70b-versatile")
resolve_backend("openai:gpt-4o")
resolve_backend("ollama:qwen3:1.7b") # Local, free
resolve_backend("hf:Qwen/Qwen3-32B")
resolve_backend("together:meta-llama/Llama-3.3-70B-Instruct-Turbo")
```
### SLM-Native Design
The framework was designed for small models (0.6B-3B params):
- **Grammar-constrained output** via Ollama (forces valid structure from any model)
- **Prompt compression** for small context windows (8K-32K)
- **Tool RAG** β only load relevant tools into the prompt (saves tokens)
- **TOML format** – fewer tokens than JSON
### _strip_thinking()
Handles reasoning models (Qwen3, DeepSeek-R1) that wrap output in `<think>` tags. Automatically strips the thinking and returns only the answer.
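A minimal version of the idea (the real helper may handle more tag variants):

```python
import re

# Match <think>...</think> blocks, including newlines, plus trailing space.
THINK_RE = re.compile(r"<think>.*?</think>\s*", re.DOTALL)

def strip_thinking(text: str) -> str:
    """Remove reasoning-model think blocks, keeping only the answer."""
    return THINK_RE.sub("", text).strip()
```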
---
## 11. The Research
Every design decision traces to a published paper. The full list with citations, methodology sections, and implementation mapping is in [COMPILED_RESEARCH.md](COMPILED_RESEARCH.md).
The formal framework β **Purpose-MDP** with 5 axioms, 3 theorems, and convergence proofs β is in [PURPOSE_LEARNING.md](PURPOSE_LEARNING.md).
**Key theoretical result:** The self-improvement is a form of Potential-Based Reward Shaping (Ng et al., 1999). Our ΔΦ = Φ(s') − Φ(s) preserves the optimal policy while providing dense per-step feedback. The heuristic library converges to a fixed point under bounded capacity.
---
## 12. For Contributors
### File Structure
```
purpose_agent/
├── types.py             # State, Action, Trajectory, Heuristic, PurposeScore
├── llm_backend.py       # LLMBackend ABC + HF, OpenAI, Mock + resolve_backend
├── slm_backends.py      # Ollama, llama-cpp, prompt compression, SLM registry
├── robust_parser.py     # Universal parser: TOML → JSON → regex (never crashes)
├── actor.py             # ReAct agent with 3-tier memory prompts
├── purpose_function.py  # Φ(s) critic with 7 anti-gaming rules
├── experience_replay.py # Two-phase retrieval (similarity → Q-value)
├── optimizer.py         # Trajectory → heuristic distillation
├── orchestrator.py      # Main step loop
├── v2_types.py          # RunMode, MemoryScope, PurposeScoreV2
├── trace.py             # JSONL execution traces
├── memory.py            # 7 MemoryKinds × 5 MemoryStatuses
├── compiler.py          # Token-budgeted prompt compilation
├── immune.py            # 5 threat scanners
├── memory_ci.py         # Quarantine → scan → test → promote/reject
├── evalport.py          # Pluggable evaluation protocol
├── benchmark_v2.py      # Train/val/test splits with ablation
├── meta_rewarding.py    # Self-improving critic (arxiv:2407.19594)
├── self_taught.py       # Synthetic critic training (arxiv:2408.02666)
├── prompt_optimizer.py  # DSPy-style bootstrap (arxiv:2310.03714)
├── llm_compiler.py      # Parallel tool DAG (arxiv:2312.04511)
├── retroformer.py       # Structured reflection (arxiv:2308.02151)
├── breakthroughs.py     # MoH, hindsight relabeling, heuristic evolution, etc.
├── unified.py           # Agent, Graph, parallel, Conversation, KnowledgeStore
├── easy.py              # purpose(), Team, quickstart wizard
├── tools.py             # Secure built-in tools
├── streaming.py         # Async + event streaming
├── observability.py     # Cost tracking, callbacks
├── multi_agent.py       # Agent teams with shared learning
├── hitl.py              # Human-in-the-loop + checkpointing
├── evaluation.py        # V1 benchmark runner
├── registry.py          # Plugin system
├── __init__.py          # 103 exports
└── __main__.py          # CLI entry point
```
### Adding a New LLM Provider
```python
# In your code (no core edits needed):
from purpose_agent import backend_registry, OpenAICompatibleBackend
backend_registry.register("my_provider",
lambda model, api_key: OpenAICompatibleBackend(
model=model, base_url="https://api.myprovider.com/v1", api_key=api_key
))
```
### Adding a New Tool
```python
from purpose_agent import FunctionTool
def my_search(query: str) -> str:
"""Search my database."""
return db.search(query)
tool = FunctionTool.from_function(my_search)
```
### Running Tests
```bash
python tests/test_core.py # 21 unit tests
python tests/launch_readiness.py # 119 comprehensive tests
python benchmarks/validate.py # Mock benchmark suite
python benchmarks/validate.py --quick # Fast smoke test
```