# Purpose Agent: Architecture Documentation
|
|
| > For developers building on the framework, researchers understanding the theory, and anyone curious about how self-improving agents work. |
|
|
| --- |
|
|
| ## Table of Contents |
|
|
| 1. [What Is Purpose Agent?](#1-what-is-purpose-agent) |
| 2. [The Big Idea (No Jargon)](#2-the-big-idea) |
3. [How It Works: Step by Step](#3-how-it-works)
| 4. [Architecture Map](#4-architecture-map) |
| 5. [The Core Engine](#5-the-core-engine) |
| 6. [The V2 Safety Kernel](#6-the-v2-safety-kernel) |
| 7. [Research Implementations](#7-research-implementations) |
| 8. [Breakthroughs](#8-breakthroughs) |
| 9. [User-Facing Layers](#9-user-facing-layers) |
| 10. [How Models Are Handled](#10-how-models-are-handled) |
| 11. [The Research Behind It](#11-the-research) |
| 12. [For Contributors](#12-for-contributors) |
|
|
| --- |
|
|
| ## 1. What Is Purpose Agent? |
|
|
Purpose Agent is a Python framework for building AI agents that **get better with experience**, without retraining the underlying AI model.
|
|
| Traditional AI agents run the same way every time. Purpose Agent is different: after each task, it extracts lessons from what worked and what didn't, tests those lessons for safety, and uses them to perform better next time. |
|
|
**Think of it like this:** A new employee follows the company handbook. After their first week, they have personal notes: shortcuts they discovered, mistakes they won't repeat, tips from colleagues. Those notes make them better at their job without changing who they are. Purpose Agent does this for AI.
|
|
| --- |
|
|
| ## 2. The Big Idea |
|
|
| ### For Non-Technical Readers |
|
|
| ``` |
You give it a purpose → It builds a team → It does the work → It learns → Next time is better
| ``` |
|
|
| **You say:** "Help me write Python code." |
| **It builds:** An architect (plans), a coder (writes), and a tester (reviews). |
| **It runs:** The coder writes fibonacci. The tester checks it. A critic scores the work. |
| **It learns:** "When writing recursive functions, check base cases first." This lesson is saved. |
| **Next time:** The coder starts by checking base cases. It's faster and more reliable. |
|
|
| ### For Technical Readers |
|
|
The framework implements a **Purpose-MDP**, a Markov Decision Process where:
|
|
- A **Purpose Function Φ(s)** evaluates every state transition on a 0-10 scale
| - An **Optimizer** distills successful trajectories into reusable heuristics |
| - Heuristics are ranked by **Q-values** (how often they helped) and selected via **Mixture-of-Heuristics** (sparse activation, like MoE) |
| - An **immune system** scans every new heuristic for prompt injection, score manipulation, and other threats |
- A **Memory CI pipeline** quarantines, tests, and promotes heuristics before they affect agent behavior
|
|
| This is **Potential-Based Reward Shaping** (Ng et al., 1999) applied to LLM agents, with formal convergence guarantees. See [PURPOSE_LEARNING.md](PURPOSE_LEARNING.md). |
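
As a minimal sketch, the shaping signal is just the difference between two potential evaluations. The `phi` stub below stands in for the real LLM-backed Purpose Function; all names here are illustrative, not the framework's actual API:

```python
from dataclasses import dataclass

@dataclass
class Transition:
    state_before: str
    state_after: str

def phi(state: str) -> float:
    """Stand-in for the Purpose Function: an LLM call scoring a state 0-10."""
    return 10.0 if "ALL PASSED" in state else 0.0

def shaping_signal(t: Transition) -> float:
    # Potential-based shaping: positive only when the agent made real progress.
    return phi(t.state_after) - phi(t.state_before)

print(shaping_signal(Transition("no code yet", "Tests: 4/4 ALL PASSED")))  # 10.0
```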
|
|
| --- |
|
|
## 3. How It Works: Step by Step
|
|
| Here's what happens when you run `team.run("Write a fibonacci function")`: |
|
|
| ### Step 1: The Actor Decides |
|
|
| The Actor module receives: |
| - The **purpose** ("Write a fibonacci function") |
- The **current state** (empty; no code written yet)
| - Any **learned heuristics** from past runs |
|
|
| It generates a thought process and an action: |
| > "I should write a function that handles base cases fib(0)=0 and fib(1)=1, then use iteration for the general case." |
> → Action: `submit_code` with the Python implementation.
| |
| ### Step 2: The Environment Executes |
| |
| The code is run against test cases. The environment returns a new state: |
| > "Tests: 4/4 ALL PASSED" |
| |
| ### Step 3: The Purpose Function Scores |
| |
| A **separate LLM call** (not the same as the actor) evaluates the transition: |
- Φ(state_before) = 0.0 (nothing done)
- Φ(state_after) = 10.0 (all tests pass)
| - Delta = +10.0 (huge improvement) |
| - Evidence: "Tests changed from 0/4 to 4/4" |
| |
| The Purpose Function has **7 anti-gaming rules** that prevent the agent from tricking itself into thinking it's doing well when it isn't. |
| |
| ### Step 4: The Optimizer Extracts Heuristics |
| |
| After the task, the Optimizer looks at the trajectory and extracts reusable patterns: |
| - **Strategic:** "When writing {function_type} functions, handle edge cases first, then iterate." |
| - **Procedural:** "1. Read test cases. 2. Handle base cases. 3. Implement general case. 4. Submit." |
| - **Tool tip:** "When submitting code, check boundary conditions: 0, 1, empty, negative." |
|
|
| ### Step 5: Safety Checks |
|
|
| Every new heuristic goes through the **immune system**: |
- Is it a prompt injection? ("Ignore all previous instructions") → **REJECTED**
- Does it try to manipulate scores? ("Always score 10") → **REJECTED**
- Does it contain secrets? (API keys, passwords) → **REJECTED**
- Is it safe? ("Check base cases first") → **QUARANTINED** (pending replay test)
|
|
After passing replay testing → **PROMOTED** (active in future runs).
|
|
| ### Step 6: Next Run Benefits |
|
|
| When the agent runs again, the **Prompt Compiler** selects the top-K heuristics by: |
| - **Relevance** to the current task (embedding similarity) |
| - **Trust** (immune-scanned and verified) |
- **Utility** (Q-value: how often it helped before)
|
|
| These are injected into the prompt. The agent is now better without any model retraining. |
|
|
| --- |
|
|
| ## 4. Architecture Map |
|
|
| ``` |
                          PURPOSE AGENT

┌─ USER LAYER ───────────────────────────────────────────────────┐
│  pa.purpose("...") → Team → team.run("...")                    │
│  pa.Agent()   pa.Graph()   pa.parallel()   pa.Conversation()   │
└───────────────────────────────┬────────────────────────────────┘
                                │
┌─ CORE ENGINE ─────────────────┴────────────────────────────────┐
│                                                                │
│   Actor ──► Environment ──► Purpose Function (Φ)               │
│     │            │                  │                          │
│     │        State s'         Φ(s) → Φ(s')                     │
│     │            │                  │                          │
│     │            ▼                  ▼                          │
│     │     Experience Replay     Optimizer                      │
│     │            │                  │                          │
│     └──── heuristics ◄──────────────┘                          │
│                                                                │
└───────────────────────────────┬────────────────────────────────┘
                                │
┌─ V2 SAFETY KERNEL ────────────┴────────────────────────────────┐
│                                                                │
│   Immune System ──► Memory CI ──► Memory Store                 │
│   (scan threats)    (quarantine)  (7 types × 5 statuses)       │
│                                                                │
│   Prompt Compiler ─── Token Budget ─── Credit Assignment       │
│   Trace System ─── JSONL logs ─── Offline analysis             │
│   RunMode ─── EVAL_TEST blocks all writes                      │
│                                                                │
└───────────────────────────────┬────────────────────────────────┘
                                │
┌─ INFRASTRUCTURE ──────────────┴────────────────────────────────┐
│                                                                │
│   LLM Backends: OpenRouter · Groq · OpenAI · Ollama · HF · ... │
│   Robust Parser: TOML → JSON → field extraction → regex        │
│   Tools: Calculator · PythonExec · ReadFile · WriteFile        │
│   Streaming · Observability · Cost Tracking · Registry         │
│                                                                │
└────────────────────────────────────────────────────────────────┘
| ``` |
|
|
| --- |
|
|
| ## 5. The Core Engine |
|
|
| ### Actor (`actor.py`) |
| The decision-maker. Given the current state and purpose, it decides what action to take. |
|
|
| **Key design:** The Actor doesn't evaluate itself. That's the Purpose Function's job. This separation prevents self-confirmation bias (you wouldn't let a student grade their own exam). |
|
|
| The Actor's prompt is **dynamically composed** from three tiers of memory: |
| - **Strategic:** High-level rules ("When coding, handle edge cases first") |
| - **Procedural:** Step-by-step procedures ("1. Read tests. 2. Handle bases. 3. Implement.") |
| - **Tool tips:** Action-specific advice ("When using submit_code, check boundaries") |
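
A minimal sketch of how such a three-tier prompt might be assembled (function and section names are illustrative, not the actual `actor.py` API):

```python
def compose_actor_prompt(purpose: str, state: str,
                         strategic: list[str], procedural: list[str],
                         tool_tips: list[str]) -> str:
    """Stitch the three memory tiers into a single prompt."""
    def block(title: str, items: list[str]) -> str:
        return title + "\n" + "\n".join(f"- {h}" for h in items)

    return "\n\n".join([
        f"PURPOSE: {purpose}",
        f"CURRENT STATE: {state}",
        block("STRATEGIC RULES:", strategic),
        block("PROCEDURES:", procedural),
        block("TOOL TIPS:", tool_tips),
    ])
```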
| |
| ### Purpose Function (`purpose_function.py`) |
| The critic. A separate LLM call that scores every state transition on a 0-10 scale. |
|
|
| **Seven anti-gaming rules:** |
1. Evidence required: cite specific state changes
2. No credit for intentions: score actual results, not plans
3. No sycophancy: don't inflate scores to be encouraging
4. Monotonic scale: 0 = nothing done, 10 = task complete
5. Anti-gaming: flag superficial state manipulation
6. Consistency: same state gets same score (enforced by cache)
7. Confidence: uncertain evaluations get reduced weight
|
|
| ### Experience Replay (`experience_replay.py`) |
| Stores completed trajectories and retrieves relevant ones for future tasks. |
| |
| **Two-phase retrieval** (from MemRL, arxiv:2601.03192): |
| 1. **Recall:** Find trajectories similar to the current task (embedding similarity) |
| 2. **Re-rank:** Order by Q-value utility (how useful was this memory when retrieved before?) |
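
In sketch form (memory records carrying `emb` and `q_value` fields are an assumption for illustration):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na, nb = math.sqrt(sum(x * x for x in a)), math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(task_emb: list[float], memories: list[dict],
             recall_k: int = 20, final_k: int = 5) -> list[dict]:
    # Phase 1: recall a broad candidate set by embedding similarity.
    recalled = sorted(memories, key=lambda m: cosine(task_emb, m["emb"]),
                      reverse=True)[:recall_k]
    # Phase 2: re-rank the survivors by learned Q-value utility.
    return sorted(recalled, key=lambda m: m["q_value"], reverse=True)[:final_k]
```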
| |
| ### Optimizer (`optimizer.py`) |
| Extracts reusable heuristics from successful trajectories. |
| |
| Uses the **CER distillation pattern** (arxiv:2506.06698): abstract away specific details with `{variable}` placeholders so heuristics generalize across tasks. |
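
For example, a distilled heuristic keeps `{placeholder}` slots where the source trajectory had task-specific details (the template text here is illustrative):

```python
# Distilled with placeholders so it generalizes beyond the source trajectory:
heuristic = "When writing {function_type} functions, handle {edge_case} first."

# Instantiated at prompt time for the task at hand:
print(heuristic.format(function_type="recursive", edge_case="base cases"))
# -> When writing recursive functions, handle base cases first.
```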
| |
| ### Orchestrator (`orchestrator.py`) |
| The main loop that ties everything together. For each step: |
1. Actor decides → 2. Environment executes → 3. Critic scores → 4. Step recorded → 5. Check termination
| |
After each task: store trajectory → optimize → sync heuristics to Actor memory.
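
A sketch of that loop, with the `actor`/`env`/`critic` interfaces simplified for illustration:

```python
def run_episode(actor, env, critic, max_steps: int = 10) -> list[tuple]:
    state, trajectory = env.reset(), []
    for _ in range(max_steps):
        action = actor.decide(state)                # 1. Actor decides
        state_next, done = env.execute(action)      # 2. Environment executes
        score = critic.evaluate(state, state_next)  # 3. Critic scores
        trajectory.append((state, action, state_next, score))  # 4. Recorded
        state = state_next
        if done:                                    # 5. Check termination
            break
    return trajectory
```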
| |
| --- |
| |
| ## 6. The V2 Safety Kernel |
| |
| V1 let the agent learn freely. V2 adds guardrails. |
| |
| ### Memory System (`memory.py`) |
| Seven memory types, each with different trust priors: |
| |
| | Type | Example | Trust | |
| |------|---------|-------| |
| | `purpose_contract` | "Build a web scraper" | High (user-defined) | |
| | `user_preference` | "Always cite sources" | High (human-taught) | |
| | `skill_card` | "When coding, test edges first" | Medium (learned) | |
| | `episodic_case` | "fib(0)=0 was a tricky case" | Medium (observed) | |
| | `failure_pattern` | "Don't use recursion for large n" | Medium (learned from failure) | |
| | `critic_calibration` | "Score 7 for 3/4 tests passing" | Low (meta-learned) | |
| | `tool_policy` | "search: only use at target location" | Medium (learned) | |
|
|
Five statuses: `candidate` → `quarantined` → `promoted` (or `rejected`) → `archived`.
|
|
| ### Immune System (`immune.py`) |
| Scans every candidate memory for 5 threat categories: |
- **Prompt injection**: "Ignore previous instructions..."
- **Score manipulation**: "Always score 10..."
- **Tool misuse**: "subprocess.call('rm -rf /')..."
- **Privacy leaks**: API keys, emails, file paths
- **Scope overreach**: memory tries to affect all agents when it should be scoped
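
A toy version of the scanner (illustrative patterns only; the real `immune.py` covers far more cases than these four regexes):

```python
import re

THREAT_PATTERNS = {
    "prompt_injection":   re.compile(r"ignore (all )?previous instructions", re.I),
    "score_manipulation": re.compile(r"always score \d+", re.I),
    "tool_misuse":        re.compile(r"rm\s+-rf|subprocess\.call", re.I),
    "privacy_leak":       re.compile(r"(api[_-]?key|password)\s*[:=]", re.I),
}

def immune_scan(candidate: str) -> list[str]:
    """Return every threat category the candidate memory trips."""
    return [name for name, pat in THREAT_PATTERNS.items() if pat.search(candidate)]

assert immune_scan("Ignore all previous instructions") == ["prompt_injection"]
assert immune_scan("Check base cases first") == []
```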
|
|
| ### Memory CI (`memory_ci.py`) |
| The promotion pipeline: |
| ``` |
candidate → immune_scan() → quarantined → replay_test → promote/reject
| ``` |
| No memory reaches the agent's prompt without passing every gate. |
| |
| ### Prompt Compiler (`compiler.py`) |
| Selects which memories to include under a token budget. Ranked by: |
`score = 0.4 × relevance + 0.3 × trust + 0.3 × utility`
| |
Returns `included_memory_ids` for credit assignment: only memories that were in the prompt get Q-value updates after the step.
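
A sketch of the selection (memory dicts with `relevance`, `trust`, `utility`, `tokens`, and `id` fields are an assumption for illustration):

```python
def compile_prompt(memories: list[dict], budget_tokens: int):
    """Greedy selection by blended score until the token budget is spent."""
    ranked = sorted(
        memories,
        key=lambda m: 0.4 * m["relevance"] + 0.3 * m["trust"] + 0.3 * m["utility"],
        reverse=True,
    )
    included, used = [], 0
    for m in ranked:
        if used + m["tokens"] <= budget_tokens:
            included.append(m)
            used += m["tokens"]
    # The ids of what actually made it in drive later Q-value updates.
    return included, [m["id"] for m in included]
```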
| |
| ### Trace System (`trace.py`) |
Every run produces a JSONL trace: the raw material for debugging, evaluation, and memory extraction. Traces are append-only and immutable.
| |
| ### RunMode (`v2_types.py`) |
| Three modes with strict enforcement: |
- `LEARNING_TRAIN`: full read/write
- `LEARNING_VALIDATION`: read + staging writes
- `EVAL_TEST`: **no writes of any kind** (the only mode whose numbers you can report)
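
A minimal sketch of the enforcement idea (the actual `v2_types.py` surface and error type may differ):

```python
from enum import Enum

class RunMode(Enum):
    LEARNING_TRAIN = "learning_train"
    LEARNING_VALIDATION = "learning_validation"
    EVAL_TEST = "eval_test"

def assert_writable(mode: RunMode, staging: bool = False) -> None:
    """Call before any memory write; raises if the mode forbids it."""
    if mode is RunMode.EVAL_TEST:
        raise PermissionError("EVAL_TEST runs must never write memory")
    if mode is RunMode.LEARNING_VALIDATION and not staging:
        raise PermissionError("LEARNING_VALIDATION allows staging writes only")
```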
|
|
| --- |
|
|
| ## 7. Research Implementations |
|
|
| Five papers implemented as standalone modules: |
|
|
| ### Meta-Rewarding (`meta_rewarding.py`) |
*From: arxiv:2407.19594 (Llama-3-8B: 22.9% → 39.4% on AlpacaEval)*
| |
| A meta-judge evaluates the Purpose Function's own judgments. Good judgments become calibration examples in memory. The critic improves through in-context learning. |
| |
| ### Self-Taught Evaluators (`self_taught.py`) |
| *From: arxiv:2408.02666* |
|
|
| Generates synthetic contrast pairs (correct vs wrong evaluation) from traces. Creates an automatic curriculum: as the critic improves, the contrast pairs get harder. |
|
|
| ### Prompt Optimizer (`prompt_optimizer.py`) |
*From DSPy: arxiv:2310.03714 (+8% on GSM8K, +50% on BBH)*
| |
Instead of hand-crafting prompts, define signatures (`state, action → score, reasoning`) and let the optimizer bootstrap effective few-shot demonstrations by trial and error.
| |
| ### LLM Compiler (`llm_compiler.py`) |
*From: arxiv:2312.04511 (up to 3.7× latency speedup)*
|
|
| Instead of sequential tool calls (ReAct), plan ALL calls upfront as a DAG and execute independent ones in parallel. |
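
A self-contained sketch of the idea using `asyncio` (the task names and dependency map are illustrative, not the module's API):

```python
import asyncio

async def execute_dag(tasks: dict, deps: dict) -> dict:
    """Run each call as soon as its dependencies finish; siblings run in parallel."""
    events = {name: asyncio.Event() for name in tasks}
    results: dict = {}

    async def run(name: str) -> None:
        for dep in deps.get(name, ()):
            await events[dep].wait()  # wait only on direct dependencies
        results[name] = await tasks[name](results)
        events[name].set()

    await asyncio.gather(*(run(name) for name in tasks))
    return results

async def search(_):
    return "docs"

async def fetch(_):
    return "page"

async def merge(r):
    return f"{r['search']} + {r['fetch']}"

# search and fetch have no dependencies, so they execute concurrently.
plan = {"search": search, "fetch": fetch, "merge": merge}
print(asyncio.run(execute_dag(plan, {"merge": ["search", "fetch"]})))
```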
|
|
| ### Retroformer (`retroformer.py`) |
| *From: arxiv:2308.02151* |
|
|
Structured reflection on completed traces extracts four types of memories (skills, failures, policies, observations). This replaces raw heuristic distillation with typed, safety-scanned memory extraction.
|
|
| --- |
|
|
| ## 8. Breakthroughs |
|
|
| Six features that go beyond existing frameworks: |
|
|
| ### B1: Self-Improving Critic |
| The Purpose Function's own quality improves over time. Meta-judging after each task generates calibration examples that make future scoring more accurate. |
|
|
| ### B2: Mixture-of-Heuristics (MoH) |
Like DeepSeek's Mixture-of-Experts: out of 100+ heuristics, only K=5 are activated per step. **Shared heuristics** (always active, like "check edge cases") + **routed heuristics** (task-specific, selected by Q × similarity). Knowledge grows; compute stays flat.
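
A sketch of the routing step (heuristic records with `q` and `emb` fields are an assumption for illustration):

```python
def select_heuristics(task_emb, shared, routed, k=5, similarity=None):
    """MoH routing: shared heuristics always fire; routed ones compete on Q × sim."""
    similarity = similarity or (lambda a, b: sum(x * y for x, y in zip(a, b)))
    ranked = sorted(routed, key=lambda h: h["q"] * similarity(task_emb, h["emb"]),
                    reverse=True)
    return list(shared) + ranked[:k]  # prompt size stays O(len(shared) + k)
```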
|
|
| ### B3: Hindsight Heuristic Relabeling |
| From HER (arxiv:1707.01495): when a task fails, instead of discarding the trajectory, ask "what DID this accomplish?" and extract heuristics for what was achieved. Learn from failures, not just successes. |
|
|
| ### B4: Heuristic Evolution |
| Periodically generalize specific heuristics into abstract patterns: |
| - Before: "When fibonacci fails on 0, return 0" |
| - After: "When {function} fails on {boundary_value}, add an explicit base case" |
| |
Creates an automatic curriculum: specific → general → abstract.
| |
| ### B5: Cross-Domain Transfer |
Heuristics learned on some coding tasks can help with different, unseen coding tasks. The `test_cross_domain_transfer()` function measures this: train on [fibonacci, factorial], test on [palindrome, fizzbuzz].
|
|
| ### B6: Adversarial Robustness |
The `AdversarialHardener` generates 30 adversarial inputs (prompt injections, score hacks, API key leaks) and 10 benign inputs, then tests the immune system against all of them. Current results: **93% catch rate, 0% false positives.**
|
|
| --- |
|
|
| ## 9. User-Facing Layers |
|
|
| ### Easy API (`easy.py`) |
| The `purpose()` function analyzes your description and builds the right team: |
|
|
| | You say | It builds | |
| |---------|-----------| |
| | "Write Python code" | architect + coder + tester | |
| | "Research papers" | researcher + analyst | |
| | "Write blog posts" | writer + editor | |
| | "Analyze data" | analyst + reporter | |
| | "Help me" | general assistant | |
|
|
| ### Unified Capabilities (`unified.py`) |
| Five competing framework philosophies in one composable layer: |
|
|
| | Capability | Inspired By | Usage | |
| |-----------|-------------|-------| |
| | `Agent()` | OpenAI Agents SDK | One-liner agent creation | |
| | `Graph()` | LangGraph | Conditional branching, cycles, fan-out | |
| | `parallel()` | CrewAI | Concurrent task execution | |
| | `Conversation()` | AutoGen | Agent-to-agent message passing | |
| | `KnowledgeStore` | LlamaIndex | RAG as a tool | |
|
|
| ### Robust Parser (`robust_parser.py`) |
| The universal solution to "LLMs can't reliably produce JSON": |
| - Tries TOML first (fewer tokens than JSON) |
| - Falls back to JSON |
| - Falls back to field extraction by regex |
| - Never crashes. Always returns something usable. |
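
A condensed sketch of the fallback chain (the real `robust_parser.py` is more thorough; `tomllib` is stdlib from Python 3.11 onward):

```python
import json
import re
import tomllib

def robust_parse(text: str) -> dict:
    try:
        return tomllib.loads(text)  # 1. TOML first: fewer tokens, forgiving
    except Exception:
        pass
    try:
        return json.loads(text)     # 2. then strict JSON
    except Exception:
        pass
    # 3. last resort: scrape key = "value" pairs by regex; never raises
    return dict(re.findall(r'(\w+)\s*[:=]\s*"([^"]*)"', text))
```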
| |
| --- |
| |
| ## 10. How Models Are Handled |
| |
| ### resolve_backend() |
| One function routes to any provider: |
|
|
| ```python |
| resolve_backend("openrouter:meta-llama/llama-3.3-70b-instruct") |
| resolve_backend("groq:llama-3.3-70b-versatile") |
| resolve_backend("openai:gpt-4o") |
| resolve_backend("ollama:qwen3:1.7b") # Local, free |
| resolve_backend("hf:Qwen/Qwen3-32B") |
| resolve_backend("together:meta-llama/Llama-3.3-70B-Instruct-Turbo") |
| ``` |
|
|
| ### SLM-Native Design |
| The framework was designed for small models (0.6B-3B params): |
| - **Grammar-constrained output** via Ollama (forces valid structure from any model) |
| - **Prompt compression** for small context windows (8K-32K) |
- **Tool RAG**: only load relevant tools into the prompt (saves tokens)
- **TOML format**: fewer tokens than JSON
|
|
| ### _strip_thinking() |
| Handles reasoning models (Qwen3, DeepSeek-R1) that wrap output in `<think>` tags. Automatically strips the thinking and returns only the answer. |
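
A sketch of the core regex (the shipped helper is private; the public name here is illustrative):

```python
import re

def strip_thinking(text: str) -> str:
    """Drop <think>...</think> blocks that reasoning models emit before the answer."""
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()

assert strip_thinking("<think>2+2 is 4...</think>4") == "4"
```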
|
|
| --- |
|
|
| ## 11. The Research |
|
|
| Every design decision traces to a published paper. The full list with citations, methodology sections, and implementation mapping is in [COMPILED_RESEARCH.md](COMPILED_RESEARCH.md). |
|
|
The formal framework (a **Purpose-MDP** with 5 axioms, 3 theorems, and convergence proofs) is in [PURPOSE_LEARNING.md](PURPOSE_LEARNING.md).
|
|
**Key theoretical result:** The self-improvement is a form of Potential-Based Reward Shaping (Ng et al., 1999). Our ΔΦ = Φ(s') - Φ(s) preserves the optimal policy while providing dense per-step feedback. The heuristic library converges to a fixed point under bounded capacity.
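
For reference, the standard shaping form from Ng et al. (1999), of which the ΔΦ above is the undiscounted (γ = 1) case:

```latex
F(s, a, s') = \gamma \, \Phi(s') - \Phi(s)
\qquad
R'(s, a, s') = R(s, a, s') + F(s, a, s')
```

Because F is a telescoping difference of potentials, it changes returns only by a state-dependent offset, so the set of optimal policies is unchanged.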
|
|
| --- |
|
|
| ## 12. For Contributors |
|
|
| ### File Structure |
|
|
| ``` |
| purpose_agent/ |
├── types.py              # State, Action, Trajectory, Heuristic, PurposeScore
├── llm_backend.py        # LLMBackend ABC + HF, OpenAI, Mock + resolve_backend
├── slm_backends.py       # Ollama, llama-cpp, prompt compression, SLM registry
├── robust_parser.py      # Universal parser: TOML → JSON → regex (never crashes)
├── actor.py              # ReAct agent with 3-tier memory prompts
├── purpose_function.py   # Φ(s) critic with 7 anti-gaming rules
├── experience_replay.py  # Two-phase retrieval (similarity → Q-value)
├── optimizer.py          # Trajectory → heuristic distillation
├── orchestrator.py       # Main step loop
├── v2_types.py           # RunMode, MemoryScope, PurposeScoreV2
├── trace.py              # JSONL execution traces
├── memory.py             # 7 MemoryKinds × 5 MemoryStatuses
├── compiler.py           # Token-budgeted prompt compilation
├── immune.py             # 5 threat scanners
├── memory_ci.py          # Quarantine → scan → test → promote/reject
├── evalport.py           # Pluggable evaluation protocol
├── benchmark_v2.py       # Train/val/test splits with ablation
├── meta_rewarding.py     # Self-improving critic (arxiv:2407.19594)
├── self_taught.py        # Synthetic critic training (arxiv:2408.02666)
├── prompt_optimizer.py   # DSPy-style bootstrap (arxiv:2310.03714)
├── llm_compiler.py       # Parallel tool DAG (arxiv:2312.04511)
├── retroformer.py        # Structured reflection (arxiv:2308.02151)
├── breakthroughs.py      # MoH, hindsight relabeling, heuristic evolution, etc.
├── unified.py            # Agent, Graph, parallel, Conversation, KnowledgeStore
├── easy.py               # purpose(), Team, quickstart wizard
├── tools.py              # Secure built-in tools
├── streaming.py          # Async + event streaming
├── observability.py      # Cost tracking, callbacks
├── multi_agent.py        # Agent teams with shared learning
├── hitl.py               # Human-in-the-loop + checkpointing
├── evaluation.py         # V1 benchmark runner
├── registry.py           # Plugin system
├── __init__.py           # 103 exports
└── __main__.py           # CLI entry point
| ``` |
|
|
| ### Adding a New LLM Provider |
|
|
| ```python |
| # In your code (no core edits needed): |
| from purpose_agent import backend_registry, OpenAICompatibleBackend |
| |
| backend_registry.register("my_provider", |
| lambda model, api_key: OpenAICompatibleBackend( |
| model=model, base_url="https://api.myprovider.com/v1", api_key=api_key |
| )) |
| ``` |
|
|
| ### Adding a New Tool |
|
|
| ```python |
| from purpose_agent import FunctionTool |
| |
| def my_search(query: str) -> str: |
| """Search my database.""" |
| return db.search(query) |
| |
| tool = FunctionTool.from_function(my_search) |
| ``` |
|
|
| ### Running Tests |
|
|
| ```bash |
| python tests/test_core.py # 21 unit tests |
| python tests/launch_readiness.py # 119 comprehensive tests |
| python benchmarks/validate.py # Mock benchmark suite |
| python benchmarks/validate.py --quick # Fast smoke test |
| ``` |
|
|