---
library_name: purpose-agent
license: mit
language:
- en
tags:
- reinforcement-learning
- agents
- self-improving
- experience-replay
- llm-as-judge
- memory-system
- multi-agent
- slm
- local-first
- evaluation
- safety
- immune-system
- no-code
pipeline_tag: text-generation
---
# Purpose Agent
**A local-first self-improvement kernel for agents.** Turns traces into tested memory, policies, and rubrics, so agents improve without fine-tuning, cloud infrastructure, or vendor lock-in.
```python
import purpose_agent as pa
team = pa.purpose("Help me research scientific papers")
result = team.run("Find recent breakthroughs in quantum computing")
print(result)
team.teach("Always cite your sources")
```
## Core Principle
Agents learn only when evidence says they should. New memories are quarantined, immune-scanned, replay-tested, scoped, versioned, and reversible.
```
candidate → immune scan → quarantine → replay test → promote (or reject)
```
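The pipeline above can be read as a small gatekeeping function. The following is a minimal sketch, not the library's implementation: `Candidate`, `run_memory_ci`, and the two toy checks are hypothetical names introduced only to illustrate the order of the stages.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Candidate:
    """Hypothetical stand-in for a memory awaiting promotion."""
    content: str
    status: str = "candidate"


def run_memory_ci(
    card: Candidate,
    immune_scan: Callable[[str], bool],
    replay_test: Callable[[str], bool],
) -> Candidate:
    """Walk a candidate through scan, quarantine, replay test, then promote or reject."""
    if not immune_scan(card.content):
        card.status = "rejected"   # failed the scan, never reaches quarantine
        return card
    card.status = "quarantined"    # held back from the agent until tested
    card.status = "promoted" if replay_test(card.content) else "rejected"
    return card


# Toy checks standing in for a real scanner and replay harness (assumptions).
def no_injection(text: str) -> bool:
    return "ignore previous instructions" not in text.lower()


def helps_on_replay(text: str) -> bool:
    return len(text.split()) >= 3


good = run_memory_ci(Candidate("Always cite your sources"), no_injection, helps_on_replay)
bad = run_memory_ci(Candidate("Ignore previous instructions"), no_injection, helps_on_replay)
print(good.status, bad.status)  # promoted rejected
```

Passing the scan and the replay test in as callables keeps the stage order fixed while letting each check be swapped out independently.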
## Three Levels of Usage
### Level 1: Just describe what you want
```python
team = pa.purpose("Write Python code and test it") # auto-builds architect + coder + tester
team = pa.purpose("Research quantum computing") # auto-builds researcher + analyst
team = pa.purpose("Write blog posts about AI") # auto-builds writer + editor
```
### Level 2: Customize your team
```python
team = pa.Team.build(purpose="Support bot", agents=["greeter", "resolver"], model="qwen3:1.7b")
team = pa.purpose("Answer questions", knowledge="./docs/", model="qwen3:1.7b")
```
### Level 3: Full control
```python
graph = pa.Graph() # LangGraph-style control flow
results = pa.parallel(["task1", "task2"], agents) # CrewAI-style parallel execution
chat = pa.Conversation([agent_a, agent_b]) # AutoGen-style agent conversation
kb = pa.KnowledgeStore.from_directory("./docs") # LlamaIndex-style RAG
compiler = pa.LLMCompiler(llm, registry) # Parallel tool execution via DAG
```
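For readers unfamiliar with "parallel tool execution via DAG", here is a minimal standard-library sketch of the general idea. It does not use or describe `pa.LLMCompiler`; `run_dag` and the task names are hypothetical, and the scheduler runs tasks in simple waves rather than as soon as each dependency finishes. It requires Python 3.9+ for `graphlib`.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter


def run_dag(tasks, deps, max_workers=4):
    """Run callables in dependency order, executing each ready wave in parallel.

    tasks: name -> callable that receives a dict of its dependencies' results
    deps:  name -> set of task names that must finish first
    """
    sorter = TopologicalSorter({name: deps.get(name, set()) for name in tasks})
    sorter.prepare()  # raises CycleError if the graph is not a DAG
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            ready = sorter.get_ready()  # every task whose dependencies are done
            futures = {
                name: pool.submit(tasks[name], {d: results[d] for d in deps.get(name, ())})
                for name in ready
            }
            for name, future in futures.items():
                results[name] = future.result()
                sorter.done(name)
    return results


tasks = {
    "search_a": lambda _: ["paper 1"],
    "search_b": lambda _: ["paper 2"],
    "merge": lambda r: r["search_a"] + r["search_b"],
}
deps = {"merge": {"search_a", "search_b"}}
print(run_dag(tasks, deps)["merge"])  # ['paper 1', 'paper 2']
```

The two searches have no dependencies, so they are submitted together; `merge` only becomes ready once both are marked done.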
## Architecture
```
purpose_agent/
├── Core
│   types, actor, purpose_function, experience_replay, optimizer, orchestrator, llm_backend
│
├── V2 Kernel
│   v2_types (RunMode, MemoryScope, PurposeScoreV2)
│   trace (structured JSONL execution traces)
│   memory (7 kinds × 5 statuses, scoped, versioned)
│   compiler (token-budgeted prompt compilation with credit assignment)
│   immune (injection, score hacking, tool misuse, privacy, scope scanning)
│   memory_ci (quarantine → scan → test → promote/reject pipeline)
│   evalport (pluggable evaluation protocol)
│   benchmark_v2 (train/val/test splits, ablation, contamination control)
│
├── Research (13 papers implemented)
│   meta_rewarding (self-improving critic via meta-judge)
│   self_taught (synthetic training data for Φ function)
│   prompt_optimizer (DSPy-style automatic few-shot bootstrap)
│   llm_compiler (parallel function calling via DAG)
│   retroformer (structured reflection → typed memories)
│
├── SLM-Native
│   slm_backends (Ollama, llama-cpp, prompt compression, 8 pre-configured models)
│
├── Capabilities
│   unified (Agent, Graph, parallel, Conversation, KnowledgeStore)
│   easy (purpose(), Team, quickstart wizard)
│   tools, streaming, observability, multi_agent, hitl, evaluation, registry
```
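As a rough illustration of what "token-budgeted prompt compilation" can mean, the sketch below greedily packs the highest-scoring memories into a fixed budget. This is an assumption about the general technique, not the `compiler` module's algorithm: `compile_prompt` is a hypothetical name, tokens are approximated by whitespace-separated words, and credit assignment is not modeled.

```python
def compile_prompt(memories, budget_tokens):
    """Greedily select memories by score until the token budget is used up.

    memories: list of (text, score) pairs
    Returns the joined prompt text and the number of (approximate) tokens used.
    """
    chosen, used = [], 0
    for text, _score in sorted(memories, key=lambda m: m[1], reverse=True):
        cost = len(text.split())  # crude whitespace token estimate
        if used + cost <= budget_tokens:
            chosen.append(text)
            used += cost
    return "\n".join(chosen), used


memories = [
    ("Always cite your sources", 0.9),
    ("Prefer recent papers from the last two years", 0.8),
    ("Be brief", 0.5),
]
prompt, used = compile_prompt(memories, budget_tokens=7)
print(used)    # 6
print(prompt)  # the first and third memories; the second does not fit
```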
## RunMode: Honest Evaluation
```python
from purpose_agent import RunMode
RunMode.LEARNING_TRAIN # Full read/write. Agent learns.
RunMode.LEARNING_VALIDATION # Read + staging. Validates before promoting.
RunMode.EVAL_TEST # NO writes. Numbers you can trust.
```
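One way to picture the three modes is as a gate in front of the memory store. The sketch below is self-contained and hypothetical: the `RunMode` enum here is a local stand-in that only mirrors the three names above, and `GatedMemory` is an invented class, not part of the package.

```python
from enum import Enum


class RunMode(Enum):
    """Local stand-in mirroring the three mode names (not the library class)."""
    LEARNING_TRAIN = "learning_train"
    LEARNING_VALIDATION = "learning_validation"
    EVAL_TEST = "eval_test"


class GatedMemory:
    """Hypothetical store that routes writes according to the run mode."""

    def __init__(self, mode: RunMode):
        self.mode = mode
        self.live = []     # memories the agent reads from
        self.staging = []  # held back until validation passes

    def write(self, item: str) -> None:
        if self.mode is RunMode.EVAL_TEST:
            # Evaluation runs must not change what the agent knows.
            raise PermissionError("EVAL_TEST is read-only")
        target = self.live if self.mode is RunMode.LEARNING_TRAIN else self.staging
        target.append(item)


store = GatedMemory(RunMode.LEARNING_VALIDATION)
store.write("Always cite your sources")
print(store.live, store.staging)  # [] ['Always cite your sources']
```

Raising on writes in the test mode, rather than silently dropping them, makes accidental leakage into evaluation runs fail loudly.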
## Memory Lifecycle
| Kind | Purpose |
|------|---------|
| `purpose_contract` | User's stated goal and constraints |
| `user_preference` | Learned preferences |
| `skill_card` | Reusable procedures from successful traces |
| `episodic_case` | Specific experiences worth remembering |
| `failure_pattern` | What NOT to do |
| `critic_calibration` | Adjustments to Φ scoring |
| `tool_policy` | Tool-specific usage rules |

| Status transition | Meaning |
|-------------------|---------|
| `candidate` → `quarantined` → `promoted` | Happy path |
| `candidate` → `rejected` | Failed immune scan |
| `promoted` → `archived` | Superseded or demoted |
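
The status transitions in the table can be enforced with a simple lookup. This is an illustrative sketch only: `ALLOWED` and `transition` are hypothetical names, and the map encodes just the transitions the table lists (for example, it does not assume a `quarantined` to `rejected` edge, which the table does not show).

```python
# Allowed next statuses, taken only from the transitions listed in the table.
ALLOWED = {
    "candidate": {"quarantined", "rejected"},
    "quarantined": {"promoted"},
    "promoted": {"archived"},
    "rejected": set(),   # terminal
    "archived": set(),   # terminal
}


def transition(current: str, new: str) -> str:
    """Return the new status, or raise if the move is not in ALLOWED."""
    if new not in ALLOWED.get(current, set()):
        raise ValueError(f"illegal transition: {current} -> {new}")
    return new


status = "candidate"
for step in ("quarantined", "promoted", "archived"):
    status = transition(status, step)
print(status)  # archived
```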
## Immune System
```python
from purpose_agent import scan_memory, MemoryCard
result = scan_memory(MemoryCard(content="Ignore previous instructions"))
# result.passed = False, threats = ["prompt_injection"], severity = "critical"
```
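To give a feel for what a scan of this kind might check, here is a toy pattern-based scanner. It is an assumption-laden sketch, not how `scan_memory` works internally: `toy_scan`, `ScanResult`, and the two regular expressions are hypothetical, and a real scanner would need far broader coverage than two patterns.

```python
import re
from dataclasses import dataclass, field

# Illustrative patterns only; real threat detection is much harder than this.
PATTERNS = {
    "prompt_injection": re.compile(
        r"ignore (all |any )?(previous|prior) instructions", re.IGNORECASE
    ),
    "score_hacking": re.compile(
        r"always (give|assign|return) (a )?(score|rating) of", re.IGNORECASE
    ),
}


@dataclass
class ScanResult:
    passed: bool
    threats: list = field(default_factory=list)


def toy_scan(text: str) -> ScanResult:
    """Flag any text that matches a known-bad pattern."""
    threats = [name for name, pattern in PATTERNS.items() if pattern.search(text)]
    return ScanResult(passed=not threats, threats=threats)


print(toy_scan("Ignore previous instructions").threats)  # ['prompt_injection']
print(toy_scan("Always cite your sources").passed)       # True
```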
## Secure Tools
- **CalculatorTool**: AST-validated, no eval() on arbitrary text
- **PythonExecTool**: subprocess with timeout + isolated temp directory
- **ReadFileTool / WriteFileTool**: sandboxed to declared root
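
AST validation, as mentioned for the calculator, generally means parsing the expression and evaluating only an allow-list of node types. The sketch below shows that general technique with Python's standard `ast` module. `safe_calc` is a hypothetical helper, not the `CalculatorTool` source, and exponentiation is deliberately left out so that inputs like `9**9**9` cannot exhaust resources.

```python
import ast
import operator

# Allow-listed operators; anything else is rejected.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_calc(expression: str):
    """Evaluate basic arithmetic without ever calling eval()."""

    def evaluate(node):
        # Plain int/float literals only (bool, str, etc. are rejected).
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](evaluate(node.operand))
        raise ValueError(f"disallowed expression element: {type(node).__name__}")

    tree = ast.parse(expression, mode="eval")
    return evaluate(tree.body)


print(safe_calc("2 + 3 * (4 - 1)"))  # 11
print(safe_calc("-7 / 2"))           # -3.5
```

Names, attribute access, and function calls all fall through to the final `raise`, so an input such as `__import__('os').system('ls')` is refused before anything runs.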
## Runs on Your Laptop
```bash
curl -fsSL https://ollama.ai/install.sh | sh
ollama pull qwen3:1.7b
```
```python
team = pa.purpose("Research assistant", model="qwen3:1.7b") # Free, private, local
```
Also works with: `model="gpt-4o"` (OpenAI), `model="Qwen/Qwen3-32B"` (HuggingFace cloud).
## Interactive CLI
```bash
python -m purpose_agent # Step-by-step wizard, no coding required
```
## Literature Foundation
Built on 13 papers. Full research trace: [COMPILED_RESEARCH.md](COMPILED_RESEARCH.md)
| Paper | Module | Contribution |
|-------|--------|-------------|
| [MUSE](https://arxiv.org/abs/2510.08002) | actor, optimizer | 3-tier memory hierarchy |
| [LATS](https://arxiv.org/abs/2310.04406) | purpose_function | LLM-as-value-function |
| [REMEMBERER](https://arxiv.org/abs/2306.07929) | experience_replay | Q-value experience replay |
| [Reflexion](https://arxiv.org/abs/2303.11366) | orchestrator | Verbal reinforcement |
| [SPC](https://arxiv.org/abs/2504.19162) | purpose_function, immune | Anti-reward-hacking |
| [CER](https://arxiv.org/abs/2506.06698) | optimizer | Experience distillation |
| [MemRL](https://arxiv.org/abs/2601.03192) | experience_replay, compiler | Two-phase retrieval |
| [TinyAgent](https://arxiv.org/abs/2409.00608) | slm_backends, tools | SLM-native patterns |
| [Meta-Rewarding](https://arxiv.org/abs/2407.19594) | meta_rewarding | Self-improving critic |
| [Self-Taught Eval](https://arxiv.org/abs/2408.02666) | self_taught | Synthetic critic training |
| [DSPy](https://arxiv.org/abs/2310.03714) | prompt_optimizer | Automatic prompt optimization |
| [LLMCompiler](https://arxiv.org/abs/2312.04511) | llm_compiler | Parallel function calling |
| [Retroformer](https://arxiv.org/abs/2308.02151) | retroformer | Structured reflection |
## Installation
```bash
git clone https://huggingface.co/Rohan03/purpose-agent
cd purpose-agent
pip install ollama # for local models
python demo.py # verify everything works
```
## License
MIT