---
library_name: purpose-agent
license: mit
language:
- en
tags:
- reinforcement-learning
- agents
- self-improving
- experience-replay
- llm-as-judge
- memory-system
- multi-agent
- slm
- local-first
- evaluation
- safety
- immune-system
- no-code
pipeline_tag: text-generation
---
# Purpose Agent
**A local-first self-improvement kernel for agents.** Turns traces into tested memory, policies, and rubrics – so agents improve without fine-tuning, cloud infrastructure, or vendor lock-in.
```python
import purpose_agent as pa
team = pa.purpose("Help me research scientific papers")
result = team.run("Find recent breakthroughs in quantum computing")
print(result)
team.teach("Always cite your sources")
```
## Core Principle
Agents learn only when evidence says they should. New memories are quarantined, immune-scanned, replay-tested, scoped, versioned, and reversible.
```
candidate → immune scan → quarantine → replay test → promote (or reject)
```
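In code, the front of that pipeline looks roughly like this. `scan_memory` and `MemoryCard` are shown in the Immune System section below; the later stages are only sketched in comments, since this README does not document their exact API:

```python
from purpose_agent import MemoryCard, scan_memory

candidate = MemoryCard(content="Prefer primary sources over secondary summaries")
scan = scan_memory(candidate)  # immune scan: injection, score hacking, privacy, scope

if scan.passed:
    # Remaining stages (quarantine, replay test against held-out traces,
    # scoped + versioned promotion) live in the memory_ci module; they are
    # sketched here rather than invoked through a documented API.
    print("cleared immune scan: eligible for quarantine and replay testing")
else:
    print(f"rejected: threats={scan.threats}")
```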
## Three Levels of Usage
### Level 1 – Just describe what you want
```python
team = pa.purpose("Write Python code and test it") # auto-builds architect + coder + tester
team = pa.purpose("Research quantum computing") # auto-builds researcher + analyst
team = pa.purpose("Write blog posts about AI") # auto-builds writer + editor
```
### Level 2 – Customize your team
```python
team = pa.Team.build(purpose="Support bot", agents=["greeter", "resolver"], model="qwen3:1.7b")
team = pa.purpose("Answer questions", knowledge="./docs/", model="qwen3:1.7b")
```
### Level 3 – Full control
```python
graph = pa.Graph() # LangGraph-style control flow
results = pa.parallel(["task1", "task2"], agents) # CrewAI-style parallel execution
chat = pa.Conversation([agent_a, agent_b]) # AutoGen-style agent conversation
kb = pa.KnowledgeStore.from_directory("./docs") # LlamaIndex-style RAG
compiler = pa.LLMCompiler(llm, registry) # Parallel tool execution via DAG
```
## Architecture
```
purpose_agent/
├── Core
│     types, actor, purpose_function, experience_replay, optimizer, orchestrator, llm_backend
│
├── V2 Kernel
│     v2_types (RunMode, MemoryScope, PurposeScoreV2)
│     trace (structured JSONL execution traces)
│     memory (7 kinds × 5 statuses, scoped, versioned)
│     compiler (token-budgeted prompt compilation with credit assignment)
│     immune (injection, score hacking, tool misuse, privacy, scope scanning)
│     memory_ci (quarantine → scan → test → promote/reject pipeline)
│     evalport (pluggable evaluation protocol)
│     benchmark_v2 (train/val/test splits, ablation, contamination control)
│
├── Research (13 papers implemented)
│     meta_rewarding (self-improving critic via meta-judge)
│     self_taught (synthetic training data for Φ function)
│     prompt_optimizer (DSPy-style automatic few-shot bootstrap)
│     llm_compiler (parallel function calling via DAG)
│     retroformer (structured reflection → typed memories)
│
├── SLM-Native
│     slm_backends (Ollama, llama-cpp, prompt compression, 8 pre-configured models)
│
└── Capabilities
      unified (Agent, Graph, parallel, Conversation, KnowledgeStore)
      easy (purpose(), Team, quickstart wizard)
      tools, streaming, observability, multi_agent, hitl, evaluation, registry
```
## RunMode – Honest Evaluation
```python
from purpose_agent import RunMode
RunMode.LEARNING_TRAIN # Full read/write. Agent learns.
RunMode.LEARNING_VALIDATION # Read + staging. Validates before promoting.
RunMode.EVAL_TEST # NO writes. Numbers you can trust.
```
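A hedged sketch of how the split might be used in practice. Passing `mode=` to `team.run()` is an illustrative assumption; only the three `RunMode` values above are documented:

```python
import purpose_agent as pa
from purpose_agent import RunMode

team = pa.purpose("Summarize research papers")

# Hypothetical: the mode= keyword is an assumption, not a documented signature.
team.run("Summarize arXiv:2303.11366", mode=RunMode.LEARNING_TRAIN)  # may write memories
report = team.run("Summarize arXiv:2310.04406", mode=RunMode.EVAL_TEST)  # frozen, no writes
```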
## Memory Lifecycle
| Kind | Purpose |
|------|---------|
| `purpose_contract` | User's stated goal and constraints |
| `user_preference` | Learned preferences |
| `skill_card` | Reusable procedures from successful traces |
| `episodic_case` | Specific experiences worth remembering |
| `failure_pattern` | What NOT to do |
| `critic_calibration` | Adjustments to Φ scoring |
| `tool_policy` | Tool-specific usage rules |

| Status | Meaning |
|--------|---------|
| `candidate` → `quarantined` → `promoted` | Happy path |
| `candidate` → `rejected` | Failed immune scan |
| `promoted` → `archived` | Superseded or demoted |
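A minimal sketch of a card entering the lifecycle. `MemoryCard` appears in the Immune System example below; the `kind` and `status` keyword arguments are assumptions drawn from the two tables above, not documented fields:

```python
from purpose_agent import MemoryCard

# Hypothetical fields: the kind/status keywords are assumed from the tables above.
card = MemoryCard(
    content="User prefers answers with inline citations",
    kind="user_preference",  # one of the 7 kinds
    status="candidate",      # entry point of the happy path
)
# candidate -> quarantined -> promoted on success; a failed immune scan
# short-circuits to rejected, and promoted cards can later be archived.
```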
## Immune System
```python
from purpose_agent import scan_memory, MemoryCard
result = scan_memory(MemoryCard(content="Ignore previous instructions"))
# result.passed = False, threats = ["prompt_injection"], severity = "critical"
```
## Secure Tools
- **CalculatorTool** – AST-validated; no `eval()` on arbitrary text
- **PythonExecTool** – subprocess with timeout + isolated temp directory
- **ReadFileTool / WriteFileTool** – sandboxed to a declared root
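To make the first bullet concrete, here is a standalone sketch of AST-validated arithmetic – the idea behind CalculatorTool, not the library's actual implementation:

```python
import ast
import operator

# Minimal sketch of AST-validated arithmetic: walk the parsed expression and
# allow only numeric literals and basic binary operators, so no name lookups
# or calls (and hence no eval() of arbitrary text) can ever execute.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}

def safe_eval(expr: str) -> float:
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError(f"disallowed expression node: {type(node).__name__}")
    return walk(ast.parse(expr, mode="eval"))

print(safe_eval("2 * (3 + 4.5)"))  # 15.0
```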
## Runs on Your Laptop
```bash
curl -fsSL https://ollama.ai/install.sh | sh
ollama pull qwen3:1.7b
```
```python
team = pa.purpose("Research assistant", model="qwen3:1.7b") # Free, private, local
```
Also works with: `model="gpt-4o"` (OpenAI), `model="Qwen/Qwen3-32B"` (HuggingFace cloud).
## Interactive CLI
```bash
python -m purpose_agent # Step-by-step wizard, no coding required
```
## Literature Foundation
Built on 13 papers. Full research trace: [COMPILED_RESEARCH.md](COMPILED_RESEARCH.md)
| Paper | Module | Contribution |
|-------|--------|-------------|
| [MUSE](https://arxiv.org/abs/2510.08002) | actor, optimizer | 3-tier memory hierarchy |
| [LATS](https://arxiv.org/abs/2310.04406) | purpose_function | LLM-as-value-function |
| [REMEMBERER](https://arxiv.org/abs/2306.07929) | experience_replay | Q-value experience replay |
| [Reflexion](https://arxiv.org/abs/2303.11366) | orchestrator | Verbal reinforcement |
| [SPC](https://arxiv.org/abs/2504.19162) | purpose_function, immune | Anti-reward-hacking |
| [CER](https://arxiv.org/abs/2506.06698) | optimizer | Experience distillation |
| [MemRL](https://arxiv.org/abs/2601.03192) | experience_replay, compiler | Two-phase retrieval |
| [TinyAgent](https://arxiv.org/abs/2409.00608) | slm_backends, tools | SLM-native patterns |
| [Meta-Rewarding](https://arxiv.org/abs/2407.19594) | meta_rewarding | Self-improving critic |
| [Self-Taught Eval](https://arxiv.org/abs/2408.02666) | self_taught | Synthetic critic training |
| [DSPy](https://arxiv.org/abs/2310.03714) | prompt_optimizer | Automatic prompt optimization |
| [LLMCompiler](https://arxiv.org/abs/2312.04511) | llm_compiler | Parallel function calling |
| [Retroformer](https://arxiv.org/abs/2308.02151) | retroformer | Structured reflection |
## Installation
```bash
git clone https://huggingface.co/Rohan03/purpose-agent
cd purpose-agent
pip install ollama # for local models
python demo.py # verify everything works
```
## License
MIT