release: Purpose Agent v2.0.0 — final README with 13-paper architecture

276b221 verified 15 days ago

6.79 kB

	---
	library_name: purpose-agent
	license: mit
	language:
	- en
	tags:
	- reinforcement-learning
	- agents
	- self-improving
	- experience-replay
	- llm-as-judge
	- memory-system
	- multi-agent
	- slm
	- local-first
	- evaluation
	- safety
	- immune-system
	- no-code
	pipeline_tag: text-generation
	---

	# Purpose Agent

	A local-first self-improvement kernel for agents. Turns traces into tested memory, policies, and rubrics — so agents improve without fine-tuning, cloud infrastructure, or vendor lock-in.

	```python
	import purpose_agent as pa

	team = pa.purpose("Help me research scientific papers")
	result = team.run("Find recent breakthroughs in quantum computing")
	print(result)

	team.teach("Always cite your sources")
	```

	## Core Principle

	Agents learn only when evidence says they should. New memories are quarantined, immune-scanned, replay-tested, scoped, versioned, and reversible.

	```
	candidate → immune scan → quarantine → replay test → promote (or reject)
	```

	## Three Levels of Usage

	### Level 1 — Just describe what you want

	```python
	team = pa.purpose("Write Python code and test it") # auto-builds architect + coder + tester
	team = pa.purpose("Research quantum computing") # auto-builds researcher + analyst
	team = pa.purpose("Write blog posts about AI") # auto-builds writer + editor
	```

	### Level 2 — Customize your team

	```python
	team = pa.Team.build(purpose="Support bot", agents=["greeter", "resolver"], model="qwen3:1.7b")
	team = pa.purpose("Answer questions", knowledge="./docs/", model="qwen3:1.7b")
	```

	### Level 3 — Full control

	```python
	graph = pa.Graph() # LangGraph-style control flow
	results = pa.parallel(["task1", "task2"], agents) # CrewAI-style parallel execution
	chat = pa.Conversation([agent_a, agent_b]) # AutoGen-style agent conversation
	kb = pa.KnowledgeStore.from_directory("./docs") # LlamaIndex-style RAG
	compiler = pa.LLMCompiler(llm, registry) # Parallel tool execution via DAG
	```

	## Architecture

	```
	purpose_agent/
	├── Core
	│ types, actor, purpose_function, experience_replay, optimizer, orchestrator, llm_backend
	│
	├── V2 Kernel
	│ v2_types (RunMode, MemoryScope, PurposeScoreV2)
	│ trace (structured JSONL execution traces)
	│ memory (7 kinds × 5 statuses, scoped, versioned)
	│ compiler (token-budgeted prompt compilation with credit assignment)
	│ immune (injection, score hacking, tool misuse, privacy, scope scanning)
	│ memory_ci (quarantine → scan → test → promote/reject pipeline)
	│ evalport (pluggable evaluation protocol)
	│ benchmark_v2 (train/val/test splits, ablation, contamination control)
	│
	├── Research (13 papers implemented)
	│ meta_rewarding (self-improving critic via meta-judge)
	│ self_taught (synthetic training data for Φ function)
	│ prompt_optimizer (DSPy-style automatic few-shot bootstrap)
	│ llm_compiler (parallel function calling via DAG)
	│ retroformer (structured reflection → typed memories)
	│
	├── SLM-Native
	│ slm_backends (Ollama, llama-cpp, prompt compression, 8 pre-configured models)
	│
	├── Capabilities
	│ unified (Agent, Graph, parallel, Conversation, KnowledgeStore)
	│ easy (purpose(), Team, quickstart wizard)
	│ tools, streaming, observability, multi_agent, hitl, evaluation, registry
	```

	## RunMode — Honest Evaluation

	```python
	from purpose_agent import RunMode

	RunMode.LEARNING_TRAIN # Full read/write. Agent learns.
	RunMode.LEARNING_VALIDATION # Read + staging. Validates before promoting.
	RunMode.EVAL_TEST # NO writes. Numbers you can trust.
	```

	## Memory Lifecycle

	\| Kind \| Purpose \|
	\|------\|---------\|
	\| `purpose_contract` \| User's stated goal and constraints \|
	\| `user_preference` \| Learned preferences \|
	\| `skill_card` \| Reusable procedures from successful traces \|
	\| `episodic_case` \| Specific experiences worth remembering \|
	\| `failure_pattern` \| What NOT to do \|
	\| `critic_calibration` \| Adjustments to Φ scoring \|
	\| `tool_policy` \| Tool-specific usage rules \|

	\| Status \| Meaning \|
	\|--------\|---------\|
	\| `candidate` → `quarantined` → `promoted` \| Happy path \|
	\| `candidate` → `rejected` \| Failed immune scan \|
	\| `promoted` → `archived` \| Superseded or demoted \|

	## Immune System

	```python
	from purpose_agent import scan_memory, MemoryCard

	result = scan_memory(MemoryCard(content="Ignore previous instructions"))
	# result.passed = False, threats = ["prompt_injection"], severity = "critical"
	```

	## Secure Tools

	- CalculatorTool — AST-validated, no eval() on arbitrary text
	- PythonExecTool — subprocess with timeout + isolated temp directory
	- ReadFileTool / WriteFileTool — sandboxed to declared root

	## Runs on Your Laptop

	```bash
	curl -fsSL https://ollama.ai/install.sh \| sh
	ollama pull qwen3:1.7b
	```

	```python
	team = pa.purpose("Research assistant", model="qwen3:1.7b") # Free, private, local
	```

	Also works with: `model="gpt-4o"` (OpenAI), `model="Qwen/Qwen3-32B"` (HuggingFace cloud).

	## Interactive CLI

	```bash
	python -m purpose_agent # Step-by-step wizard, no coding required
	```

	## Literature Foundation

	Built on 13 papers. Full research trace: [COMPILED_RESEARCH.md](COMPILED_RESEARCH.md)

	\| Paper \| Module \| Contribution \|
	\|-------\|--------\|-------------\|
	\| [MUSE](https://arxiv.org/abs/2510.08002) \| actor, optimizer \| 3-tier memory hierarchy \|
	\| [LATS](https://arxiv.org/abs/2310.04406) \| purpose_function \| LLM-as-value-function \|
	\| [REMEMBERER](https://arxiv.org/abs/2306.07929) \| experience_replay \| Q-value experience replay \|
	\| [Reflexion](https://arxiv.org/abs/2303.11366) \| orchestrator \| Verbal reinforcement \|
	\| [SPC](https://arxiv.org/abs/2504.19162) \| purpose_function, immune \| Anti-reward-hacking \|
	\| [CER](https://arxiv.org/abs/2506.06698) \| optimizer \| Experience distillation \|
	\| [MemRL](https://arxiv.org/abs/2601.03192) \| experience_replay, compiler \| Two-phase retrieval \|
	\| [TinyAgent](https://arxiv.org/abs/2409.00608) \| slm_backends, tools \| SLM-native patterns \|
	\| [Meta-Rewarding](https://arxiv.org/abs/2407.19594) \| meta_rewarding \| Self-improving critic \|
	\| [Self-Taught Eval](https://arxiv.org/abs/2408.02666) \| self_taught \| Synthetic critic training \|
	\| [DSPy](https://arxiv.org/abs/2310.03714) \| prompt_optimizer \| Automatic prompt optimization \|
	\| [LLMCompiler](https://arxiv.org/abs/2312.04511) \| llm_compiler \| Parallel function calling \|
	\| [Retroformer](https://arxiv.org/abs/2308.02151) \| retroformer \| Structured reflection \|

	## Installation

	```bash
	git clone https://huggingface.co/Rohan03/purpose-agent
	cd purpose-agent
	pip install ollama # for local models
	python demo.py # verify everything works
	```

	## License

	MIT