library_name: purpose-agent
license: mit
language:
  - en
tags:
  - reinforcement-learning
  - agents
  - self-improving
  - experience-replay
  - llm-as-judge
  - state-value-evaluation
  - memory-augmented
  - react
  - orchestration
  - modular
  - slm
  - small-language-models
  - multi-agent
  - human-in-the-loop
  - streaming
  - tools
  - evaluation
  - ollama
  - local-models
pipeline_tag: text-generation

Purpose Agent v0.2.0

The world's first SLM-native self-improving agentic framework.

Works with both Small Language Models (0.6B–3B params, local, $0 cost) and Large Language Models (cloud APIs) with equal efficiency. Agents learn from experience via a Purpose Function Φ(s); no fine-tuning is needed.

What Makes This Different

| Feature | Purpose Agent | LangChain | LangGraph | CrewAI | AutoGen | smolagents |
|---|---|---|---|---|---|---|
| Self-Improvement | ✅ Φ(s) + experience replay + heuristic distillation | ❌ | ❌ | ❌ | ❌ | ❌ |
| SLM-Native | ✅ Grammar-constrained JSON, prompt compression, Tool RAG | ❌ | ❌ | ❌ | ❌ | ⚠️ |
| Anti-Reward-Hacking | ✅ 7 strict rules + cache consistency + anomaly detection | ❌ | ❌ | ❌ | ❌ | ❌ |
| 3-Tier Memory | ✅ Strategic/Procedural/Tool with Q-value retrieval | ❌ | ⚠️ | ⚠️ | ❌ | ❌ |
| Multi-Agent with Shared Learning | ✅ Agents learn from each other | ❌ | ⚠️ | ✅ | ✅ | ⚠️ |
| Human Φ Override | ✅ Humans teach the critic → permanent learning | ❌ | ⚠️ | ❌ | ❌ | ❌ |
| Streaming | ✅ Event + token streaming | ✅ | ✅ | ⚠️ | ⚠️ | ✅ |
| Tool Framework | ✅ Schema, validation, retry, Tool RAG | ✅ | ✅ | ✅ | ✅ | ✅ |
| Cost Tracking | ✅ Per-call token + USD tracking | ⚠️ | ⚠️ | ❌ | ❌ | ❌ |
| Benchmark Harness | ✅ Improvement curve tracking | ❌ | ❌ | ❌ | ❌ | ❌ |
| Lightweight | ✅ ~150KB, stdlib only | ❌ | ❌ | ⚠️ | ⚠️ | ✅ |
| Literature-Grounded | ✅ 8 papers implemented | ❌ | ❌ | ❌ | ❌ | ❌ |

Architecture

purpose_agent/
├── types.py              # Core data types
├── llm_backend.py        # Cloud LLM backends (HF, OpenAI, Mock)
├── slm_backends.py       # 🆕 SLM backends (Ollama, llama-cpp, prompt compression)
├── actor.py              # ReAct agent with 3-tier memory
├── purpose_function.py   # Non-hackable Φ(s) critic
├── experience_replay.py  # Two-phase retrieval (similarity + Q-value)
├── optimizer.py          # Trajectory → heuristic distillation
├── orchestrator.py       # Main loop
├── streaming.py          # 🆕 Async engine + event streaming
├── tools.py              # 🆕 Tool framework + built-in tools + Tool RAG
├── observability.py      # 🆕 Cost tracking, callbacks, metrics
├── multi_agent.py        # 🆕 Agent teams with shared learning
├── hitl.py               # 🆕 Human-in-the-loop + checkpointing
└── evaluation.py         # 🆕 Benchmark runner + improvement curves
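
The tree describes `experience_replay.py` as two-phase retrieval (similarity + Q-value). A minimal sketch of that idea follows, assuming a simple word-overlap similarity in place of whatever the library actually uses. The `Experience` record, `jaccard`, and `retrieve` are hypothetical names for illustration, not the package's API.

```python
from dataclasses import dataclass


@dataclass
class Experience:
    """One stored experience (hypothetical record layout)."""
    text: str        # description of the situation and the action taken
    q_value: float   # learned estimate of how useful the experience was


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity in [0, 1]; a stand-in for embedding similarity."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


def retrieve(query: str, memory: list[Experience],
             shortlist: int = 3, top_k: int = 1) -> list[Experience]:
    # Phase 1: shortlist the experiences most similar to the query.
    similar = sorted(memory, key=lambda e: jaccard(query, e.text), reverse=True)[:shortlist]
    # Phase 2: re-rank the shortlist by Q-value and keep the best ones.
    return sorted(similar, key=lambda e: e.q_value, reverse=True)[:top_k]


memory = [
    Experience("open locked door with key", q_value=0.9),
    Experience("open locked door by force", q_value=0.2),
    Experience("cook dinner with rice", q_value=1.0),
    Experience("find key under mat", q_value=0.6),
]
best = retrieve("how to open a locked door", memory, shortlist=2)
print(best[0].text)  # open locked door with key
```

The unrelated "cook dinner" entry has the highest Q-value but never reaches phase 2, which is the point of filtering by similarity first.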

Quick Start: Local SLM (Zero Cost)

# 1. Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# 2. Pull a small model (1.7B params, runs on any laptop)
ollama pull qwen3:1.7b

# 3. Run your agent (the Python code below, saved as my_agent.py)
python my_agent.py

# my_agent.py
from purpose_agent import (
    Orchestrator, OllamaBackend, State, Environment, Action,
    CalculatorTool, ToolRegistry,
)

# SLM backend: runs locally, zero cost
llm = OllamaBackend(model="qwen3:1.7b")   # 1.7B params

# Or use a cloud LLM
# from purpose_agent import HFInferenceBackend
# llm = HFInferenceBackend(model_id="Qwen/Qwen3-32B", provider="cerebras")

class MyEnv(Environment):
    def execute(self, action, state):
        return State(data={"result": "done"})

orch = Orchestrator(llm=llm, environment=MyEnv())
result = orch.run_task(purpose="Solve the problem", max_steps=10)
print(result.summary())

SLM Model Registry

Pre-configured models optimized for agent tasks:

from purpose_agent import create_slm_backend

backend = create_slm_backend("phi-4-mini")    # 3.8B, best tool-use accuracy
backend = create_slm_backend("qwen3-1.7b")    # 1.7B, best balance
backend = create_slm_backend("qwen3-0.6b")    # 0.6B, ultra-light
backend = create_slm_backend("llama-3.2-1b")  # 1B, 128K context
backend = create_slm_backend("smollm2-1.7b")  # 1.7B, HF native

Multi-Agent with Shared Learning

Agents learn from each other. When one agent solves a problem, all agents benefit:

from purpose_agent import AgentSpec, AgentTeam, OllamaBackend

researcher = AgentSpec(
    name="researcher", role="Find information",
    model=OllamaBackend(model="qwen3:1.7b"),     # Cheap SLM
    expertise_keywords=["search", "find", "research"],
)
coder = AgentSpec(
    name="coder", role="Write and debug code",
    model=OllamaBackend(model="phi4-mini"),       # Better SLM for code
    expertise_keywords=["code", "program", "debug"],
)

team = AgentTeam(
    agents=[researcher, coder],
    default_model=OllamaBackend(model="qwen3:1.7b"),
    environment=my_env,  # an Environment instance, e.g. MyEnv() from the Quick Start
)

# Auto-delegates to the best agent
result = team.run_task(purpose="Search for Python sorting algorithms")
print(team.get_learning_report())  # See shared knowledge
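
How `AgentTeam` picks "the best agent" is not spelled out here. One plausible reading of `expertise_keywords` is keyword matching against the task purpose, sketched below. `Spec` and `pick_agent` are hypothetical stand-ins, and the real delegation logic may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Spec:
    """Hypothetical stand-in for AgentSpec, keeping only the routing fields."""
    name: str
    expertise_keywords: list[str] = field(default_factory=list)


def pick_agent(purpose: str, agents: list[Spec]) -> Spec:
    """Return the agent with the most keywords found in the purpose.

    Ties (including no match at all) fall back to the first agent listed.
    """
    text = purpose.lower()

    def score(agent: Spec) -> int:
        # Substring match: simple, but "code" would also match "decode".
        return sum(1 for kw in agent.expertise_keywords if kw in text)

    return max(agents, key=score)


agents = [
    Spec("researcher", ["search", "find", "research"]),
    Spec("coder", ["code", "program", "debug"]),
]
print(pick_agent("Search for Python sorting algorithms", agents).name)  # researcher
print(pick_agent("Debug this program", agents).name)                    # coder
```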

Human-in-the-Loop

Humans can override Φ scores, and the agent permanently learns those preferences:

from purpose_agent import HITLOrchestrator, CLIInputHandler

hitl = HITLOrchestrator(
    orchestrator=orch,
    input_handler=CLIInputHandler(),
    approve_actions=True,      # Approve each action
    review_scores=True,        # Override Φ scores
    checkpoint_dir="./checkpoints",
)
result = hitl.run_task(purpose="Important task")

# Inject knowledge directly
hitl.inject_heuristic(
    pattern="When facing {problem_type}",
    strategy="Always try the simplest approach first",
)
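
For an override to be learned "permanently", it has to outlive the session that recorded it. The sketch below shows one way that could work, assuming overrides are keyed by a state description and stored as JSON. `OverrideStore` and its file layout are hypothetical and are not taken from `hitl.py`.

```python
import json
import tempfile
from pathlib import Path


class OverrideStore:
    """Persist human score overrides so they survive restarts (hypothetical)."""

    def __init__(self, path: Path):
        self.path = path
        # Load earlier overrides if a previous session saved any.
        self.overrides = json.loads(path.read_text()) if path.exists() else {}

    def record(self, state_key: str, human_score: float) -> None:
        self.overrides[state_key] = human_score
        self.path.write_text(json.dumps(self.overrides))

    def score(self, state_key: str, critic_score: float) -> float:
        # A human override, when present, takes priority over the critic.
        return self.overrides.get(state_key, critic_score)


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "overrides.json"
    OverrideStore(path).record("deleted user data", 0.0)
    reloaded = OverrideStore(path)  # simulates a new session
    vetoed = reloaded.score("deleted user data", critic_score=7.5)
    untouched = reloaded.score("sent report", critic_score=6.0)

print(vetoed, untouched)  # 0.0 6.0
```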

Streaming

Real-time event streaming for UIs:

import asyncio
from purpose_agent import AsyncOrchestrator

async def main():
    async_orch = AsyncOrchestrator(orch)
    async for event in async_orch.run_task_stream(purpose="..."):
        if event.event_type == "action":
            print(f"🤖 {event.data['name']}: {event.data['thought'][:100]}")
        elif event.event_type == "score":
            print(f"📊 Φ: {event.data['phi_before']:.1f} → {event.data['phi_after']:.1f}")

asyncio.run(main())
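
The `async for` loop above implies that `run_task_stream` is an async generator. A self-contained sketch of that pattern, using only `asyncio`, might look like the following. The `Event` shape mirrors the fields used above, but the class and the placeholder scoring are assumptions rather than the library's code.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator


@dataclass
class Event:
    """Hypothetical event shape: a type tag plus a payload dict."""
    event_type: str
    data: dict


async def run_task_stream(steps: int) -> AsyncIterator[Event]:
    """Yield an action event and a score event per step, as they happen."""
    phi = 0.0
    for i in range(steps):
        await asyncio.sleep(0)  # placeholder for an awaited LLM call
        yield Event("action", {"name": f"step_{i}"})
        new_phi = phi + 1.0     # placeholder for a critic evaluation
        yield Event("score", {"phi_before": phi, "phi_after": new_phi})
        phi = new_phi


async def collect(steps: int) -> list[Event]:
    # A UI would render each event on arrival; here they are just gathered.
    return [event async for event in run_task_stream(steps)]


events = asyncio.run(collect(2))
print([e.event_type for e in events])  # ['action', 'score', 'action', 'score']
```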

Tool Framework

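import requests

from purpose_agent import FunctionTool, ToolRegistry, CalculatorTool, PythonExecTool

# Create tool from any function
@FunctionTool.from_function
def search(query: str) -> str:
    """Search the web for information."""
    # `params` URL-encodes the query; the timeout avoids hanging on a slow endpoint
    response = requests.get("https://api.search.com", params={"q": query}, timeout=10)
    return response.text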

# Tool RAG for SLMs (only load relevant tools into prompt)
registry = ToolRegistry()
registry.register(CalculatorTool())
registry.register(PythonExecTool())
registry.register(search)

relevant = registry.get_relevant_tools("compute 2+2", top_k=2)
# → [CalculatorTool, PythonExecTool]  (search excluded, which saves tokens)
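
The idea behind Tool RAG is to rank tools against the query and put only the top few into the prompt. A rough sketch using word overlap between the query and each tool description is shown below. The scoring method, tool names, and descriptions are all assumptions, and `ToolRegistry` may rank differently.

```python
def relevant_tools(query: str, tools: dict[str, str], top_k: int = 2) -> list[str]:
    """Rank tools by how many query words appear in their description."""
    words = set(query.lower().split())

    def overlap(name: str) -> int:
        return len(words & set(tools[name].lower().split()))

    ranked = sorted(tools, key=overlap, reverse=True)
    return ranked[:top_k]


# Hypothetical tool names and descriptions.
tools = {
    "calculator": "compute arithmetic expressions such as 2+2",
    "python_exec": "run python code to compute results",
    "search": "search the web for information",
}
print(relevant_tools("compute 2+2", tools))  # ['calculator', 'python_exec']
```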

Cost Tracking

from purpose_agent import CostTracker

tracker = CostTracker(model_name="qwen3:1.7b", cost_per_1m_input=0.005)
tracker.record(prompt_tokens=500, completion_tokens=200)
print(tracker.summary())
# → {'model': 'qwen3:1.7b', 'total_tokens': 700, 'estimated_cost_usd': 0.000005}
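
Per-million-token pricing reduces to simple arithmetic: tokens divided by one million, times the price. The sketch below works through it with separate input and output rates. The `usd_per_1m_output` parameter and the prices are illustrative assumptions, and `CostTracker` may compute its estimate differently.

```python
def estimate_cost_usd(prompt_tokens: int, completion_tokens: int,
                      usd_per_1m_input: float, usd_per_1m_output: float) -> float:
    """Cost = tokens / 1,000,000 * price per million, summed over both directions."""
    return (prompt_tokens / 1_000_000 * usd_per_1m_input
            + completion_tokens / 1_000_000 * usd_per_1m_output)


# Illustrative prices only, not real rates for any provider.
cost = estimate_cost_usd(200_000, 50_000, usd_per_1m_input=0.50, usd_per_1m_output=1.50)
print(f"${cost:.3f}")  # $0.175
```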

Benchmark & Prove Self-Improvement

from purpose_agent import BenchmarkRunner, BenchmarkTask

runner = BenchmarkRunner(orchestrator=orch)
tasks = [
    BenchmarkTask(id="t1", purpose="Find treasure", initial_state=...),
    BenchmarkTask(id="t2", purpose="Solve puzzle", initial_state=...),
]

result = runner.run(tasks, iterations=10, name="MazeTest")
print(result.summary())
# Iteration    Success Rate      Avg Φ    Avg Steps   Avg Reward
# -----------------------------------------------------------------
#          1          40.0%       4.20          12.0         3.20
#          5          70.0%       6.80           8.0         6.50
#         10          90.0%       8.50           6.0         8.90
# Improvement: 40.0% → 90.0% (+50.0%)

result.save("results/benchmark.json")
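
The improvement figure in the summary is the change in success rate between the first and last iteration. A small sketch of that calculation follows, using made-up per-task outcomes. The helper names are hypothetical and `BenchmarkRunner` may aggregate its results differently.

```python
def success_rate(outcomes: list[bool]) -> float:
    """Percentage of tasks that succeeded in one iteration."""
    return 100.0 * sum(outcomes) / len(outcomes)


def improvement(curve: list[float]) -> float:
    """Change from the first to the last iteration, in percentage points."""
    return curve[-1] - curve[0]


# Hypothetical per-task outcomes for three iterations over ten tasks.
runs = [
    [True] * 4 + [False] * 6,
    [True] * 7 + [False] * 3,
    [True] * 9 + [False] * 1,
]
curve = [success_rate(r) for r in runs]
print(curve, f"{improvement(curve):+.1f} pp")  # [40.0, 70.0, 90.0] +50.0 pp
```

Reporting the difference in percentage points avoids confusion with relative change, which for 40% to 90% would be +125%.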

Literature Foundation

| Paper | What it contributes |
|---|---|
| MUSE | 3-tier memory (strategic/procedural/tool) |
| LATS | LLM-as-value-function V(s) |
| REMEMBERER | Q-value experience replay |
| Reflexion | Verbal reinforcement |
| SPC | Anti-reward-hacking |
| CER | Contextual experience distillation |
| MemRL | Two-phase retrieval |
| TinyAgent | SLM-native agent patterns |

Installation

# Core (no dependencies beyond stdlib)
git clone https://huggingface.co/Rohan03/purpose-agent
cd purpose-agent

# For local SLMs
pip install ollama

# For cloud LLMs
pip install huggingface_hub  # or: pip install openai

# Run demo (no API keys needed)
python demo.py

License

MIT