# 01 — The Vision: What We're Building & Why

## 🎯 Your Question

> "Manus is amazing. How do they do it? Can we build something like that?"

**Short answer:** Yes! Not identical — Manus has hundreds of engineers and millions in funding. But we can build a **"child version"** that captures the core idea and teaches you every concept along the way.

---

## 🤖 What Is Manus AI?

Manus (acquired by Meta) is an **AI agent** — not just a chatbot. Here's what makes it special:

### 1. It Actually DOES Things (Not Just Talks)

| ChatGPT/Claude | Manus |
|----------------|-------|
| "Here's how to find Python files..." | *Actually runs the command and shows you* |
| "Here's a script idea..." | *Writes, tests, and deploys the code* |
| "I can help you plan..." | *Plans, executes, and verifies* |

### 2. Three Specialized Agents Working Together

Manus uses **three sub-agents** that coordinate:

```
┌─────────────┐      ┌─────────────┐      ┌─────────────┐
│   PLANNER   │─────▶│  EXECUTOR   │─────▶│  VERIFIER   │
│             │      │             │      │             │
│ "Break this │      │ "Run shell  │      │ "Check if   │
│  into steps"│      │  commands"  │      │  it worked" │
│             │      │             │      │             │
│ Strategize  │      │ Navigate    │      │ Quality     │
│ multi-step  │      │ web, write  │      │ control     │
│ path        │      │ code, use   │      │ & fix       │
│             │      │ tools       │      │ errors      │
└─────────────┘      └─────────────┘      └─────────────┘
```

### 3. Persistent Cloud Environment

Manus runs in a **cloud VM** (virtual machine):

- Files persist between sessions
- Can install software (`pip install`, `npm install`)
- Works while you sleep (asynchronous)

### 4. Can Browse 50+ Websites Simultaneously

For research tasks, Manus spawns many parallel agents to gather info.
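The Planner → Executor → Verifier handoff above can be sketched as a small Python loop. This is purely illustrative — the three functions below are hypothetical stand-ins (with a hardcoded plan), not Manus internals:

```python
import subprocess

# Illustrative sketch only: planner/executor/verifier are hypothetical
# stand-ins for the three coordinating sub-agents described above.

def planner(goal: str) -> list[str]:
    """PLANNER: break a goal into shell-command steps (hardcoded demo plan)."""
    if goal == "count Python files":
        return ["find . -name '*.py' | wc -l"]
    return []

def executor(command: str) -> tuple[int, str]:
    """EXECUTOR: run one step and capture its output."""
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.returncode, result.stdout.strip()

def verifier(returncode: int, output: str) -> bool:
    """VERIFIER: check whether the step actually worked."""
    return returncode == 0 and output != ""

def run(goal: str) -> list[str]:
    observations = []
    for step in planner(goal):          # strategize the multi-step path
        code, out = executor(step)      # execute the command
        if not verifier(code, out):     # quality control: stop on failure
            observations.append(f"FAILED: {step}")
            break
        observations.append(out)
    return observations

print(run("count Python files"))
```

The point of the separation is that each role has one job: the planner never touches the shell, and the verifier decides whether the loop continues — the same division of labor our single-model version will collapse into one network.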
---

## 🔬 What We're Building: "Mini-Manus"

### Our Simpler Architecture

Instead of three separate agents + cloud VM, we use **ONE model** with a loop:

```
User: "Find all Python files and count them"
          │
          ▼
┌──────────────────────────────────────────┐
│        MCP-Agent-1.7B (Our Model)        │
│                                          │
│  ┌─── THINK ───┐                         │
│  │ "I need to  │                         │
│  │  list .py   │                         │
│  │  files"     │                         │
│  └──────┬──────┘                         │
│         │                                │
│  ┌─── ACT ─────┐                         │
│  │ shell_exec({│  ◀── ONE MODEL plays    │
│  │  "command": │      ALL three roles    │
│  │  "find .    │      (planner +         │
│  │   -name     │       executor +        │
│  │   '*.py'"   │       verifier)         │
│  │ })          │                         │
│  └──────┬──────┘                         │
│         │                                │
│         ▼ (Result: "main.py, test.py,    │
│            utils.py")                    │
│                                          │
│  ┌─── VERIFY ──┐                         │
│  │ "Got 3      │                         │
│  │  files. Now │                         │
│  │  count."    │                         │
│  └──────┬──────┘                         │
│         │                                │
│  ┌─── ACT ─────┐                         │
│  │ python_exec({                         │
│  │  "code":    │                         │
│  │  "print(3)" │                         │
│  │ })          │                         │
│  └──────┬──────┘                         │
│         │                                │
│         ▼ (Result: "3")                  │
│                                          │
│  ┌── RESPOND ──┐                         │
│  │ "Found 3    │                         │
│  │  Python     │                         │
│  │  files! ✅" │                         │
│  └─────────────┘                         │
└──────────────────────────────────────────┘
```

### Key Differences from Manus

| Feature | Manus | Mini-Manus (Ours) |
|---------|-------|-------------------|
| Agents | 3 specialized (Planner/Executor/Verifier) | 1 model, all roles |
| Environment | Cloud VM | Local/Gradio Space |
| Parallelism | 50+ simultaneous | Sequential (one at a time) |
| Cost | $$$/month | $3 one-time |
| Model Size | GPT-4 class (100B+) | 1.7B (100× smaller!) |
| Persistence | Files persist forever | Session-based |
| Web Browsing | Real browser | DuckDuckGo search API |

### Why This Still Impresses People

1. **It runs LOCALLY** — No API keys, no cloud costs, no rate limits
2. **It actually DOES things** — Not just text, but real shell commands, file operations, Python execution
3. **It's 100× smaller** than Manus's models but still functional
4. **It's OPEN SOURCE** — Anyone can use, modify, improve it
5. **YOU trained it** — From base model to agent in one project

---

## 🧠 The Core Insight: Why Small Models CAN Work for Agents

You might think: *"How can a 1.7B model compete with GPT-4?"*

The secret is **FOCUS**. GPT-4 is a generalist — it knows about history, science, poetry, coding, everything. Our model is a **specialist** — it ONLY knows about tool-calling.

Think of it like this:

- GPT-4 = A professor who can teach any subject
- Our model = A skilled technician who only knows how to use tools

The **TinyAgent paper** demonstrated this: a 1.1B model fine-tuned on tool-calling data matched GPT-4-Turbo on function-calling tasks. Not because it's smarter, but because it's **focused**.

---

## 📋 What Makes This a "WOW" Project

When you show this to people, they'll be impressed because:

### 1. "You trained your own AI agent?"

Most people think you need a PhD and a supercomputer. You don't.

### 2. "It runs on a laptop?"
1.7B parameters at 16-bit precision is about 3.4 GB of weights — roughly 4 GB in memory with overhead. Runs on any gaming laptop.

### 3. "It can actually modify files?"

Not just text generation — real file system operations, shell commands, Python execution.

### 4. "It costs $3?"

Compared to Manus's pricing (or OpenAI API costs), this is almost free.

### 5. "You built this yourself?"

From research → data → training → app. Full pipeline.

---

## 🎓 What You'll Learn From This Project

By the end, you'll understand:

- ✅ How AI agents work (ReAct pattern)
- ✅ What MCP is and why it matters
- ✅ How to pick base models for different budgets
- ✅ LoRA: the magic of cheap fine-tuning
- ✅ SFT: supervised fine-tuning, step by step
- ✅ How to tune hyperparameters (learning rate, batch size, epochs)
- ✅ How to build an agent harness
- ✅ How to deploy ML models
- ✅ How to read research papers and apply them

**If you can train a 1.7B model, you can train a 70B model.** The concepts are identical — only the scale changes.

---

## 🔜 Next Step

Read `02-research.md` to see what papers and datasets we found, and why we made the choices we did.