# 🤖 MCP-Agent-1.7B — Project Overview

**Author:** Muhammad Talha
**Goal:** Build a mini-Manus: a small language model fine-tuned for tool-calling, wrapped in an agent harness
**Budget:** ~$3 (fits well under $10)
**Status:** ✅ PLANNING COMPLETE — Waiting for your "START" signal

---

## 📚 What You'll Learn (A-to-Z)

This project is designed to teach you every concept from the ground up. **Read these files in order** — each builds on the previous:

| File | Topic | What You'll Learn | Read Time |
|------|-------|-------------------|-----------|
| `01-vision.md` | **The Vision** | What Manus is, what we're building, why it matters | 10 min |
| `02-research.md` | **Research** | Papers we found, datasets discovered, what works | 10 min |
| `03-architecture.md` | **Architecture** | ReAct loop, MCP protocol, agent harness design | 15 min |
| `04-training.md` | **Training** | LoRA, SFT, hyperparameters, why each matters | 15 min |
| `05-dataset.md` | **Dataset** | What data we have, quality issues, how to improve | 10 min |
| `06-execution-plan.md` | **Execution** | Exact step-by-step plan when you say START | 10 min |
| `07-tools-research.md` | **WOW Tools** | Browser automation, image gen, RAG, data analysis, etc. | 15 min |
| `08-tool-ecosystem.md` | **Tool Ecosystem** | How to add ANY tool dynamically, no retraining | 15 min |
| `GUIDE_A_TO_Z.md` | **Master Guide** | Complete reference combining all chapters | 30 min |

**Total reading time:** ~130 minutes
**Total build time:** ~5-6 hours
**Total cost:** ~$1.50 for training and testing (~$2 with contingency, see the budget breakdown below)

---

## 🎯 The Big Picture

You asked: *"How does Manus do it, and how can we build something similar?"*

### What Is Manus?

Manus (acquired by Meta) is an **AI agent** with three specialized sub-agents:

1. **Planner** — Breaks tasks into steps
2. **Executor** — Runs code, browses the web, uses tools
3. **Verifier** — Checks results, fixes errors

It runs in a cloud VM, works while you sleep, and can browse 50+ websites simultaneously.

### What We're Building: "Mini-Manus"

We use **ONE model** (Qwen3-1.7B, 1.7B parameters) that plays all three roles:

- We **fine-tune** it to natively understand tool-calling (MCP protocol)
- We wrap it in a **ReAct loop** (think → act → observe → repeat), sketched below
- We give it **real tools** it can execute (shell, files, Python, web search)
- We build a **Gradio web app** around it

**The magic:** The model doesn't call external MCP servers — it already KNOWS how to format tool calls because we trained it on 15,000 examples.
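To make that concrete, below is the *kind* of conversation the model is fine-tuned on. This is a hedged sketch, not the project's actual schema: the `<tool_call>` tag, the field names, and the tool list are illustrative assumptions (the real format lives in `datasets/mcp-agent-training-data/` and is discussed in `05-dataset.md`).

```python
# Illustrative training example only; the real dataset may use different tags,
# field names, and tools than the ones shown here.
training_example = {
    "messages": [
        {"role": "system", "content": "You can call these tools: shell, read_file, python, web_search."},
        {"role": "user", "content": "How much free disk space do I have?"},
        {
            "role": "assistant",
            "content": '<tool_call>{"name": "shell", "arguments": {"command": "df -h /"}}</tool_call>',
        },
        {"role": "tool", "content": "/dev/sda1  100G  62G  38G  62% /"},
        {"role": "assistant", "content": "You have about 38 GB free on /."},
    ]
}
```

Seeing thousands of conversations shaped like this is what teaches a 1.7B model to emit well-formed tool calls on its own.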
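And here is a minimal sketch of the ReAct loop mentioned in the bullet list above, assuming the same illustrative `<tool_call>` format. `generate()` is a placeholder for whatever function runs the fine-tuned Qwen3-1.7B model, and the single `shell` tool stands in for the real tool registry described in `03-architecture.md` and `08-tool-ecosystem.md`.

```python
import json
import subprocess

# Minimal ReAct-style loop (think → act → observe → repeat), for illustration only.
# `generate(history)` is a placeholder for calling the fine-tuned model; the
# <tool_call> tag and the tiny tool registry are assumptions, not the project's
# actual harness.

def run_shell(args: dict) -> str:
    """Example tool: run a shell command and return its combined output."""
    result = subprocess.run(args["command"], shell=True, capture_output=True, text=True, timeout=30)
    return result.stdout + result.stderr

TOOLS = {"shell": run_shell}  # the real harness also registers files, python, web_search, ...

def react_loop(task: str, generate, max_steps: int = 8) -> str:
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = generate(history)                                  # think: model writes text or a tool call
        history.append({"role": "assistant", "content": reply})
        if "<tool_call>" not in reply:
            return reply                                           # no tool requested, so this is the answer
        call = json.loads(reply.split("<tool_call>")[1].split("</tool_call>")[0])
        observation = TOOLS[call["name"]](call["arguments"])       # act: execute the requested tool
        history.append({"role": "tool", "content": observation})   # observe, then loop again
    return "Stopped after max_steps without a final answer."
```

The harness is essentially just this loop: the model decides, the harness executes, and each observation goes back into the context until the model answers in plain text.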
### Why People Will Say "WOW"

1. **Runs locally** — No API costs, no rate limits
2. **Actually DOES things** — Not just chat, but real shell commands and file operations
3. **100× smaller than Manus's models** — 1.7B vs 100B+ parameters
4. **Costs ~$3** — Not thousands
5. **YOU built it** — From research → data → training → app

---

## 💰 Budget Breakdown

| Item | Cost | Why |
|------|------|-----|
| Training (T4 GPU, ~2h) | ~$1.20 | Fine-tuning with LoRA |
| Inference testing | ~$0.30 | Testing the model |
| Gradio Space (ZeroGPU) | $0 | Free tier |
| Contingency | ~$0.50 | Buffer for retries |
| **Total** | **~$2** | Well under $10! ✅ |

---

## 🔬 Research Highlights (From Our Deep Dive)

### Papers That Back Our Approach

| Paper | Key Finding | How We Use It |
|-------|-------------|---------------|
| **TinyAgent** (arXiv:2409.00608) | 1.1B model ≈ GPT-4 at tool-calling | Proves small models work |
| **STAR** (arXiv:2602.03022) | Qwen3-1.7B beats Llama-3.1-8B | Chose Qwen3 as base |
| **Agent-World** (arXiv:2604.18292) | MCP-based training environments | MCP is the right protocol |
| **LoRA Without Regret** (2025) | all-linear LoRA ≈ full fine-tuning | Using `target_modules="all-linear"` |

### Datasets We Discovered

- **glaiveai/glaive-function-calling-v2** — 100K examples, most popular
- **Salesforce/xlam-function-calling-60k** — 60K diverse examples
- **Our dataset** — 16K examples, already prepared, needs some improvements

---

## 📖 Reading Guide

### Start Here: 01-vision.md
Understand WHAT we're building and WHY. This answers your core question: *"How does Manus work and what are we replicating?"*

### Then: 02-research.md
See the papers we found and WHY we made our choices. This teaches you *how to do research* for any ML project.

### Then: 03-architecture.md
Learn HOW the agent harness works — the ReAct loop, MCP protocol, tool registry, and how Manus's multi-agent design compares to our simpler approach.

### Then: 04-training.md
Understand HOW we train the model — LoRA, SFT, cross-entropy loss, backpropagation, and what each hyperparameter controls. This is the deepest technical chapter.

### Then: 05-dataset.md
Review our training data — what's good, what's missing, and how we'd improve it. This teaches you data quality assessment.

### Then: 06-execution-plan.md
See the EXACT step-by-step plan with timelines, costs, and decision points. This is our "project management" document.

### Then: 07-tools-research.md
Discover the 12+ tools we can add — browser automation, image generation, RAG, data analysis, and more. Ranked by wow factor and feasibility.

### Then: 08-tool-ecosystem.md
Learn how to add ANY tool dynamically without retraining: the `@tool` decorator, MCP servers, and the tool marketplace concept.

### Finally: GUIDE_A_TO_Z.md
The master reference combining all chapters into one document. Use this as a quick reference after reading the individual chapters.

---

## 🚀 When You're Ready

When you've read all the files and feel confident, just say:

> **"START"**

And we'll begin building. Every step will be explained as we do it.

---

## 📁 File Structure

```
/project/
├── 00-README.md          ← You are here
├── 01-vision.md          ← The Vision & Manus comparison
├── 02-research.md        ← Papers, datasets & findings
├── 03-architecture.md    ← Agent harness & MCP protocol
├── 04-training.md        ← LoRA, SFT & hyperparameters
├── 05-dataset.md         ← Dataset analysis & improvements
├── 06-execution-plan.md  ← Step-by-step build plan
├── 07-tools-research.md  ← WOW tools: browser, RAG, image gen, etc.
├── 08-tool-ecosystem.md  ← How to add ANY tool dynamically
├── GUIDE_A_TO_Z.md       ← Master guide combining all chapters
├── train.py              ← Training script (generated when you say START)
├── agent_app.py          ← Gradio app (generated when you say START)
└── datasets/             ← Training data & related files
    └── mcp-agent-training-data/
```

---

*Learning ML by building real things — one step at a time.*
*Built by Muhammad Talha*