# 🤖 MCP-Agent-1.7B — Project Overview

**Author:** Muhammad Talha
**Goal:** Build a mini-Manus: a small language model fine-tuned for tool-calling, wrapped in an agent harness
**Budget:** ~$3 (fits well under $10)
**Status:** ✅ PLANNING COMPLETE — Waiting for your "START" signal

---

## 📚 What You'll Learn (A-to-Z)

This project is designed to teach you every concept from the ground up. **Read these files in order** — each builds on the previous:

| File | Topic | What You'll Learn | Read Time |
|------|-------|-------------------|-----------|
| `01-vision.md` | **The Vision** | What Manus is, what we're building, why it matters | 10 min |
| `02-research.md` | **Research** | Papers we found, datasets discovered, what works | 10 min |
| `03-architecture.md` | **Architecture** | ReAct loop, MCP protocol, agent harness design | 15 min |
| `04-training.md` | **Training** | LoRA, SFT, hyperparameters, why each matters | 15 min |
| `05-dataset.md` | **Dataset** | What data we have, quality issues, how to improve | 10 min |
| `06-execution-plan.md` | **Execution** | Exact step-by-step plan when you say START | 10 min |
| `07-tools-research.md` | **WOW Tools** | Browser automation, image gen, RAG, data analysis, etc. | 15 min |
| `08-tool-ecosystem.md` | **Tool Ecosystem** | How to add ANY tool dynamically, no retraining | 15 min |
| `GUIDE_A_TO_Z.md` | **Master Guide** | Complete reference combining all chapters | 30 min |

**Total reading time:** ~130 minutes
**Total build time:** ~5-6 hours
**Total cost:** ~$1.50 for training and testing (~$2 with contingency, see the budget breakdown below)

---

## 🎯 The Big Picture

You asked: *"How does Manus do it, and how can we build something similar?"*

### What Is Manus?

Manus (acquired by Meta) is an **AI agent** with three specialized sub-agents:

1. **Planner** — Breaks tasks into steps
2. **Executor** — Runs code, browses the web, uses tools
3. **Verifier** — Checks results, fixes errors

It runs in a cloud VM, works while you sleep, and can browse 50+ websites simultaneously.

### What We're Building: "Mini-Manus"

We use **ONE model** (Qwen3-1.7B, 1.7B parameters) that plays all three roles:

- We **fine-tune** it to natively understand tool-calling (MCP protocol)
- We wrap it in a **ReAct loop** (think → act → observe → repeat), sketched below
- We give it **real tools** it can execute (shell, files, Python, web search)
- We build a **Gradio web app** around it

**The magic:** The model doesn't call external MCP servers — it already KNOWS how to format tool calls because we trained it on 15,000 examples.
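To make that concrete, below is the *kind* of conversation the model is fine-tuned on. This is a hedged sketch, not the project's actual schema: the `<tool_call>` tag, the field names, and the tool list are illustrative assumptions (the real format lives in `datasets/mcp-agent-training-data/` and is discussed in `05-dataset.md`).

```python
# Illustrative training example only; the real dataset may use different tags,
# field names, and tools than the ones shown here.
training_example = {
    "messages": [
        {"role": "system", "content": "You can call these tools: shell, read_file, python, web_search."},
        {"role": "user", "content": "How much free disk space do I have?"},
        {
            "role": "assistant",
            "content": '<tool_call>{"name": "shell", "arguments": {"command": "df -h /"}}</tool_call>',
        },
        {"role": "tool", "content": "/dev/sda1  100G  62G  38G  62% /"},
        {"role": "assistant", "content": "You have about 38 GB free on /."},
    ]
}
```

Seeing thousands of conversations shaped like this is what teaches a 1.7B model to emit well-formed tool calls on its own.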
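And here is a minimal sketch of the ReAct loop mentioned in the bullet list above, assuming the same illustrative `<tool_call>` format. `generate()` is a placeholder for whatever function runs the fine-tuned Qwen3-1.7B model, and the single `shell` tool stands in for the real tool registry described in `03-architecture.md` and `08-tool-ecosystem.md`.

```python
import json
import subprocess

# Minimal ReAct-style loop (think → act → observe → repeat), for illustration only.
# `generate(history)` is a placeholder for calling the fine-tuned model; the
# <tool_call> tag and the tiny tool registry are assumptions, not the project's
# actual harness.

def run_shell(args: dict) -> str:
    """Example tool: run a shell command and return its combined output."""
    result = subprocess.run(args["command"], shell=True, capture_output=True, text=True, timeout=30)
    return result.stdout + result.stderr

TOOLS = {"shell": run_shell}  # the real harness also registers files, python, web_search, ...

def react_loop(task: str, generate, max_steps: int = 8) -> str:
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = generate(history)                                  # think: model writes text or a tool call
        history.append({"role": "assistant", "content": reply})
        if "<tool_call>" not in reply:
            return reply                                           # no tool requested, so this is the answer
        call = json.loads(reply.split("<tool_call>")[1].split("</tool_call>")[0])
        observation = TOOLS[call["name"]](call["arguments"])       # act: execute the requested tool
        history.append({"role": "tool", "content": observation})   # observe, then loop again
    return "Stopped after max_steps without a final answer."
```

The harness is essentially just this loop: the model decides, the harness executes, and each observation goes back into the context until the model answers in plain text.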
### Why People Will Say "WOW"

1. **Runs locally** — No API costs, no rate limits
2. **Actually DOES things** — Not just chat, but real shell commands and file operations
3. **100× smaller than Manus's models** — 1.7B vs 100B+ parameters
4. **Costs ~$3** — Not thousands
5. **YOU built it** — From research → data → training → app

---

## 💰 Budget Breakdown

| Item | Cost | Why |
|------|------|-----|
| Training (T4 GPU, ~2h) | ~$1.20 | Fine-tuning with LoRA |
| Inference testing | ~$0.30 | Testing the model |
| Gradio Space (ZeroGPU) | $0 | Free tier |
| Contingency | ~$0.50 | Buffer for retries |
| **Total** | **~$2** | Well under $10! ✅ |

---

## 🔬 Research Highlights (From Our Deep Dive)

### Papers That Back Our Approach

| Paper | Key Finding | How We Use It |
|-------|-------------|---------------|
| **TinyAgent** (arXiv:2409.00608) | 1.1B model ≈ GPT-4 at tool-calling | Proves small models work |
| **STAR** (arXiv:2602.03022) | Qwen3-1.7B beats Llama-3.1-8B | Chose Qwen3 as base |
| **Agent-World** (arXiv:2604.18292) | MCP-based training environments | MCP is the right protocol |
| **LoRA Without Regret** (2025) | all-linear LoRA ≈ full fine-tuning | Using `target_modules="all-linear"` |

### Datasets We Discovered

- **glaiveai/glaive-function-calling-v2** — 100K examples, most popular
- **Salesforce/xlam-function-calling-60k** — 60K diverse examples
- **Our dataset** — 16K examples, already prepared, needs some improvements

---

## 📖 Reading Guide

### Start Here: 01-vision.md
Understand WHAT we're building and WHY. This answers your core question: *"How does Manus work and what are we replicating?"*

### Then: 02-research.md
See the papers we found and WHY we made our choices. This teaches you *how to do research* for any ML project.

### Then: 03-architecture.md
Learn HOW the agent harness works — the ReAct loop, MCP protocol, tool registry, and how Manus's multi-agent design compares to our simpler approach.

### Then: 04-training.md
Understand HOW we train the model — LoRA, SFT, cross-entropy loss, backpropagation, and what each hyperparameter controls. This is the deepest technical chapter.

### Then: 05-dataset.md
Review our training data — what's good, what's missing, and how we'd improve it. This teaches you data quality assessment.

### Then: 06-execution-plan.md
See the EXACT step-by-step plan with timelines, costs, and decision points. This is our "project management" document.

### Then: 07-tools-research.md
Discover the 12+ tools we can add — browser automation, image generation, RAG, data analysis, and more. Ranked by wow factor and feasibility.

### Then: 08-tool-ecosystem.md
Learn how to add ANY tool dynamically without retraining: the `@tool` decorator, MCP servers, and the tool marketplace concept.

### Finally: GUIDE_A_TO_Z.md
The master reference combining all chapters into one document. Use this as a quick reference after reading the individual chapters.

---

## 🚀 When You're Ready

When you've read all the files and feel confident, just say:

> **"START"**

And we'll begin building. Every step will be explained as we do it.

---

## 📁 File Structure

```
/project/
├── 00-README.md          ← You are here
├── 01-vision.md          ← The Vision & Manus comparison
├── 02-research.md        ← Papers, datasets & findings
├── 03-architecture.md    ← Agent harness & MCP protocol
├── 04-training.md        ← LoRA, SFT & hyperparameters
├── 05-dataset.md         ← Dataset analysis & improvements
├── 06-execution-plan.md  ← Step-by-step build plan
├── 07-tools-research.md  ← WOW tools: browser, RAG, image gen, etc.
├── 08-tool-ecosystem.md  ← How to add ANY tool dynamically
├── GUIDE_A_TO_Z.md       ← Master guide combining all chapters
├── train.py              ← Training script (generated when you say START)
├── agent_app.py          ← Gradio app (generated when you say START)
└── datasets/             ← Training data & related files
    └── mcp-agent-training-data/
```

---

*Learning ML by building real things — one step at a time.*
*Built by Muhammad Talha*