# πŸ€– MCP-Agent-1.7B β€” Project Overview
**Author:** Muhammad Talha
**Goal:** Build a mini-Manus: a small language model fine-tuned for tool-calling, wrapped in an agent harness
**Budget:** ~$3 (fits well under $10)
**Status:** βœ… PLANNING COMPLETE β€” Waiting for your "START" signal
---
## πŸ“š What You'll Learn (A-to-Z)
This project is designed to teach you every concept from the ground up.
**Read these files in order** β€” each builds on the previous:
| File | Topic | What You'll Learn | Read Time |
|------|-------|-------------------|-----------|
| `01-vision.md` | **The Vision** | What Manus is, what we're building, why it matters | 10 min |
| `02-research.md` | **Research** | Papers we found, datasets discovered, what works | 10 min |
| `03-architecture.md` | **Architecture** | ReAct loop, MCP protocol, agent harness design | 15 min |
| `04-training.md` | **Training** | LoRA, SFT, hyperparameters, why each matters | 15 min |
| `05-dataset.md` | **Dataset** | What data we have, quality issues, how to improve | 10 min |
| `06-execution-plan.md` | **Execution** | Exact step-by-step plan when you say START | 10 min |
| `07-tools-research.md` | **WOW Tools** | Browser automation, image gen, RAG, data analysis, etc. | 15 min |
| `08-tool-ecosystem.md` | **Tool Ecosystem** | How to add ANY tool dynamically, no retraining | 15 min |
| `GUIDE_A_TO_Z.md` | **Master Guide** | Complete reference combining all chapters | 30 min |
**Total reading time:** ~130 minutes
**Total build time:** ~5-6 hours
**Total cost:** ~$1.50 (~$2 including contingency — see the budget breakdown below)
---
## 🎯 The Big Picture
You asked: *"How does Manus do it, and how can we build something similar?"*
### What Is Manus?
Manus (acquired by Meta) is an **AI agent** with three specialized sub-agents:
1. **Planner** β€” Breaks tasks into steps
2. **Executor** β€” Runs code, browses web, uses tools
3. **Verifier** β€” Checks results, fixes errors
It runs in a cloud VM, works while you sleep, and can browse 50+ websites simultaneously.
### What We're Building: "Mini-Manus"
We use **ONE model** (Qwen3-1.7B, 1.7 billion parameters) that plays all three roles:
- We **fine-tune** it to natively understand tool-calling (MCP protocol)
- We wrap it in a **ReAct loop** (think β†’ act β†’ observe β†’ repeat)
- We give it **real tools** it can execute (shell, files, Python, web search)
- We build a **Gradio web app** around it
**The magic:** The model doesn't rely on external MCP servers to learn the format β€” it already KNOWS
how to emit tool calls because we trained it on ~16,000 examples.
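The ReAct loop above can be sketched in a few lines. This is a minimal illustration, not the project's `agent_app.py`: the model is stubbed out with a fake function, and the tool registry holds a toy `add` tool instead of real shell/file/web tools.

```python
import json

# Hypothetical tool registry: name -> callable. Real tools (shell, files,
# Python, web search) would be registered the same way.
TOOLS = {
    "add": lambda a, b: a + b,
}

def fake_model(history):
    """Stand-in for the fine-tuned model. It emits either a JSON tool call
    or a plain-text final answer, depending on what it has observed."""
    if not any(m["role"] == "tool" for m in history):
        return '{"tool": "add", "args": {"a": 2, "b": 3}}'
    return "The answer is 5."

def react_loop(task, max_steps=5):
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        output = fake_model(history)                 # think
        try:
            call = json.loads(output)                # JSON means: act
        except json.JSONDecodeError:
            return output                            # plain text = final answer
        result = TOOLS[call["tool"]](**call["args"]) # act
        history.append({"role": "tool", "content": str(result)})  # observe
    return "Step limit reached."

print(react_loop("What is 2 + 3?"))  # -> The answer is 5.
```

The key design point: because the fine-tuned model emits well-formed JSON natively, the harness only needs `json.loads` to decide whether a turn is an action or a final answer.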
### Why People Will Say "WOW"
1. **Runs locally** β€” No API costs, no rate limits
2. **Actually DOES things** β€” Not just chat, but real shell commands and file operations
3. **~60Γ— smaller than Manus-scale models** β€” 1.7B vs 100B+ parameters
4. **Costs $3** β€” Not thousands
5. **YOU built it** β€” From research β†’ data β†’ training β†’ app
---
## πŸ’° Budget Breakdown
| Item | Cost | Why |
|------|------|-----|
| Training (T4 GPU, ~2h) | ~$1.20 | Fine-tuning with LoRA |
| Inference testing | ~$0.30 | Testing the model |
| Gradio Space (Zero GPU) | $0 | Free tier |
| Contingency | ~$0.50 | Buffer for retries |
| **Total** | **~$2** | Well under $10! βœ… |
---
## πŸ”¬ Research Highlights (From Our Deep Dive)
### Papers That Back Our Approach
| Paper | Key Finding | How We Use It |
|-------|-------------|---------------|
| **TinyAgent** (arXiv:2409.00608) | 1.1B model β‰ˆ GPT-4 at tool-calling | Proves small models work |
| **STAR** (arXiv:2602.03022) | Qwen3-1.7B beats Llama-3.1-8B | Chose Qwen3 as base |
| **Agent-World** (arXiv:2604.18292) | MCP-based training environments | MCP is the right protocol |
| **LoRA Without Regret** (2025) | all-linear LoRA = full fine-tuning | Using `target_modules="all-linear"` |
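The "all-linear" finding from the last row translates directly into a `peft` config. A minimal sketch β€” the rank, alpha, and dropout values here are illustrative placeholders, not the project's final hyperparameters (those live in `04-training.md`):

```python
from peft import LoraConfig

# Illustrative LoRA config. Per "LoRA Without Regret", adapting every
# linear layer ("all-linear") approaches full fine-tuning quality.
lora_config = LoraConfig(
    r=16,                         # adapter rank (placeholder value)
    lora_alpha=32,                # scaling factor (placeholder value)
    lora_dropout=0.05,
    target_modules="all-linear",  # the key setting from the paper
    task_type="CAUSAL_LM",
)
```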
### Datasets We Discovered
- **glaiveai/glaive-function-calling-v2** β€” 100K examples, most popular
- **Salesforce/xlam-function-calling-60k** β€” 60K diverse examples
- **Our dataset** β€” 16K examples, already prepared, needs some improvements
---
## πŸ“– Reading Guide
### Start Here: 01-vision.md
Understand WHAT we're building and WHY. This answers your core question:
*"How does Manus work and what are we replicating?"*
### Then: 02-research.md
See the papers we found and WHY we made our choices. This teaches you
*how to do research* for any ML project.
### Then: 03-architecture.md
Learn HOW the agent harness works β€” the ReAct loop, MCP protocol, tool registry,
and how Manus's multi-agent design compares to our simpler approach.
### Then: 04-training.md
Understand HOW we train the model β€” LoRA, SFT, cross-entropy loss, backpropagation,
and what each hyperparameter controls. This is the deepest technical chapter.
### Then: 05-dataset.md
Review our training data β€” what's good, what's missing, and how we'd improve it.
This teaches you data quality assessment.
### Then: 06-execution-plan.md
See the EXACT step-by-step plan with timelines, costs, and decision points.
This is our "project management" document.
### Then: 07-tools-research.md
Discover the 12+ tools we can add β€” browser automation, image generation, RAG,
data analysis, and more. Ranked by wow factor and feasibility.
### Then: 08-tool-ecosystem.md
Learn how to add ANY tool dynamically without retraining. The `@tool` decorator,
MCP servers, and the tool marketplace concept.
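A decorator like that can be sketched with the standard library alone. This is a hypothetical illustration of the idea (the actual `@tool` implementation is covered in `08-tool-ecosystem.md`): registering a function records its name, docstring, and parameters so the agent can discover it at runtime β€” no retraining needed.

```python
import inspect

TOOL_REGISTRY = {}

def tool(fn):
    """Register a function as an agent tool, deriving a simple schema
    (description + parameter names) from its signature and docstring."""
    TOOL_REGISTRY[fn.__name__] = {
        "fn": fn,
        "description": (fn.__doc__ or "").strip(),
        "params": list(inspect.signature(fn).parameters),
    }
    return fn

@tool
def word_count(text: str) -> int:
    """Count the words in a piece of text."""
    return len(text.split())

# The agent can now list available tools and call them by name.
print(TOOL_REGISTRY["word_count"]["params"])               # ['text']
print(TOOL_REGISTRY["word_count"]["fn"]("hello brave world"))  # 3
```

Because the model was trained on the tool-call *format* rather than a fixed tool list, any function added to the registry this way becomes callable immediately.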
### Finally: GUIDE_A_TO_Z.md
The master reference combining all chapters into one document. Use this as a
quick reference after reading the individual chapters.
---
## πŸš€ When You're Ready
When you've read all the files and feel confident, just say:
> **"START"**
And we'll begin building. Every step will be explained as we do it.
---
## πŸ“ File Structure
```
/project/
β”œβ”€β”€ 00-README.md ← You are here
β”œβ”€β”€ 01-vision.md ← The Vision & Manus comparison
β”œβ”€β”€ 02-research.md ← Papers, datasets & findings
β”œβ”€β”€ 03-architecture.md ← Agent harness & MCP protocol
β”œβ”€β”€ 04-training.md ← LoRA, SFT & hyperparameters
β”œβ”€β”€ 05-dataset.md ← Dataset analysis & improvements
β”œβ”€β”€ 06-execution-plan.md ← Step-by-step build plan
β”œβ”€β”€ 07-tools-research.md ← WOW tools: browser, RAG, image gen, etc.
β”œβ”€β”€ 08-tool-ecosystem.md ← How to add ANY tool dynamically
β”œβ”€β”€ GUIDE_A_TO_Z.md ← Master guide combining all chapters
β”œβ”€β”€ train.py ← Training script (generated when you say START)
β”œβ”€β”€ agent_app.py ← Gradio app (generated when you say START)
└── datasets/ ← Training data & related files
└── mcp-agent-training-data/
```
---
*Learning ML by building real things β€” one step at a time.*
*Built by Muhammad Talha*