# πŸ€– MCP-Agent-1.7B β€” Project Overview
**Author:** Muhammad Talha
**Goal:** Build a mini-Manus: a small language model fine-tuned for tool-calling, wrapped in an agent harness
**Budget:** ~$3 (fits well under $10)
**Status:** βœ… PLANNING COMPLETE β€” Waiting for your "START" signal
---
## πŸ“š What You'll Learn (A-to-Z)
This project is designed to teach you every concept from the ground up.
**Read these files in order** β€” each builds on the previous:
| File | Topic | What You'll Learn | Read Time |
|------|-------|-------------------|-----------|
| `01-vision.md` | **The Vision** | What Manus is, what we're building, why it matters | 10 min |
| `02-research.md` | **Research** | Papers we found, datasets discovered, what works | 10 min |
| `03-architecture.md` | **Architecture** | ReAct loop, MCP protocol, agent harness design | 15 min |
| `04-training.md` | **Training** | LoRA, SFT, hyperparameters, why each matters | 15 min |
| `05-dataset.md` | **Dataset** | What data we have, quality issues, how to improve | 10 min |
| `06-execution-plan.md` | **Execution** | Exact step-by-step plan when you say START | 10 min |
| `07-tools-research.md` | **WOW Tools** | Browser automation, image gen, RAG, data analysis, etc. | 15 min |
| `08-tool-ecosystem.md` | **Tool Ecosystem** | How to add ANY tool dynamically, no retraining | 15 min |
| `GUIDE_A_TO_Z.md` | **Master Guide** | Complete reference combining all chapters | 30 min |
**Total reading time:** ~130 minutes
**Total build time:** ~5-6 hours
**Total cost:** ~$1.50 (~$2 including contingency — see the budget breakdown below)
---
## 🎯 The Big Picture
You asked: *"How does Manus do it, and how can we build something similar?"*
### What Is Manus?
Manus (acquired by Meta) is an **AI agent** with three specialized sub-agents:
1. **Planner** β€” Breaks tasks into steps
2. **Executor** β€” Runs code, browses web, uses tools
3. **Verifier** β€” Checks results, fixes errors
It runs in a cloud VM, works while you sleep, and can browse 50+ websites simultaneously.
### What We're Building: "Mini-Manus"
We use **ONE model** (Qwen3-1.7B, 1.7 billion parameters) that plays all three roles:
- We **fine-tune** it to natively understand tool-calling (MCP protocol)
- We wrap it in a **ReAct loop** (think β†’ act β†’ observe β†’ repeat)
- We give it **real tools** it can execute (shell, files, Python, web search)
- We build a **Gradio web app** around it
**The magic:** The model doesn't rely on external MCP servers to learn the format β€” it already KNOWS
how to emit tool calls because we trained it on ~16,000 examples.
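The ReAct loop above can be sketched in a few lines. This is a minimal illustration, not the project's `agent_app.py`: the model is stubbed out with a fake function, and the tool registry holds a toy `add` tool instead of real shell/file/web tools.

```python
import json

# Hypothetical tool registry: name -> callable. Real tools (shell, files,
# Python, web search) would be registered the same way.
TOOLS = {
    "add": lambda a, b: a + b,
}

def fake_model(history):
    """Stand-in for the fine-tuned model. It emits either a JSON tool call
    or a plain-text final answer, depending on what it has observed."""
    if not any(m["role"] == "tool" for m in history):
        return '{"tool": "add", "args": {"a": 2, "b": 3}}'
    return "The answer is 5."

def react_loop(task, max_steps=5):
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        output = fake_model(history)                 # think
        try:
            call = json.loads(output)                # JSON means: act
        except json.JSONDecodeError:
            return output                            # plain text = final answer
        result = TOOLS[call["tool"]](**call["args"]) # act
        history.append({"role": "tool", "content": str(result)})  # observe
    return "Step limit reached."

print(react_loop("What is 2 + 3?"))  # -> The answer is 5.
```

The key design point: because the fine-tuned model emits well-formed JSON natively, the harness only needs `json.loads` to decide whether a turn is an action or a final answer.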
### Why People Will Say "WOW"
1. **Runs locally** β€” No API costs, no rate limits
2. **Actually DOES things** β€” Not just chat, but real shell commands and file operations
3. **~60Γ— smaller than Manus-scale models** β€” 1.7B vs 100B+ parameters
4. **Costs $3** β€” Not thousands
5. **YOU built it** β€” From research β†’ data β†’ training β†’ app
---
## πŸ’° Budget Breakdown
| Item | Cost | Why |
|------|------|-----|
| Training (T4 GPU, ~2h) | ~$1.20 | Fine-tuning with LoRA |
| Inference testing | ~$0.30 | Testing the model |
| Gradio Space (Zero GPU) | $0 | Free tier |
| Contingency | ~$0.50 | Buffer for retries |
| **Total** | **~$2** | Well under $10! βœ… |
---
## πŸ”¬ Research Highlights (From Our Deep Dive)
### Papers That Back Our Approach
| Paper | Key Finding | How We Use It |
|-------|-------------|---------------|
| **TinyAgent** (arXiv:2409.00608) | 1.1B model β‰ˆ GPT-4 at tool-calling | Proves small models work |
| **STAR** (arXiv:2602.03022) | Qwen3-1.7B beats Llama-3.1-8B | Chose Qwen3 as base |
| **Agent-World** (arXiv:2604.18292) | MCP-based training environments | MCP is the right protocol |
| **LoRA Without Regret** (2025) | all-linear LoRA = full fine-tuning | Using `target_modules="all-linear"` |
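The "all-linear" finding from the last row translates directly into a `peft` config. A minimal sketch β€” the rank, alpha, and dropout values here are illustrative placeholders, not the project's final hyperparameters (those live in `04-training.md`):

```python
from peft import LoraConfig

# Illustrative LoRA config. Per "LoRA Without Regret", adapting every
# linear layer ("all-linear") approaches full fine-tuning quality.
lora_config = LoraConfig(
    r=16,                         # adapter rank (placeholder value)
    lora_alpha=32,                # scaling factor (placeholder value)
    lora_dropout=0.05,
    target_modules="all-linear",  # the key setting from the paper
    task_type="CAUSAL_LM",
)
```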
### Datasets We Discovered
- **glaiveai/glaive-function-calling-v2** β€” 100K examples, most popular
- **Salesforce/xlam-function-calling-60k** β€” 60K diverse examples
- **Our dataset** β€” 16K examples, already prepared, needs some improvements
---
## πŸ“– Reading Guide
### Start Here: 01-vision.md
Understand WHAT we're building and WHY. This answers your core question:
*"How does Manus work and what are we replicating?"*
### Then: 02-research.md
See the papers we found and WHY we made our choices. This teaches you
*how to do research* for any ML project.
### Then: 03-architecture.md
Learn HOW the agent harness works β€” the ReAct loop, MCP protocol, tool registry,
and how Manus's multi-agent design compares to our simpler approach.
### Then: 04-training.md
Understand HOW we train the model β€” LoRA, SFT, cross-entropy loss, backpropagation,
and what each hyperparameter controls. This is the deepest technical chapter.
### Then: 05-dataset.md
Review our training data β€” what's good, what's missing, and how we'd improve it.
This teaches you data quality assessment.
### Then: 06-execution-plan.md
See the EXACT step-by-step plan with timelines, costs, and decision points.
This is our "project management" document.
### Then: 07-tools-research.md
Discover the 12+ tools we can add β€” browser automation, image generation, RAG,
data analysis, and more. Ranked by wow factor and feasibility.
### Then: 08-tool-ecosystem.md
Learn how to add ANY tool dynamically without retraining. The `@tool` decorator,
MCP servers, and the tool marketplace concept.
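A decorator like that can be sketched with the standard library alone. This is a hypothetical illustration of the idea (the actual `@tool` implementation is covered in `08-tool-ecosystem.md`): registering a function records its name, docstring, and parameters so the agent can discover it at runtime β€” no retraining needed.

```python
import inspect

TOOL_REGISTRY = {}

def tool(fn):
    """Register a function as an agent tool, deriving a simple schema
    (description + parameter names) from its signature and docstring."""
    TOOL_REGISTRY[fn.__name__] = {
        "fn": fn,
        "description": (fn.__doc__ or "").strip(),
        "params": list(inspect.signature(fn).parameters),
    }
    return fn

@tool
def word_count(text: str) -> int:
    """Count the words in a piece of text."""
    return len(text.split())

# The agent can now list available tools and call them by name.
print(TOOL_REGISTRY["word_count"]["params"])               # ['text']
print(TOOL_REGISTRY["word_count"]["fn"]("hello brave world"))  # 3
```

Because the model was trained on the tool-call *format* rather than a fixed tool list, any function added to the registry this way becomes callable immediately.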
### Finally: GUIDE_A_TO_Z.md
The master reference combining all chapters into one document. Use this as a
quick reference after reading the individual chapters.
---
## πŸš€ When You're Ready
When you've read all the files and feel confident, just say:
> **"START"**
And we'll begin building. Every step will be explained as we do it.
---
## πŸ“ File Structure
```
/project/
β”œβ”€β”€ 00-README.md ← You are here
β”œβ”€β”€ 01-vision.md ← The Vision & Manus comparison
β”œβ”€β”€ 02-research.md ← Papers, datasets & findings
β”œβ”€β”€ 03-architecture.md ← Agent harness & MCP protocol
β”œβ”€β”€ 04-training.md ← LoRA, SFT & hyperparameters
β”œβ”€β”€ 05-dataset.md ← Dataset analysis & improvements
β”œβ”€β”€ 06-execution-plan.md ← Step-by-step build plan
β”œβ”€β”€ 07-tools-research.md ← WOW tools: browser, RAG, image gen, etc.
β”œβ”€β”€ 08-tool-ecosystem.md ← How to add ANY tool dynamically
β”œβ”€β”€ GUIDE_A_TO_Z.md ← Master guide combining all chapters
β”œβ”€β”€ train.py ← Training script (generated when you say START)
β”œβ”€β”€ agent_app.py ← Gradio app (generated when you say START)
└── datasets/ ← Training data & related files
└── mcp-agent-training-data/
```
---
*Learning ML by building real things β€” one step at a time.*
*Built by Muhammad Talha*