
🤖 MCP-Agent-1.7B — Project Overview

Author: Muhammad Talha
Goal: Build a mini-Manus: a small language model fine-tuned for tool-calling, wrapped in an agent harness
Budget: ~$3 (fits well under $10)
Status: ✅ PLANNING COMPLETE — Waiting for your "START" signal


📚 What You'll Learn (A-to-Z)

This project is designed to teach you every concept from the ground up. Read these files in order — each builds on the previous:

| File | Topic | What You'll Learn | Read Time |
|------|-------|-------------------|-----------|
| 01-vision.md | The Vision | What Manus is, what we're building, why it matters | 10 min |
| 02-research.md | Research | Papers we found, datasets discovered, what works | 10 min |
| 03-architecture.md | Architecture | ReAct loop, MCP protocol, agent harness design | 15 min |
| 04-training.md | Training | LoRA, SFT, hyperparameters, why each matters | 15 min |
| 05-dataset.md | Dataset | What data we have, quality issues, how to improve | 10 min |
| 06-execution-plan.md | Execution | Exact step-by-step plan when you say START | 10 min |
| 07-tools-research.md | WOW Tools | Browser automation, image gen, RAG, data analysis, etc. | 15 min |
| 08-tool-ecosystem.md | Tool Ecosystem | How to add ANY tool dynamically, no retraining | 15 min |
| GUIDE_A_TO_Z.md | Master Guide | Complete reference combining all chapters | 30 min |

Total reading time: ~130 minutes
Total build time: ~5-6 hours
Total cost: ~$1.50 in compute (~$2 with contingency; see the budget breakdown below)


🎯 The Big Picture

You asked: "How does Manus do it, and how can we build something similar?"

What Is Manus?

Manus (acquired by Meta) is an AI agent with three specialized sub-agents:

  1. Planner — Breaks tasks into steps
  2. Executor — Runs code, browses web, uses tools
  3. Verifier — Checks results, fixes errors

It runs in a cloud VM, works while you sleep, and can browse 50+ websites simultaneously.

What We're Building: "Mini-Manus"

We use ONE model (Qwen3-1.7B, ~2B total parameters) that plays all three roles:

  • We fine-tune it to natively understand tool-calling (MCP protocol)
  • We wrap it in a ReAct loop (think → act → observe → repeat)
  • We give it real tools it can execute (shell, files, Python, web search)
  • We build a Gradio web app around it

The magic: The model doesn't call external MCP servers — it already KNOWS how to format tool calls because we trained it on 15,000 examples.
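To make the loop concrete, here is a minimal sketch of the think → act → observe cycle. The JSON call format, the tool names, and the `fake_model` stub are illustrative assumptions, not the exact format the model is trained on (see 03-architecture.md for the real harness design):

```python
import json

# Hypothetical tool registry; the real tools (shell, files, Python, web
# search) would execute actual commands instead of returning stub strings.
TOOLS = {
    "shell": lambda cmd: f"(ran `{cmd}`)",
    "read_file": lambda path: f"(contents of {path})",
}

def parse_tool_call(text):
    """Extract a JSON tool call like {"tool": ..., "args": {...}} from model output."""
    start = text.find("{")
    if start == -1:
        return None
    try:
        call = json.loads(text[start:])
        return call if "tool" in call else None
    except json.JSONDecodeError:
        return None

def react_loop(model, task, max_steps=5):
    """Think -> act -> observe -> repeat until the model stops calling tools."""
    history = [f"Task: {task}"]
    for _ in range(max_steps):
        output = model("\n".join(history))          # think
        call = parse_tool_call(output)
        if call is None:                            # no tool call => final answer
            return output
        fn = TOOLS.get(call["tool"])
        observation = fn(**call["args"]) if fn else "unknown tool"  # act
        history.append(output)
        history.append(f"Observation: {observation}")               # observe
    return "max steps reached"
```

A fine-tuned model slots in as `model`: it reads the history, and either emits a tool call (because it was trained on thousands of such examples) or a plain final answer.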

Why People Will Say "WOW"

  1. Runs locally — No API costs, no rate limits
  2. Actually DOES things — Not just chat, but real shell commands and file operations
  3. 60×+ smaller than Manus's models — 1.7B vs 100B+ parameters
  4. Costs $3 — Not thousands
  5. YOU built it — From research → data → training → app

💰 Budget Breakdown

| Item | Cost | Why |
|------|------|-----|
| Training (T4 GPU, ~2h) | ~$1.20 | Fine-tuning with LoRA |
| Inference testing | ~$0.30 | Testing the model |
| Gradio Space (Zero GPU) | $0 | Free tier |
| Contingency | ~$0.50 | Buffer for retries |
| Total | ~$2 | Well under $10! ✅ |

🔬 Research Highlights (From Our Deep Dive)

Papers That Back Our Approach

| Paper | Key Finding | How We Use It |
|-------|-------------|---------------|
| TinyAgent (arXiv:2409.00608) | 1.1B model ≈ GPT-4 at tool-calling | Proves small models work |
| STAR (arXiv:2602.03022) | Qwen3-1.7B beats Llama-3.1-8B | Chose Qwen3 as base |
| Agent-World (arXiv:2604.18292) | MCP-based training environments | MCP is the right protocol |
| LoRA Without Regret (2025) | all-linear LoRA ≈ full fine-tuning | Using `target_modules="all-linear"` |
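The "all-linear" finding translates into one line of PEFT configuration. A sketch of what that setup might look like; the rank, alpha, and dropout values here are common defaults for illustration, not this project's actual hyperparameters (those live in 04-training.md):

```python
from peft import LoraConfig

# Illustrative LoRA config: r/alpha/dropout are placeholder defaults.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules="all-linear",  # PEFT special value: adapt every linear layer
    task_type="CAUSAL_LM",
)
```

Passing the string `"all-linear"` (rather than an explicit module list like `["q_proj", "v_proj"]`) tells PEFT to attach adapters to every linear layer, which is what the "LoRA Without Regret" result recommends.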

Datasets We Discovered

  • glaiveai/glaive-function-calling-v2 — 100K examples, most popular
  • Salesforce/xlam-function-calling — 60K diverse examples
  • Our dataset — 16K examples, already prepared, needs some improvements

📖 Reading Guide

Start Here: 01-vision.md

Understand WHAT we're building and WHY. This answers your core question: "How does Manus work and what are we replicating?"

Then: 02-research.md

See the papers we found and WHY we made our choices. This teaches you how to do research for any ML project.

Then: 03-architecture.md

Learn HOW the agent harness works — the ReAct loop, MCP protocol, tool registry, and how Manus's multi-agent design compares to our simpler approach.
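As a taste of that chapter: MCP describes each tool with a name, a description, and a JSON Schema for its inputs. A hedged sketch of what one registry entry could look like; the `read_file` tool itself is invented for illustration, though the field names follow the shape MCP uses when listing tools:

```python
import json

# Illustrative MCP-style tool description (the tool is a made-up example).
read_file_tool = {
    "name": "read_file",
    "description": "Read a UTF-8 text file and return its contents.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "path": {"type": "string", "description": "Path to the file"},
        },
        "required": ["path"],
    },
}

# The harness can inject these schemas into the system prompt, so the model
# knows which tools exist and how their arguments are shaped.
system_prompt = "You can call these tools:\n" + json.dumps([read_file_tool], indent=2)
```

Because tools are described as data rather than baked into the model, adding a tool means adding one more schema to this list.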

Then: 04-training.md

Understand HOW we train the model — LoRA, SFT, cross-entropy loss, backpropagation, and what each hyperparameter controls. This is the deepest technical chapter.

Then: 05-dataset.md

Review our training data — what's good, what's missing, and how we'd improve it. This teaches you data quality assessment.

Then: 06-execution-plan.md

See the EXACT step-by-step plan with timelines, costs, and decision points. This is our "project management" document.

Then: 07-tools-research.md

Discover the 12+ tools we can add — browser automation, image generation, RAG, data analysis, and more, ranked by wow factor and feasibility.

Then: 08-tool-ecosystem.md

Learn how to add ANY tool dynamically without retraining. The @tool decorator, MCP servers, and the tool marketplace concept.
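As a preview of the decorator idea: one plausible implementation is a plain function registry, where the docstring becomes the tool description the model sees. This is a sketch under that assumption; the actual decorator in 08-tool-ecosystem.md may differ:

```python
import inspect

TOOL_REGISTRY = {}  # tool name -> function, description, parameter names

def tool(fn):
    """Register a function as an agent tool; its docstring is the description."""
    TOOL_REGISTRY[fn.__name__] = {
        "fn": fn,
        "description": (fn.__doc__ or "").strip(),
        "params": list(inspect.signature(fn).parameters),
    }
    return fn

@tool
def word_count(text: str) -> int:
    """Count the words in a piece of text."""
    return len(text.split())

# The agent discovers and calls the tool by name at runtime; no retraining,
# because the model was trained on the *format* of tool calls, not this tool.
result = TOOL_REGISTRY["word_count"]["fn"](text="tools added at runtime")
```

The key design choice: the model only needs to know how to emit a well-formed call, so any function you decorate today becomes usable immediately.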

Finally: GUIDE_A_TO_Z.md

The master reference combining all chapters into one document. Use this as a quick reference after reading the individual chapters.


🚀 When You're Ready

When you've read all the files and feel confident, just say:

"START"

And we'll begin building. Every step will be explained as we do it.


πŸ“ File Structure

```
/project/
├── 00-README.md           ← You are here
├── 01-vision.md           ← The Vision & Manus comparison
├── 02-research.md         ← Papers, datasets & findings
├── 03-architecture.md     ← Agent harness & MCP protocol
├── 04-training.md         ← LoRA, SFT & hyperparameters
├── 05-dataset.md          ← Dataset analysis & improvements
├── 06-execution-plan.md   ← Step-by-step build plan
├── 07-tools-research.md   ← WOW tools: browser, RAG, image gen, etc.
├── 08-tool-ecosystem.md   ← How to add ANY tool dynamically
├── GUIDE_A_TO_Z.md        ← Master guide combining all chapters
├── train.py               ← Training script (generated when you say START)
├── agent_app.py           ← Gradio app (generated when you say START)
└── datasets/              ← Training data & related files
    └── mcp-agent-training-data/
```

Learning ML by building real things — one step at a time. Built by Muhammad Talha