muhammadtlha944
/

MCP-Agent-1.7B


muhammadtlha944 committed 26 days ago

Commit

c066b45

verified ·

1 Parent(s): 40d8168

Upload docs/08-tool-ecosystem.md


docs/08-tool-ecosystem.md +619 -0

docs/08-tool-ecosystem.md ADDED

	@@ -0,0 +1,619 @@

+# 08 — Dynamic Tool Ecosystem: How to Add ANY Tool
+## 🎯 What This Chapter Covers
+- How tools are registered in smolagents
+- How to add new tools without retraining the model
+- The tool marketplace concept (1000+ MCP servers)
+- How the agent discovers and uses new tools automatically
+- Architecture for a "tool marketplace" in our agent harness
+---
+## 🧩 The Core Principle: Pattern Over Specifics
+**The #1 insight from our research:** Our 1.7B model doesn't need to know about SPECIFIC tools. It needs to know the PATTERN of using tools.
+Think of it like this:
+- **Bad approach:** Train model on "how to use Tool A, Tool B, Tool C..."
+- **Good approach:** Train model on "how to write Python code that solves problems"
+The model already knows Python (Qwen3 was trained on code). We just need to teach it to:
+1. Break problems into steps
+2. Use available Python libraries/functions
+3. Handle errors and try alternatives
+**Result:** You can add a new tool (any Python function with type hints and a clear docstring) and the model can usually work out how to call it from that description alone.
+---
+## 🔧 How Tool Registration Works in smolagents
+### The Simple Way: @tool Decorator
+```python
+from smolagents import tool
+@tool
+def my_awesome_tool(input_param: str) -> str:
+    """
+    What this tool does (this becomes the "instruction manual" for the LLM).
+    Args:
+        input_param: What this parameter means
+    """
+    # Your code here (placeholder logic so the example runs as written)
+    result = input_param.upper()
+    return result
+```
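+**That's it.** The `@tool` decorator automatically:
+1. Reads the function name → becomes the tool name
+2. Reads the docstring → becomes the tool description (shown to the LLM)
+3. Reads type hints → becomes the parameter schema
+4. Wraps the function in a `Tool` object, which you then pass to the agent's `tools` list
+Every parameter needs a type hint and an entry under `Args:` in the docstring; smolagents raises an error at decoration time if either is missing.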
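The introspection described above can be approximated with the standard library alone. This is a minimal sketch, not smolagents' actual implementation; `describe_function` and `greet` are hypothetical names used only for illustration.

```python
import inspect

def describe_function(func):
    """Build a tool-style description from a function's name, docstring, and type hints."""
    signature = inspect.signature(func)
    parameters = {}
    for name, param in signature.parameters.items():
        parameters[name] = {
            # Use the class name for simple annotations such as str or bool
            "type": getattr(param.annotation, "__name__", str(param.annotation)),
            # A parameter without a default value must be supplied by the caller
            "required": param.default is inspect.Parameter.empty,
        }
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": parameters,
        "returns": getattr(signature.return_annotation, "__name__",
                           str(signature.return_annotation)),
    }

def greet(name: str, excited: bool = False) -> str:
    """Return a greeting for the given name."""
    return f"Hello, {name}{'!' if excited else '.'}"

spec = describe_function(greet)
print(spec["name"])
print(spec["parameters"])
```

Everything the model later sees about a tool comes from metadata like this, which is why a vague docstring or a missing type hint directly hurts tool use.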
+### Example: Adding a Weather Tool
+```python
+import os
+import requests
+from smolagents import tool
+@tool
+def get_weather(city: str, country_code: str = "") -> str:
+    """
+    Get current weather for a city. Returns temperature, conditions, and humidity.
+    Args:
+        city: The city name (e.g., "London", "New York")
+        country_code: Optional 2-letter country code (e.g., "US", "GB")
+    """
+    url = "https://api.openweathermap.org/data/2.5/weather"
+    params = {
+        "q": f"{city},{country_code}" if country_code else city,
+        # Read the key from the environment instead of hard-coding it
+        # (OPENWEATHER_API_KEY is a variable name chosen for this example)
+        "appid": os.environ["OPENWEATHER_API_KEY"],
+        "units": "metric",
+    }
+    response = requests.get(url, params=params, timeout=10)
+    response.raise_for_status()  # Surface HTTP errors such as a bad key or unknown city
+    data = response.json()
+    temp = data["main"]["temp"]
+    conditions = data["weather"][0]["description"]
+    humidity = data["main"]["humidity"]
+    return f"Weather in {city}: {temp}°C, {conditions}, humidity {humidity}%"
+```
+**What the LLM sees (roughly):**
+```
+You have access to the following tools:
+- get_weather(city: str, country_code: str = "")
+  Get current weather for a city. Returns temperature, conditions, and humidity.
+  Args:
+    city: The city name (e.g., "London", "New York")
+    country_code: Optional 2-letter country code (e.g., "US", "GB")
+```
+**The LLM works out how to call the tool from this description alone.** No training needed!
+---
+## 📦 The "Tool Marketplace" Concept
+### How It Works
+```
+┌─────────────────────────────────────────────┐
+│           Tool Marketplace                     │
+│                                              │
+│  ┌──────────┐  ┌──────────┐  ┌──────────┐   │
+│  │ Weather  │  │ Finance  │  │ Social   │   │
+│  │ Tool     │  │ Tool     │  │ Media    │   │
+│  │ (Free)   │  │ (Free)   │  │ (Free)   │   │
+│  └────┬─────┘  └────┬─────┘  └────┬─────┘   │
+│       │             │             │          │
+│  ┌──────────┐  ┌──────────┐  ┌──────────┐   │
+│  │ Browser  │  │ GitHub   │  │ Image    │   │
+│  │ Tool     │  │ Tool     │  │ Gen Tool │   │
+│  │ (Built)  │  │ (Built)  │  │ (Built)  │   │
+│  └────┬─────┘  └────┬─────┘  └────┬─────┘   │
+│       │             │             │          │
+│  ┌──────────┐  ┌──────────┐  ┌──────────┐   │
+│  │ Database │  │ Email    │  │ Calendar │   │
+│  │ Tool     │  │ Tool     │  │ Tool     │   │
+│  │ (Built)  │  │ (Built)  │  │ (Built)  │   │
+│  └────┬─────┘  └────┬─────┘  └────┬─────┘   │
+│       │             │             │          │
+│       └──────┬──────┴──────┬────┘          │
+│              │             │                │
+│              ▼             ▼                │
+│       ┌─────────────────────────┐           │
+│       │   Agent Tool Loader     │           │
+│       │   (User picks which     │           │
+│       │    tools to enable)     │           │
+│       └────────────┬────────────┘           │
+│                    │                        │
+│                    ▼                        │
+│       ┌─────────────────────────┐           │
+│       │  CodeAgent with Tools   │           │
+│       │  (Model sees all enabled│           │
+│       │   tool descriptions)    │           │
+│       └─────────────────────────┘           │
+└─────────────────────────────────────────────┘
+```
+### Adding a Tool Is Just Installing a Package
+```bash
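+# Install the library that the tool wraps (placeholder package name)
+pip install some-weather-library
+# Then import the @tool-decorated function and add it to the agent's tools list;
+# the decorator builds the Tool object but does not attach it to any agent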
+```
+Or for MCP servers:
+```bash
+# Install MCP server
+npm install -g @some-org/mcp-weather
+# Register in agent
+# Agent discovers tools from the MCP server's tool definitions
+```
+---
+## 🛠️ Building Our Tool Ecosystem
+### Phase 1: Core Tools (Built Into Agent)
+These are always available — the foundation:
+```python
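+# core_tools.py
+import os
+import subprocess
+from smolagents import tool
+@tool
+def read_file(file_path: str, max_chars: int = 10000) -> str:
+    """
+    Read the contents of a text file, truncated to max_chars characters.
+    Args:
+        file_path: Path of the file to read
+        max_chars: Maximum number of characters to return
+    """
+    with open(file_path, "r", encoding="utf-8") as f:
+        return f.read(max_chars)
+@tool
+def write_file(file_path: str, content: str) -> str:
+    """
+    Write content to a file, creating parent directories if needed.
+    Args:
+        file_path: Path of the file to write
+        content: Text to write to the file
+    """
+    os.makedirs(os.path.dirname(file_path) or ".", exist_ok=True)
+    with open(file_path, "w", encoding="utf-8") as f:
+        f.write(content)
+    return f"Written {len(content)} chars to {file_path}"
+@tool
+def list_directory(path: str = ".") -> str:
+    """
+    List files and folders in a directory.
+    Args:
+        path: Directory to list (defaults to the current directory)
+    """
+    return "\n".join(sorted(os.listdir(path)))
+@tool
+def execute_shell(command: str) -> str:
+    """
+    Execute a shell command with a 30-second timeout and return its output.
+    Args:
+        command: The shell command to run
+    """
+    # shell=True runs the string unsandboxed: only enable this in a trusted environment
+    result = subprocess.run(command, shell=True, capture_output=True, text=True, timeout=30)
+    return result.stdout + result.stderr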
+```
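Because `execute_shell` hands the raw string to a shell, one hedged way to narrow its reach is to skip the shell and check the program against an allowlist. The sketch below is an illustration under that assumption, not part of smolagents; `run_allowed` and `allowed_programs` are hypothetical names, and it assumes a POSIX system.

```python
import shlex
import subprocess
import sys

def run_allowed(command: str, allowed: set, timeout: int = 30) -> str:
    """Run a command without a shell, but only if its program is on the allowlist."""
    argv = shlex.split(command)
    if not argv:
        return "Refused: empty command"
    if argv[0] not in allowed:
        return f"Refused: '{argv[0]}' is not on the allowlist"
    # No shell=True: arguments are passed directly, so ';' or '&&' are not interpreted
    result = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    return result.stdout + result.stderr

# Example allowlist: only the current Python interpreter
allowed_programs = {sys.executable}
print(run_allowed("rm -rf /tmp/example", allowed_programs))
print(run_allowed(f'{shlex.quote(sys.executable)} -c "print(2 + 2)"', allowed_programs))
```

An allowlist limits which programs can start, not what they do once started, so it complements a sandbox rather than replacing one.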
+### Phase 2: Web Tools (Internet Access)
+```python
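+# web_tools.py
+import requests
+from bs4 import BeautifulSoup
+from smolagents import tool
+@tool
+def fetch_webpage(url: str) -> str:
+    """
+    Fetch a webpage and return up to 10,000 characters of its visible text.
+    Args:
+        url: The full URL of the page to fetch
+    """
+    response = requests.get(url, timeout=30)
+    response.raise_for_status()
+    soup = BeautifulSoup(response.text, "html.parser")
+    # Remove scripts and styles
+    for element in soup(["script", "style"]):
+        element.decompose()
+    return soup.get_text(separator="\n", strip=True)[:10000]
+@tool
+def web_search(query: str, num_results: int = 5) -> str:
+    """
+    Search the web using DuckDuckGo and return result titles with their links.
+    Args:
+        query: The search query
+        num_results: Maximum number of results to return
+    """
+    # Using DuckDuckGo's HTML interface; params= URL-encodes the query
+    response = requests.get("https://html.duckduckgo.com/html/", params={"q": query},
+                            headers={"User-Agent": "Mozilla/5.0"}, timeout=30)
+    response.raise_for_status()
+    soup = BeautifulSoup(response.text, "html.parser")
+    # Assumption: result links use the "result__a" CSS class; adjust if the markup changes
+    links = soup.select("a.result__a")[:num_results]
+    formatted_results = "\n".join(f"{a.get_text(strip=True)}: {a.get('href')}" for a in links)
+    return formatted_results or "No results found."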
+```
+### Phase 3: Analysis Tools (Data Processing)
+```python
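+# analysis_tools.py
+import matplotlib
+matplotlib.use("Agg")  # Headless backend so charts can be saved without a display
+import matplotlib.pyplot as plt
+import pandas as pd
+from smolagents import tool
+@tool
+def analyze_csv(file_path: str, query: str) -> str:
+    """
+    Load a CSV file and return summary statistics for its numeric columns.
+    Args:
+        file_path: Path to the CSV file
+        query: The question being asked (not interpreted yet; the agent can follow up with its own pandas code)
+    """
+    df = pd.read_csv(file_path)
+    return str(df.describe())
+@tool
+def create_chart(data_source: str, chart_type: str, output_path: str) -> str:
+    """
+    Create a chart from a CSV file and save it as an image.
+    Args:
+        data_source: Path to a CSV file (this sketch assumes column 1 holds labels or x-values and column 2 holds numbers)
+        chart_type: One of "bar", "line", "pie", "scatter"
+        output_path: Where to save the chart image (e.g., "chart.png")
+    """
+    df = pd.read_csv(data_source)
+    x_col, y_col = df.columns[:2]
+    fig, ax = plt.subplots()
+    if chart_type == "pie":
+        df.set_index(x_col)[y_col].plot(kind="pie", ax=ax)
+    else:
+        df.plot(kind=chart_type, x=x_col, y=y_col, ax=ax)
+    fig.savefig(output_path)
+    plt.close(fig)
+    return output_path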
+```
+### Phase 4: Creative Tools (Generation)
+```python
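+# creative_tools.py
+import torch
+from diffusers import StableDiffusionPipeline
+from smolagents import tool
+# Load model once at startup (requires a CUDA GPU)
+# This repo id mirrors the checkpoint formerly published as runwayml/stable-diffusion-v1-5
+pipe = StableDiffusionPipeline.from_pretrained(
+    "stable-diffusion-v1-5/stable-diffusion-v1-5",
+    torch_dtype=torch.float16
+).to("cuda")
+@tool
+def generate_image(prompt: str, output_path: str = "generated.png") -> str:
+    """
+    Generate an image from a text description and save it to a file.
+    Args:
+        prompt: Text description of the image to generate
+        output_path: Where to save the generated image
+    """
+    image = pipe(prompt, num_inference_steps=20).images[0]
+    image.save(output_path)
+    return output_path
+@tool
+def generate_code(language: str, task: str, output_path: str) -> str:
+    """
+    Generate code for a specific task and save it to a file.
+    Args:
+        language: Programming language to generate (e.g., "python")
+        task: Description of what the code should do
+        output_path: Where to save the generated code
+    """
+    # Placeholder: call the LLM here to produce the code, then save it to output_path
+    raise NotImplementedError("generate_code is not implemented yet")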
+```
+---
+## 🧠 How the Agent Uses Tools It Never Saw Before
+### Example: User Adds a "Crypto Price" Tool
+**Step 1: User installs the tool**
+```bash
+pip install crypto-price-library
+```
+**Step 2: User writes the tool wrapper**
+```python
+from smolagents import tool
+import crypto_price  # Hypothetical library, used here only for illustration
+@tool
+def get_crypto_price(symbol: str, currency: str = "USD") -> str:
+    """
+    Get the current price of a cryptocurrency.
+    Args:
+        symbol: The cryptocurrency symbol (e.g., "BTC", "ETH", "SOL")
+        currency: The currency to convert to (default: "USD")
+    """
+    price = crypto_price.get_current(symbol, currency)
+    return f"{symbol}: ${price} {currency}"
+```
+**Step 3: Register with agent**
+```python
+from smolagents import CodeAgent
+from my_tools import get_crypto_price, read_file, write_file
+agent = CodeAgent(
+    model=my_model,
+    tools=[get_crypto_price, read_file, write_file],
+)
+```
+**Step 4: The agent picks up the tool from its prompt**
+```
+User: "What's the price of Bitcoin and should I invest?"
+Agent's system prompt now includes:
+  "- get_crypto_price(symbol: str, currency: str = 'USD')
+    Get the current price of a cryptocurrency."
+Agent thinks (via CodeAgent pattern):
+  "User wants Bitcoin price. I have get_crypto_price tool.
+   I'll call it with symbol='BTC'."
+Agent generates Python code:
+    btc_price = get_crypto_price("BTC", "USD")
+    print(btc_price)
+Result (illustrative): "BTC: $67,420 USD"
+Agent then might:
+  "User also asked if they should invest. I should do more research.
+   Let me search for recent Bitcoin news and analysis."
+Agent generates:
+    news = web_search("Bitcoin investment analysis 2025")
+    print(news)
+Agent synthesizes final answer:
+  "Bitcoin is currently $67,420. Recent analysis suggests...
+   [summary of research]"
+```
+**No retraining needed.** The model works out how to call the tool from its name, description, and parameter hints, all of which appear in the prompt.
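The flow above can be mimicked with a toy loop that lists tools in a prompt and then runs generated code with those tools in scope. This is a sketch under loud assumptions: `fake_model` is a hypothetical stand-in for the LLM, the price is made up, and smolagents runs code in its own restricted interpreter rather than with bare `exec`.

```python
import contextlib
import io

def get_crypto_price(symbol: str, currency: str = "USD") -> str:
    """Stand-in tool that returns a fixed, made-up price."""
    return f"{symbol}: 100 {currency}"

def build_prompt(task: str, tools: dict) -> str:
    """List every tool's name and docstring so the model can see what is available."""
    lines = ["You can call these Python functions:"]
    for name, func in tools.items():
        lines.append(f"- {name}: {func.__doc__}")
    lines.append(f"Task: {task}")
    return "\n".join(lines)

def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for the LLM: returns code that calls a tool named in the prompt."""
    if "get_crypto_price" in prompt:
        return 'print(get_crypto_price("BTC"))'
    return 'print("no suitable tool")'

def run_step(code: str, tools: dict) -> str:
    """Execute generated code with the tools in scope and capture what it prints."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code, dict(tools))  # Unsafe outside a sandbox; acceptable only for a toy example
    return buffer.getvalue().strip()

tools = {"get_crypto_price": get_crypto_price}
prompt = build_prompt("What is the price of Bitcoin?", tools)
observation = run_step(fake_model(prompt), tools)
print(observation)
```

Swapping a different function into `tools` changes what the prompt advertises and what the generated code can call, with no change to the model itself.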
+---
+## 🌐 The MCP Server Universe
+### What Are MCP Servers?
+MCP (Model Context Protocol) servers are **standalone programs** that expose tools to any compatible client over a standard protocol. Think of them like "apps" for your agent.
+Community directories list **1000+ MCP servers** across many domains:
+| Category | Example Servers | What They Do |
+|----------|----------------|--------------|
+| **Web** | firecrawl, browser-use, playwright | Web scraping, browsing |
+| **Code** | github, git, code-index | Repo analysis, code search |
+| **Data** | postgres, sqlite, duckdb | Database queries |
+| **Memory** | chroma, mem0 | Long-term memory |
+| **Comm** | slack, gmail, discord | Messaging |
+| **Dev** | kubernetes, docker, aws | Infrastructure |
+| **Creative** | comfyui, image-gen | Image/video generation |
+| **Research** | perplexity, arxiv | Academic search |
+### How to Use MCP Servers
+```python
+# Install MCP server
+# npm install -g @modelcontextprotocol/server-filesystem
+# In Python, use the MCP client
+import asyncio
+from mcp import ClientSession, StdioServerParameters
+from mcp.client.stdio import stdio_client
+# Connect to MCP server
+server_params = StdioServerParameters(
+    command="npx",
+    args=["-y", "@modelcontextprotocol/server-filesystem", "/home/user"]
+)
+async def main():
+    async with stdio_client(server_params) as (read, write):
+        async with ClientSession(read, write) as session:
+            # The handshake must complete before any other request
+            await session.initialize()
+            # List available tools
+            tools = await session.list_tools()
+            print([t.name for t in tools.tools])
+            # Call a tool
+            result = await session.call_tool("read_file", {"path": "/home/user/doc.txt"})
+            print(result.content)
+asyncio.run(main())
+```
+**MCP servers expose tools over a protocol, not as Python functions.** smolagents can wrap them as regular tools (for example with `ToolCollection.from_mcp`), so the agent calls them like any other function.
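An MCP tool definition carries a `name`, a `description`, and an `inputSchema` written as JSON Schema, which is enough to build the same kind of prompt line that a local tool gets. The sketch below shows one way to do that conversion; `mcp_tool_to_prompt_line` and the example definition are hypothetical and not taken from any real server.

```python
def mcp_tool_to_prompt_line(tool_def: dict) -> str:
    """Render an MCP-style tool definition as a one-line description for the prompt."""
    schema = tool_def.get("inputSchema", {})
    properties = schema.get("properties", {})
    required = set(schema.get("required", []))
    params = []
    for name, prop in properties.items():
        type_name = prop.get("type", "any")
        # Mark parameters the schema does not list as required
        suffix = "" if name in required else " (optional)"
        params.append(f"{name}: {type_name}{suffix}")
    return f"- {tool_def['name']}({', '.join(params)}): {tool_def.get('description', '')}"

# Made-up definition in the shape an MCP server returns from a tool listing
example_tool = {
    "name": "read_file",
    "description": "Read the contents of a file.",
    "inputSchema": {
        "type": "object",
        "properties": {"path": {"type": "string"}, "encoding": {"type": "string"}},
        "required": ["path"],
    },
}
line = mcp_tool_to_prompt_line(example_tool)
print(line)
```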
+---
+## 🎛️ Dynamic Tool Loading: The "Plugin System"
+### Architecture for Loading Tools at Runtime
+```python
+# tool_loader.py
+import importlib.util
+import os
+import sys
+from smolagents import CodeAgent, Tool
+from core_tools import read_file, write_file
+def load_tools_from_directory(directory: str):
+    """Dynamically load all tools from the *_tools.py files in a directory."""
+    tools = []
+    for filename in sorted(os.listdir(directory)):
+        if filename.endswith('_tools.py'):
+            module_name = filename[:-3]  # Remove .py
+            # Import by file path so the directory does not have to be a package
+            spec = importlib.util.spec_from_file_location(module_name, os.path.join(directory, filename))
+            module = importlib.util.module_from_spec(spec)
+            sys.modules[module_name] = module
+            spec.loader.exec_module(module)
+            # @tool turns a function into a Tool instance, so collect those
+            for attr in vars(module).values():
+                if isinstance(attr, Tool):
+                    tools.append(attr)
+    return tools
+# Usage
+custom_tools = load_tools_from_directory('./tools')
+agent = CodeAgent(
+    model=my_model,  # Any smolagents model object, created elsewhere
+    tools=custom_tools + [read_file, write_file],  # Core + custom
+)
+```
+### Tool Configuration File
+Users can enable/disable tools via a config:
+```json
+{
+  "agent_name": "My Mini-Manus",
+  "enabled_tools": [
+    "core:read_file",
+    "core:write_file",
+    "core:execute_shell",
+    "web:fetch_webpage",
+    "web:web_search",
+    "analysis:analyze_csv",
+    "creative:generate_image",
+    "mcp:github",
+    "mcp:slack"
+  ],
+  "max_iterations": 10,
+  "model": "muhammadtlha944/MCP-Agent-1.7B"
+}
+```
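A config like this only helps if something maps its `category:name` keys to actual tools. The sketch below shows one possible mapping, assuming a plain dictionary registry; `select_tools`, the registry layout, and the stand-in lambdas are all hypothetical.

```python
import json

def select_tools(config_text: str, registry: dict) -> list:
    """Return the registered tools named in the config's enabled_tools list, in order."""
    config = json.loads(config_text)
    selected, missing = [], []
    for key in config.get("enabled_tools", []):
        if key in registry:
            selected.append(registry[key])
        else:
            missing.append(key)
    if missing:
        # Report unknown keys instead of failing, so one typo does not disable the agent
        print(f"Skipping unknown tools: {missing}")
    return selected

# Stand-in callables; a real registry would hold @tool-decorated functions
registry = {
    "core:read_file": lambda path: f"contents of {path}",
    "core:write_file": lambda path, content: f"wrote {path}",
    "web:web_search": lambda query: f"results for {query}",
}
config_text = '{"enabled_tools": ["core:read_file", "web:web_search", "mcp:github"]}'
enabled = select_tools(config_text, registry)
print(len(enabled))
```

Skipping unknown keys is a design choice; raising an error instead may be preferable when a missing tool should block startup.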
+---
+## 📊 Tool Complexity vs Model Capability
+### What a 1.7B Model Can Handle
+| Tool Complexity | Can Use? | Notes |
+|----------------|----------|-------|
+| **Simple function** (1 param, 1 return) | ✅ Yes | Easy — model gets it from description |
+| **Multi-param function** (3-5 params) | ✅ Yes | With clear descriptions |
+| **Chain of 2-3 tools** | ✅ Yes | With ReAct loop |
+| **Chain of 5+ tools** | ⚠️ Maybe | Depends on context length |
+| **Complex logic** (loops, if/else) | ✅ Yes | CodeAgent handles this well |
+| **API calls with auth** | ✅ Yes | If keys are pre-configured |
+| **Browser automation** | ✅ Yes | With Helium/Selenium abstraction |
+| **Vision/image understanding** | ⚠️ Maybe | Needs vision model (adds VRAM) |
+| **Real-time streaming** | ❌ No | Too complex for 1.7B |
+| **Multi-agent coordination** | ⚠️ Maybe | smolagents multi-agent can help |
+### Rule of Thumb
+If you can describe the tool in 2-3 sentences and it has 1-5 parameters,
+a 1.7B model can learn to use it from the description alone.
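The rule of thumb can be turned into a quick automated check before a tool is enabled. This is a rough sketch: `fits_rule_of_thumb` is a hypothetical helper, it counts sentences with a simple regular expression, and it only inspects the text before the `Args:` section.

```python
import inspect
import re

def fits_rule_of_thumb(func, max_sentences: int = 3, max_params: int = 5) -> bool:
    """Check that a function has a short description and between 1 and max_params parameters."""
    doc = inspect.getdoc(func) or ""
    summary = doc.split("Args:")[0].strip()
    # Split after sentence-ending punctuation followed by whitespace
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", summary) if s]
    n_params = len(inspect.signature(func).parameters)
    return bool(summary) and len(sentences) <= max_sentences and 1 <= n_params <= max_params

def get_weather(city: str, country_code: str = "") -> str:
    """
    Get current weather for a city. Returns temperature and conditions.
    Args:
        city: The city name
        country_code: Optional 2-letter country code
    """
    return f"Weather in {city}"

def sprawling_tool(a: int, b: int, c: int, d: int, e: int, f: int, g: int) -> str:
    """Do many things at once."""
    return "done"

print(fits_rule_of_thumb(get_weather))
print(fits_rule_of_thumb(sprawling_tool))
```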
+---
+## 🚀 The Complete Architecture
+```
+┌───────────────────────────────────────────────────────────┐
+│                    User Interface                          │
+│                   (Gradio Web App)                         │
+└───────────────────┬───────────────────────────────────────┘
+                    │
+                    ▼
+┌───────────────────────────────────────────────────────────┐
+│                   Agent Controller                           │
+│  ┌─────────────────────────────────────────────────────┐  │
+│  │  CodeAgent (Qwen3-1.7B)                              │  │
+│  │                                                      │  │
+│  │  System Prompt:                                      │  │
+│  │  "You are an AI assistant. Use available tools       │  │
+│  │   to solve problems. Write Python code."            │  │
+│  │                                                      │  │
+│  │  Memory: Conversation history + tool results          │  │
+│  └─────────────────────────────────────────────────────┘  │
+│                           │                                │
+│  ┌────────────────────────┼────────────────────────────┐  │
+│  │                        ▼                             │  │
+│  │  ┌─────────────────────────────────────────────┐   │  │
+│  │  │         Tool Registry                        │   │  │
+│  │  │                                              │   │  │
+│  │  │  Core Tools          Custom Tools            │   │  │
+│  │  │  ├─ read_file        ├─ get_weather          │   │  │
+│  │  │  ├─ write_file       ├─ fetch_webpage        │   │  │
+│  │  │  ├─ list_dir         ├─ analyze_csv           │   │  │
+│  │  │  ├─ shell_exec       ├─ generate_image         │   │  │
+│  │  │  ├─ web_search       ├─ create_presentation    │   │  │
+│  │  │  └─ python_exec      └─ [user adds more!]     │   │  │
+│  │  │                                              │   │  │
+│  │  │  MCP Servers (external):                     │   │  │
+│  │  │  ├─ github-mcp-server                         │   │  │
+│  │  │  ├─ slack-mcp-server                          │   │  │
+│  │  │  └─ [any MCP server]                          │   │  │
+│  │  └─────────────────────────────────────────────┘   │  │
+│  │                        │                         │  │
+│  │                        ▼                         │  │
+│  │  ┌─────────────────────────────────────────────┐   │  │
+│  │  │         Tool Implementations               │   │  │
+│  │  │                                              │   │  │
+│  │  │  Python Libraries:                           │   │  │
+│  │  │  ├─ requests (HTTP)                          │   │  │
+│  │  │  ├─ pandas (data)                            │   │  │
+│  │  │  ├─ matplotlib (charts)                      │   │  │
+│  │  │  ├─ selenium/helium (browser)                │   │  │
+│  │  │  ├─ diffusers (image gen)                    │   │  │
+│  │  │  └─ [any Python library!]                    │   │  │
+│  │  │                                              │   │  │
+│  │  │  System Tools:                               │   │  │
+│  │  │  ├─ git                                      │   │  │
+│  │  │  ├─ ffmpeg                                   │   │  │
+│  │  │  └─ [any CLI tool!]                          │   │  │
+│  │  └─────────────────────────────────────────────┘   │  │
+│  └────────────────────────────────────────────────────┘  │
+└───────────────────────────────────────────────────────────┘
+```
+---
+## 📝 Summary: Adding Tools Is Just Python
+| Step | What You Do | Time |
+|------|-------------|------|
+| 1 | Write a Python function with `@tool` decorator | 5 min |
+| 2 | Write a good docstring (this teaches the LLM!) | 5 min |
+| 3 | Add it to your agent's tools list | 1 min |
+| 4 | Test it | 5 min |
+| **Total** | | **16 min per tool** |
+**No retraining. No model changes. Just write Python.**
+---
+## 🎓 Key Takeaways
+1. **Tools are just Python functions** — write them, decorate with `@tool`, done
+2. **The LLM learns from docstrings** — the description teaches the model how to use it
+3. **No retraining needed** — add/remove tools anytime
+4. **MCP servers = pre-built tools** — 1000+ available, install and use
+5. **CodeAgent writes Python to use tools** — more flexible than JSON tool calls
+6. **A 1.7B model handles most simple tools**: anything with a clear description and 1-5 params (an expectation based on the capability table above, not a measured figure)
+7. **Dynamic loading** — tool marketplace concept: enable/disable tools via config
+---
+## 🔜 How This Changes Our Project
+### Original Plan
+- Train model to generate JSON tool calls (MCP format)
+- Build manual ReAct loop
+- Hardcode tool registry
+- Limited to trained tools
+### New Plan (Based on Research)
+- Train model to solve problems by writing Python (it already knows Python!)
+- Use smolagents CodeAgent (handles ReAct loop)
+- Dynamic tool registration via `@tool` decorator
+- Unlimited tools — add any Python function anytime
+- Leverage 1000+ MCP servers
+- Use built-in GradioUI for the web app
+### Training Focus Changes
+**Instead of teaching:** "Generate JSON tool calls in MCP format"
+**We teach:** "Break problems into steps, write Python code, use available functions"
+**Benefits:**
+- Less training data needed (model already knows Python)
+- More flexible (any tool works, not just trained ones)
+- Easier to add tools later (just write Python)
+- More natural for the model (code is easier than JSON schemas)
+---
+*This is the final piece of our planning. You now have the complete picture: vision, research, architecture, training, dataset, execution plan, tool ecosystem, and dynamic tool loading.*
+**When you're ready: say "START" and we build! 🚀**