	# 08 — Dynamic Tool Ecosystem: How to Add ANY Tool

	## 🎯 What This Chapter Covers

	- How tools are registered in smolagents
	- How to add new tools without retraining the model
	- The tool marketplace concept (1000+ MCP servers)
	- How the agent discovers and uses new tools automatically
	- Architecture for a "tool marketplace" in our agent harness

	---

	## 🧩 The Core Principle: Pattern Over Specifics

	The #1 insight from our research: Our 1.7B model doesn't need to know about SPECIFIC tools. It needs to know the PATTERN of using tools.

	Think of it like this:
	- Bad approach: Train model on "how to use Tool A, Tool B, Tool C..."
	- Good approach: Train model on "how to write Python code that solves problems"

	The model already knows Python (Qwen3 was trained on code). We just need to teach it to:
	1. Break problems into steps
	2. Use available Python libraries/functions
	3. Handle errors and try alternatives

	Result: You can add ANY new tool (any Python function) and the model will figure out how to use it.
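
As a rough illustration of step 3 ("handle errors and try alternatives"), the code an agent writes might look like the sketch below. Both `fetch_primary` and `fetch_backup` are hypothetical stand-ins, not real tools from this project.

```python
def fetch_primary(city: str) -> str:
    """Hypothetical tool that always fails, standing in for a flaky API."""
    raise ConnectionError("primary source unavailable")


def fetch_backup(city: str) -> str:
    """Hypothetical fallback tool with a fixed, made-up answer."""
    return f"{city}: 18°C (backup source)"


def solve(city: str) -> str:
    # Try the preferred function first.
    try:
        return fetch_primary(city)
    except ConnectionError as err:
        # On failure, note the error and switch to an alternative.
        print(f"primary failed: {err}")
        return fetch_backup(city)


print(solve("London"))
```

The point is only the shape: try, observe the error, pick another function.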

	---

	## 🔧 How Tool Registration Works in smolagents

	### The Simple Way: @tool Decorator

	```python
	from smolagents import tool

	@tool
	def my_awesome_tool(input_param: str) -> str:
	    """
	    What this tool does (this becomes the "instruction manual" for the LLM).

	    Args:
	        input_param: What this parameter means
	    """
	    # Your code here (do_something is a placeholder for your own logic)
	    result = do_something(input_param)
	    return result
	```

	That's it. The `@tool` decorator automatically:
	1. Reads the function name → becomes the tool name
	2. Reads the docstring → becomes the tool description (shown to the LLM); each argument needs its own entry under `Args:`
	3. Reads type hints → becomes the parameter schema
	4. Wraps the function in a `Tool` object that you pass to the agent in its `tools=[...]` list
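
To make the decorator less magical, here is a minimal sketch of how this kind of metadata can be pulled from a plain function with the standard library. It is not smolagents' actual implementation; `describe_function` and `greet` are names made up for this example.

```python
import inspect
from typing import get_type_hints


def describe_function(func):
    """Collect name, description, and a simple parameter schema from a function."""
    hints = get_type_hints(func)
    return_type = hints.pop("return", None)
    parameters = {}
    for name, param in inspect.signature(func).parameters.items():
        parameters[name] = {
            "type": hints[name].__name__ if name in hints else "any",
            # A parameter without a default value must be supplied by the caller.
            "required": param.default is inspect.Parameter.empty,
        }
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": parameters,
        "returns": return_type.__name__ if return_type else "any",
    }


def greet(name: str, greeting: str = "Hello") -> str:
    """Build a short greeting."""
    return f"{greeting}, {name}!"


spec = describe_function(greet)
print(spec["name"], spec["parameters"])
```

Everything the LLM needs (name, description, parameters) is already in the function definition, which is why a good docstring and type hints matter so much.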

	### Example: Adding a Weather Tool

	```python
	from smolagents import tool
	import requests

	@tool
	def get_weather(city: str, country_code: str = "") -> str:
	    """
	    Get current weather for a city. Returns temperature, conditions, and humidity.

	    Args:
	        city: The city name (e.g., "London", "New York")
	        country_code: Optional 2-letter country code (e.g., "US", "GB")
	    """
	    url = "https://api.openweathermap.org/data/2.5/weather"
	    params = {
	        "q": f"{city},{country_code}" if country_code else city,
	        "appid": "YOUR_API_KEY",  # replace with your own key
	        "units": "metric",
	    }
	    response = requests.get(url, params=params, timeout=30)
	    response.raise_for_status()  # surface HTTP errors instead of a KeyError below
	    data = response.json()

	    temp = data["main"]["temp"]
	    conditions = data["weather"][0]["description"]
	    humidity = data["main"]["humidity"]

	    return f"Weather in {city}: {temp}°C, {conditions}, humidity {humidity}%"
	```

	What the LLM sees (roughly; the exact prompt layout depends on the smolagents version):
	```
	You have access to the following tools:

	- get_weather(city: str, country_code: str = "")
	  Get current weather for a city. Returns temperature, conditions, and humidity.
	  Args:
	    city: The city name (e.g., "London", "New York")
	    country_code: Optional 2-letter country code (e.g., "US", "GB")
	```

	The LLM works out how to call the tool from this description alone. No training needed!
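
A minimal sketch of how such a tool listing could be assembled from plain functions. The layout is an assumption for illustration and not the exact prompt that smolagents builds; `render_tool_prompt` is a hypothetical helper, and the `get_weather` here is a stub without any network call.

```python
import inspect


def render_tool_prompt(tools):
    """Build a plain-text tool listing from function signatures and docstrings."""
    lines = ["You have access to the following tools:", ""]
    for func in tools:
        lines.append(f"- {func.__name__}{inspect.signature(func)}")
        doc = inspect.getdoc(func) or "(no description)"
        # Indent the docstring under the signature line.
        lines.extend(f"  {doc_line}" for doc_line in doc.splitlines())
    return "\n".join(lines)


def get_weather(city: str, country_code: str = "") -> str:
    """Get current weather for a city.

    Args:
        city: The city name
        country_code: Optional 2-letter country code
    """
    return f"Weather in {city}"


prompt = render_tool_prompt([get_weather])
print(prompt)
```

Adding a tool just appends another entry to this text, which is why no retraining is involved.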

	---

	## 📦 The "Tool Marketplace" Concept

	### How It Works

	```
	┌─────────────────────────────────────────────┐
	│              Tool Marketplace               │
	│                                             │
	│  ┌──────────┐  ┌──────────┐  ┌──────────┐   │
	│  │ Weather  │  │ Finance  │  │ Social   │   │
	│  │ Tool     │  │ Tool     │  │ Media    │   │
	│  │ (Free)   │  │ (Free)   │  │ (Free)   │   │
	│  └────┬─────┘  └────┬─────┘  └────┬─────┘   │
	│       │             │             │         │
	│  ┌──────────┐  ┌──────────┐  ┌──────────┐   │
	│  │ Browser  │  │ GitHub   │  │ Image    │   │
	│  │ Tool     │  │ Tool     │  │ Gen Tool │   │
	│  │ (Built)  │  │ (Built)  │  │ (Built)  │   │
	│  └────┬─────┘  └────┬─────┘  └────┬─────┘   │
	│       │             │             │         │
	│  ┌──────────┐  ┌──────────┐  ┌──────────┐   │
	│  │ Database │  │ Email    │  │ Calendar │   │
	│  │ Tool     │  │ Tool     │  │ Tool     │   │
	│  │ (Built)  │  │ (Built)  │  │ (Built)  │   │
	│  └────┬─────┘  └────┬─────┘  └────┬─────┘   │
	│       │             │             │         │
	│       └──────┬──────┴──────┬──────┘         │
	│              │             │                │
	│              ▼             ▼                │
	│         ┌─────────────────────────┐         │
	│         │    Agent Tool Loader    │         │
	│         │    (User picks which    │         │
	│         │     tools to enable)    │         │
	│         └────────────┬────────────┘         │
	│                      │                      │
	│                      ▼                      │
	│         ┌─────────────────────────┐         │
	│         │  CodeAgent with Tools   │         │
	│         │  (Model sees all enabled│         │
	│         │   tool descriptions)    │         │
	│         └─────────────────────────┘         │
	└─────────────────────────────────────────────┘
	```

	### Adding a Tool Is Just Installing a Package

	```bash
	# Install weather tool
	pip install some-weather-library

	# Expose it to the agent
	# (Wrap the library call in an @tool function and add it to the agent's tools list)
	```

	Or for MCP servers:
	```bash
	# Install MCP server
	npm install -g @some-org/mcp-weather

	# Register in agent
	# Agent discovers tools from the MCP server's tool definitions
	```

	---

	## 🛠️ Building Our Tool Ecosystem

	### Phase 1: Core Tools (Built Into Agent)

	These are always available — the foundation:

	```python
	# core_tools.py
	import os
	import subprocess

	from smolagents import tool

	@tool
	def read_file(file_path: str, max_chars: int = 10000) -> str:
	    """Read contents of a file.

	    Args:
	        file_path: Path of the file to read
	        max_chars: Maximum number of characters to return
	    """
	    with open(file_path, 'r') as f:
	        return f.read(max_chars)

	@tool
	def write_file(file_path: str, content: str) -> str:
	    """Write content to a file.

	    Args:
	        file_path: Path of the file to write
	        content: Text to write
	    """
	    os.makedirs(os.path.dirname(file_path) or '.', exist_ok=True)
	    with open(file_path, 'w') as f:
	        f.write(content)
	    return f"Written {len(content)} chars to {file_path}"

	@tool
	def list_directory(path: str = '.') -> str:
	    """List files and folders in a directory.

	    Args:
	        path: Directory to list
	    """
	    entries = os.listdir(path)
	    return "\n".join(sorted(entries))

	@tool
	def execute_shell(command: str) -> str:
	    """Execute a shell command and return its output. Not sandboxed: use only in a trusted environment.

	    Args:
	        command: The shell command to run
	    """
	    result = subprocess.run(command, shell=True, capture_output=True, text=True, timeout=30)
	    return result.stdout + result.stderr
	```
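
Since `execute_shell` hands the whole string to a shell, it is only as safe as the environment it runs in. One possible tightening, sketched below under the assumption that the agent needs only a handful of programs, is to skip the shell and check an allowlist. `ALLOWED_COMMANDS` and `run_allowed` are hypothetical names, not part of smolagents.

```python
import shlex
import subprocess

# Hypothetical allowlist; adjust to the programs your agent really needs.
ALLOWED_COMMANDS = {"ls", "cat", "echo", "git"}


def run_allowed(command: str, timeout: int = 30) -> str:
    """Run a command without a shell, only if its program is allowlisted."""
    args = shlex.split(command)
    if not args or args[0] not in ALLOWED_COMMANDS:
        program = args[0] if args else "(empty)"
        return f"Refused: {program} is not an allowed command"
    try:
        # Passing a list (no shell=True) means pipes and ';' are not interpreted.
        result = subprocess.run(args, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return f"Timed out after {timeout}s"
    return result.stdout + result.stderr


print(run_allowed("rm -rf /tmp/example"))
```

Returning the refusal as text (instead of raising) lets the agent read it as an observation and try something else.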

	### Phase 2: Web Tools (Internet Access)

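	```python
	# web_tools.py
	from smolagents import tool
	import requests
	from bs4 import BeautifulSoup

	@tool
	def fetch_webpage(url: str) -> str:
	    """Fetch and extract text content from a webpage.

	    Args:
	        url: Address of the page to fetch
	    """
	    response = requests.get(url, timeout=30)
	    response.raise_for_status()
	    soup = BeautifulSoup(response.text, 'html.parser')
	    # Remove scripts and styles
	    for script in soup(["script", "style"]):
	        script.decompose()
	    return soup.get_text()[:10000]

	@tool
	def web_search(query: str, num_results: int = 5) -> str:
	    """Search the web using DuckDuckGo.

	    Args:
	        query: The search query
	        num_results: Maximum number of results to return
	    """
	    # Using DuckDuckGo's HTML interface; params= URL-encodes the query
	    response = requests.get("https://html.duckduckgo.com/html/", params={"q": query}, timeout=30)
	    response.raise_for_status()
	    soup = BeautifulSoup(response.text, 'html.parser')
	    # Assumption: result links carry the "result__a" class; adjust if the markup differs
	    links = soup.select("a.result__a")[:num_results]
	    formatted_results = "\n".join(f"{a.get_text(strip=True)}: {a.get('href')}" for a in links)
	    return formatted_results or "No results found"
	```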

	### Phase 3: Analysis Tools (Data Processing)

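	```python
	# analysis_tools.py
	from smolagents import tool
	import pandas as pd
	import matplotlib.pyplot as plt

	@tool
	def analyze_csv(file_path: str, query: str) -> str:
	    """Load a CSV file and return summary statistics computed with pandas.

	    Args:
	        file_path: Path to the CSV file
	        query: The question about the data (not used yet; only summary stats are returned)
	    """
	    df = pd.read_csv(file_path)
	    # Agent writes code to analyze
	    # Could generate summary stats, charts, etc.
	    return str(df.describe())

	@tool
	def create_chart(data_source: str, chart_type: str, output_path: str) -> str:
	    """Create a chart from a CSV file and save it as an image.

	    Args:
	        data_source: Path to a CSV file (assumed here: first column is x, second is y)
	        chart_type: One of "bar", "line", "pie", "scatter"
	        output_path: Where to save the image
	    """
	    df = pd.read_csv(data_source)
	    x_col, y_col = df.columns[:2]
	    if chart_type == "pie":
	        ax = df.set_index(x_col)[y_col].plot(kind="pie")
	    else:
	        ax = df.plot(kind=chart_type, x=x_col, y=y_col)
	    ax.figure.savefig(output_path)
	    plt.close(ax.figure)
	    return output_path
	```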

	### Phase 4: Creative Tools (Generation)

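	```python
	# creative_tools.py
	from smolagents import tool
	from diffusers import StableDiffusionPipeline
	import torch

	# Load model once at startup
	pipe = StableDiffusionPipeline.from_pretrained(
	    "runwayml/stable-diffusion-v1-5",
	    torch_dtype=torch.float16
	).to("cuda")

	@tool
	def generate_image(prompt: str, output_path: str = "generated.png") -> str:
	    """Generate an image from a text description.

	    Args:
	        prompt: Text description of the image
	        output_path: Where to save the generated image
	    """
	    image = pipe(prompt, num_inference_steps=20).images[0]
	    image.save(output_path)
	    return output_path

	@tool
	def generate_code(language: str, task: str, output_path: str) -> str:
	    """Generate code for a specific task.

	    Args:
	        language: Programming language to generate
	        task: Description of what the code should do
	        output_path: File to save the generated code to
	    """
	    # Uses the LLM itself to generate code
	    # Then saves to file
	    # Stub: the model call is not wired up yet, so fail loudly instead of returning a path
	    raise NotImplementedError("generate_code needs to be connected to the model")
	```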

	---

	## 🧠 How the Agent Uses Tools It Never Saw Before

	### Example: User Adds a "Crypto Price" Tool

	Step 1: User installs the tool
	```bash
	pip install crypto-price-library
	```

	Step 2: User writes the tool wrapper
	```python
	from smolagents import tool
	import crypto_price

	@tool
	def get_crypto_price(symbol: str, currency: str = "USD") -> str:
	    """
	    Get the current price of a cryptocurrency.

	    Args:
	        symbol: The cryptocurrency symbol (e.g., "BTC", "ETH", "SOL")
	        currency: The currency to convert to (default: "USD")
	    """
	    price = crypto_price.get_current(symbol, currency)
	    return f"{symbol}: ${price} {currency}"
	```

	Step 3: Register with agent
	```python
	from smolagents import CodeAgent
	from my_tools import get_crypto_price, read_file, write_file, web_search

	agent = CodeAgent(
	    model=my_model,  # your loaded model object
	    tools=[get_crypto_price, read_file, write_file, web_search],
	)
	```

	Step 4: The agent automatically learns

	User: "What's the price of Bitcoin and should I invest?"

	Agent's system prompt now includes:
	"- get_crypto_price(symbol: str, currency: str = 'USD')
	Get the current price of a cryptocurrency."

	Agent thinks (via CodeAgent pattern):
	"User wants Bitcoin price. I have get_crypto_price tool.
	I'll call it with symbol='BTC'."

	Agent generates Python code:
	```python
	btc_price = get_crypto_price("BTC", "USD")
	print(btc_price)
	```

	Result: "BTC: $67,420 USD"

	Agent then might:
	"User also asked if they should invest. I should do more research.
	Let me search for recent Bitcoin news and analysis."

	Agent generates:
	```python
	news = web_search("Bitcoin investment analysis 2025")
	print(news)
	```

	Agent synthesizes final answer:
	"Bitcoin is currently $67,420. Recent analysis suggests...
	[summary of research]"

	No retraining needed. The model learns to use the tool from its name, description, and parameter hints.
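
Under the hood, the loop boils down to: run the generated snippet with the tools in scope, capture what it prints, and feed that back as an observation. Here is a deliberately simplified sketch of that one step. smolagents uses its own restricted interpreter for this; the plain `exec` below is for illustration only, and `run_generated_code` plus the fixed-price stub are assumptions made for this example.

```python
import contextlib
import io


def get_crypto_price(symbol: str, currency: str = "USD") -> str:
    """Stand-in tool returning a fixed, made-up price (no network call)."""
    return f"{symbol}: 100.0 {currency}"


def run_generated_code(code: str, tools: dict) -> str:
    """Execute model-written code with the tools exposed as plain names."""
    namespace = dict(tools)  # tools become callable globals for the snippet
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code, namespace)  # unsafe outside a sandbox; illustration only
    return buffer.getvalue()


generated = 'price = get_crypto_price("BTC")\nprint(price)'
observation = run_generated_code(generated, {"get_crypto_price": get_crypto_price})
print(observation)
```

Because tools are just names in the namespace, registering a new one changes nothing about how the code is executed.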

	---

	## 🌐 The MCP Server Universe

	### What Are MCP Servers?

	MCP (Model Context Protocol) servers are pre-built tool packages that expose tools in a standard format. Think of them like "apps" for your agent.

	There are 1000+ MCP servers covering a wide range of domains:

	| Category | Example Servers | What They Do |
	|----------|-----------------|--------------|
	| Web | firecrawl, browser-use, playwright | Web scraping, browsing |
	| Code | github, git, code-index | Repo analysis, code search |
	| Data | postgres, sqlite, duckdb | Database queries |
	| Memory | chroma, mem0 | Long-term memory |
	| Comm | slack, gmail, discord | Messaging |
	| Dev | kubernetes, docker, aws | Infrastructure |
	| Creative | comfyui, image-gen | Image/video generation |
	| Research | perplexity, arxiv | Academic search |

	### How to Use MCP Servers

	```python
	# Install MCP server
	# npm install -g @modelcontextprotocol/server-filesystem

	# In Python, use the MCP client
	import asyncio

	from mcp import ClientSession, StdioServerParameters
	from mcp.client.stdio import stdio_client

	# Connect to MCP server
	server_params = StdioServerParameters(
	    command="npx",
	    args=["-y", "@modelcontextprotocol/server-filesystem", "/home/user"]
	)

	async def main():
	    async with stdio_client(server_params) as (read, write):
	        async with ClientSession(read, write) as session:
	            # The handshake must complete before any other request
	            await session.initialize()

	            # List available tools
	            tools = await session.list_tools()

	            # Call a tool
	            result = await session.call_tool("read_file", {"path": "/home/user/doc.txt"})

	asyncio.run(main())
	```

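	The MCP server exposes its tools as named definitions (name, description, input schema), not as Python functions. smolagents can wrap those definitions as regular tools, for example through `ToolCollection.from_mcp`, so the agent calls them like any other function.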
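
To show what "wrapping" means here, the sketch below turns one tool definition (using the `name`, `description`, and `inputSchema` fields of an MCP tool listing) into a callable Python function. `make_tool_function` and `fake_call_tool` are hypothetical helpers; the fake stands in for `session.call_tool` so the example runs without a server.

```python
def make_tool_function(definition: dict, call_tool):
    """Wrap one MCP-style tool definition in a plain Python function."""
    name = definition["name"]
    schema = definition.get("inputSchema", {})
    required = set(schema.get("required", []))

    def tool_function(**kwargs):
        # Check required arguments locally before forwarding the call.
        missing = required - set(kwargs)
        if missing:
            raise TypeError(f"{name} missing arguments: {sorted(missing)}")
        return call_tool(name, kwargs)

    tool_function.__name__ = name
    tool_function.__doc__ = definition.get("description", "")
    return tool_function


# Definition shaped like one entry of a tool listing.
read_file_def = {
    "name": "read_file",
    "description": "Read a file from disk.",
    "inputSchema": {
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"],
    },
}


def fake_call_tool(name, arguments):
    """Stand-in for a real MCP call; just echoes what it received."""
    return f"called {name} with {arguments}"


read_file = make_tool_function(read_file_def, fake_call_tool)
print(read_file(path="/home/user/doc.txt"))
```

The name and description carried over onto the wrapper are exactly what ends up in the agent's tool listing.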

	---

	## 🎛️ Dynamic Tool Loading: The "Plugin System"

	### Architecture for Loading Tools at Runtime

	```python
	# tool_loader.py
	import os
	import importlib.util
	from smolagents import CodeAgent, Tool

	def load_tools_from_directory(directory: str):
	    """Dynamically load all tools from a directory."""
	    tools = []

	    for filename in sorted(os.listdir(directory)):
	        if filename.endswith('_tools.py'):
	            module_name = filename[:-3]  # Remove .py
	            # Import by file path so the directory need not be an importable package
	            spec = importlib.util.spec_from_file_location(module_name, os.path.join(directory, filename))
	            module = importlib.util.module_from_spec(spec)
	            spec.loader.exec_module(module)

	            # Find all @tool decorated functions (the decorator returns Tool instances)
	            for attr_name in dir(module):
	                attr = getattr(module, attr_name)
	                if isinstance(attr, Tool):
	                    tools.append(attr)

	    return tools

	# Usage
	from core_tools import read_file, write_file  # assumes core_tools.py lives outside ./tools

	custom_tools = load_tools_from_directory('./tools')

	agent = CodeAgent(
	    model=my_model,  # your loaded model object
	    tools=custom_tools + [read_file, write_file],  # Core + custom
	)
	```

	### Tool Configuration File

	Users can enable/disable tools via a config:

	```json
	{
	  "agent_name": "My Mini-Manus",
	  "enabled_tools": [
	    "core:read_file",
	    "core:write_file",
	    "core:execute_shell",
	    "web:fetch_webpage",
	    "web:web_search",
	    "analysis:analyze_csv",
	    "creative:generate_image",
	    "mcp:github",
	    "mcp:slack"
	  ],
	  "max_iterations": 10,
	  "model": "muhammadtlha944/MCP-Agent-1.7B"
	}
	```
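
One way such a config could be turned into a tool list is sketched below. The `REGISTRY` mapping, the `select_tools` helper, and the stub tools are assumptions for illustration; only the `namespace:name` id format comes from the config above.

```python
import json

CONFIG_TEXT = """
{
  "enabled_tools": ["core:read_file", "web:web_search", "mcp:github"],
  "max_iterations": 10
}
"""


def read_file(path: str) -> str:
    """Stand-in for the real core tool."""
    return ""


def web_search(query: str) -> str:
    """Stand-in for the real web tool."""
    return ""


def execute_shell(command: str) -> str:
    """Stand-in for a tool that this config leaves disabled."""
    return ""


# Hypothetical registry keyed by the same "namespace:name" ids as the config.
REGISTRY = {
    "core:read_file": read_file,
    "core:execute_shell": execute_shell,
    "web:web_search": web_search,
}


def select_tools(config: dict, registry: dict):
    """Split enabled ids into local tools, MCP server names, and unknown ids."""
    local_tools, mcp_servers, unknown = [], [], []
    for tool_id in config.get("enabled_tools", []):
        namespace, _, name = tool_id.partition(":")
        if namespace == "mcp":
            mcp_servers.append(name)  # connected separately via an MCP client
        elif tool_id in registry:
            local_tools.append(registry[tool_id])
        else:
            unknown.append(tool_id)
    return local_tools, mcp_servers, unknown


config = json.loads(CONFIG_TEXT)
local_tools, mcp_servers, unknown = select_tools(config, REGISTRY)
print([t.__name__ for t in local_tools], mcp_servers, unknown)
```

Reporting unknown ids instead of silently dropping them makes typos in the config easy to spot.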

	---

	## 📊 Tool Complexity vs Model Capability

	### What a 1.7B Model Can Handle

	| Tool Complexity | Can Use? | Notes |
	|-----------------|----------|-------|
	| Simple function (1 param, 1 return) | ✅ Yes | Easy; the model gets it from the description |
	| Multi-param function (3-5 params) | ✅ Yes | With clear descriptions |
	| Chain of 2-3 tools | ✅ Yes | With ReAct loop |
	| Chain of 5+ tools | ⚠️ Maybe | Depends on context length |
	| Complex logic (loops, if/else) | ✅ Yes | CodeAgent handles this well |
	| API calls with auth | ✅ Yes | If keys are pre-configured |
	| Browser automation | ✅ Yes | With Helium/Selenium abstraction |
	| Vision/image understanding | ⚠️ Maybe | Needs vision model (adds VRAM) |
	| Real-time streaming | ❌ No | Too complex for 1.7B |
	| Multi-agent coordination | ⚠️ Maybe | smolagents multi-agent can help |

	### Rule of Thumb

	If you can describe the tool in 2-3 sentences and it has 1-5 parameters,
	a 1.7B model can learn to use it from the description alone.
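
If you want to check new tools against this rule of thumb automatically, a rough lint could look like the sketch below. The thresholds mirror the rule above; the sentence counting is a crude heuristic, and `fits_rule_of_thumb` is a name made up for this example.

```python
import inspect
import re


def fits_rule_of_thumb(func, max_params: int = 5, max_sentences: int = 3) -> bool:
    """Check parameter count and description length against the rule of thumb."""
    param_count = len(inspect.signature(func).parameters)
    doc = inspect.getdoc(func) or ""
    summary = doc.split("Args:")[0].strip()  # description part before the Args section
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", summary) if s]
    return 1 <= param_count <= max_params and 1 <= len(sentences) <= max_sentences


def get_crypto_price(symbol: str, currency: str = "USD") -> str:
    """Get the current price of a cryptocurrency.

    Args:
        symbol: The cryptocurrency symbol
        currency: The currency to convert to
    """
    return ""


def do_everything(a, b, c, d, e, f, g):
    """Does many things."""
    return None


print(fits_rule_of_thumb(get_crypto_price), fits_rule_of_thumb(do_everything))
# prints: True False
```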

	---

	## 🚀 The Complete Architecture

	```
	┌───────────────────────────────────────────────────────────┐
	│                      User Interface                       │
	│                     (Gradio Web App)                      │
	└─────────────────────────────┬─────────────────────────────┘
	                              │
	                              ▼
	┌───────────────────────────────────────────────────────────┐
	│                     Agent Controller                      │
	│  ┌─────────────────────────────────────────────────────┐  │
	│  │               CodeAgent (Qwen3-1.7B)                │  │
	│  │                                                     │  │
	│  │  System Prompt:                                     │  │
	│  │  "You are an AI assistant. Use available tools      │  │
	│  │   to solve problems. Write Python code."            │  │
	│  │                                                     │  │
	│  │  Memory: Conversation history + tool results        │  │
	│  └──────────────────────────┬──────────────────────────┘  │
	│                             │                             │
	│  ┌──────────────────────────┼──────────────────────────┐  │
	│  │                          ▼                          │  │
	│  │   ┌─────────────────────────────────────────────┐   │  │
	│  │   │                Tool Registry                │   │  │
	│  │   │                                             │   │  │
	│  │   │  Core Tools          Custom Tools           │   │  │
	│  │   │  ├─ read_file        ├─ get_weather         │   │  │
	│  │   │  ├─ write_file       ├─ fetch_webpage       │   │  │
	│  │   │  ├─ list_dir         ├─ analyze_csv         │   │  │
	│  │   │  ├─ shell_exec       ├─ generate_image      │   │  │
	│  │   │  ├─ web_search       ├─ create_presentation │   │  │
	│  │   │  └─ python_exec      └─ [user adds more!]   │   │  │
	│  │   │                                             │   │  │
	│  │   │  MCP Servers (external):                    │   │  │
	│  │   │  ├─ github-mcp-server                       │   │  │
	│  │   │  ├─ slack-mcp-server                        │   │  │
	│  │   │  └─ [any MCP server]                        │   │  │
	│  │   └──────────────────────┬──────────────────────┘   │  │
	│  │                          │                          │  │
	│  │                          ▼                          │  │
	│  │   ┌─────────────────────────────────────────────┐   │  │
	│  │   │            Tool Implementations             │   │  │
	│  │   │                                             │   │  │
	│  │   │  Python Libraries:                          │   │  │
	│  │   │  ├─ requests (HTTP)                         │   │  │
	│  │   │  ├─ pandas (data)                           │   │  │
	│  │   │  ├─ matplotlib (charts)                     │   │  │
	│  │   │  ├─ selenium/helium (browser)               │   │  │
	│  │   │  ├─ diffusers (image gen)                   │   │  │
	│  │   │  └─ [any Python library!]                   │   │  │
	│  │   │                                             │   │  │
	│  │   │  System Tools:                              │   │  │
	│  │   │  ├─ git                                     │   │  │
	│  │   │  ├─ ffmpeg                                  │   │  │
	│  │   │  └─ [any CLI tool!]                         │   │  │
	│  │   └─────────────────────────────────────────────┘   │  │
	│  └─────────────────────────────────────────────────────┘  │
	└───────────────────────────────────────────────────────────┘
	```

	---

	## 📝 Summary: Adding Tools Is Just Python

	| Step | What You Do | Time |
	|------|-------------|------|
	| 1 | Write a Python function with `@tool` decorator | 5 min |
	| 2 | Write a good docstring (this teaches the LLM!) | 5 min |
	| 3 | Add it to your agent's tools list | 1 min |
	| 4 | Test it | 5 min |
	| Total | | 16 min per tool |

	No retraining. No model changes. Just write Python.

	---

	## 🎓 Key Takeaways

	1. Tools are just Python functions — write them, decorate with `@tool`, done
	2. The LLM learns from docstrings — the description teaches the model how to use it
	3. No retraining needed — add/remove tools anytime
	4. MCP servers = pre-built tools — 1000+ available, install and use
	5. CodeAgent writes Python to use tools — more flexible than JSON tool calls
	6. 1.7B model handles most tools (anything with a clear description and 1-5 params)
	7. Dynamic loading — tool marketplace concept: enable/disable tools via config

	---

	## 🔜 How This Changes Our Project

	### Original Plan
	- Train model to generate JSON tool calls (MCP format)
	- Build manual ReAct loop
	- Hardcode tool registry
	- Limited to trained tools

	### New Plan (Based on Research)
	- Train model to solve problems by writing Python (it already knows Python!)
	- Use smolagents CodeAgent (handles ReAct loop)
	- Dynamic tool registration via `@tool` decorator
	- Unlimited tools — add any Python function anytime
	- Leverage 1000+ MCP servers
	- Use built-in GradioUI for the web app

	### Training Focus Changes

	Instead of teaching: "Generate JSON tool calls in MCP format"
	We teach: "Break problems into steps, write Python code, use available functions"

	Benefits:
	- Less training data needed (model already knows Python)
	- More flexible (any tool works, not just trained ones)
	- Easier to add tools later (just write Python)
	- More natural for the model (code is easier than JSON schemas)
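
To make the contrast concrete, here is a sketch of the same small task in both styles. The tool, its fixed prices, and the JSON call format are hypothetical stand-ins; real harnesses differ in the details.

```python
import json


def get_crypto_price(symbol: str) -> float:
    """Stand-in tool with fixed, made-up prices (no network access)."""
    return {"BTC": 100.0, "ETH": 10.0}[symbol]


TOOLS = {"get_crypto_price": get_crypto_price}

# JSON style: typically one tool call per model turn. The harness dispatches
# each call, and comparing the two results would need yet another turn.
json_calls = [
    '{"tool": "get_crypto_price", "arguments": {"symbol": "BTC"}}',
    '{"tool": "get_crypto_price", "arguments": {"symbol": "ETH"}}',
]
json_results = []
for raw in json_calls:
    call = json.loads(raw)
    json_results.append(TOOLS[call["tool"]](**call["arguments"]))

# Code style: a single generated snippet can loop, compare, and format.
prices = {symbol: get_crypto_price(symbol) for symbol in ["BTC", "ETH"]}
ratio = prices["BTC"] / prices["ETH"]
summary = f"BTC is {ratio:.1f}x the price of ETH"
print(json_results, summary)
```

The code version does the lookup, the comparison, and the formatting in one step, which is the flexibility the new plan is betting on.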

	---

	This is the final piece of our planning. You now have the complete picture: vision, research, architecture, training, dataset, execution plan, tool ecosystem, and dynamic tool loading.

	When you're ready: say "START" and we build! 🚀