muhammadtlha944 committed on
Commit
ee54ebc
·
verified ·
1 Parent(s): 2a34d74

Add Colab training notebook (free GPU)

Files changed (1)
  1. MCP_Agent_1_7B_Training.ipynb +516 -0
MCP_Agent_1_7B_Training.ipynb ADDED
@@ -0,0 +1,516 @@
+ {
+ "nbformat": 4,
+ "nbformat_minor": 0,
+ "metadata": {
+ "colab": {
+ "provenance": [],
+ "gpuType": "T4"
+ },
+ "kernelspec": {
+ "name": "python3",
+ "display_name": "Python 3"
+ },
+ "language_info": {
+ "name": "python"
+ },
+ "accelerator": "GPU"
+ },
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# 🤖 MCP-Agent-1.7B — Training Notebook\n",
+ "\n",
+ "**What we're building:** The first open-source small language model that natively speaks the [Model Context Protocol (MCP)](https://modelcontextprotocol.io/). It plans and executes multi-step tool chains with DAG dependencies.\n",
+ "\n",
+ "**Base model:** [Qwen/Qwen3-1.7B](https://huggingface.co/Qwen/Qwen3-1.7B) (2B params, Apache 2.0)\n",
+ "\n",
+ "**Method:** LoRA SFT (rank=16, all linear layers)\n",
+ "\n",
+ "**Cost:** $0 (Google Colab free T4 GPU)\n",
+ "\n",
+ "**Time:** ~2 hours\n",
+ "\n",
+ "---\n",
+ "\n",
+ "## 🎓 ML Concepts You'll Learn\n",
+ "1. **LoRA** — How to fine-tune a 2B model by training only ~2% of its parameters\n",
+ "2. **SFT** — Supervised Fine-Tuning: teaching a model with input→output examples\n",
+ "3. **fp16/bf16** — Half-precision training to cut memory usage roughly in half (we use fp16, since the T4 has no native bf16)\n",
+ "4. **Gradient Checkpointing** — Trading compute for memory\n",
+ "5. **Cosine LR Schedule** — Why we slow down learning over time\n",
+ "\n",
+ "---\n",
+ "\n",
+ "⚡ **Before you start:** Go to `Runtime → Change runtime type → T4 GPU`"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 0: Verify GPU & Install Dependencies\n",
+ "\n",
+ "🎓 **What's happening:** We check that Colab gave us a GPU, then install the ML libraries.\n",
+ "- `transformers` — HuggingFace's core library for loading/using AI models\n",
+ "- `trl` — Training library specifically for fine-tuning language models (SFT, RLHF, DPO)\n",
+ "- `peft` — Parameter-Efficient Fine-Tuning (LoRA lives here)\n",
+ "- `datasets` — For loading our training data from HuggingFace Hub\n",
+ "- `accelerate` — Makes training work on any hardware (CPU, GPU, multi-GPU)\n",
+ "- `bitsandbytes` — Memory-efficient optimizers and quantization"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Check GPU — this MUST show \"Tesla T4\" or similar\n",
+ "!nvidia-smi\n",
+ "\n",
+ "import torch\n",
+ "print(f\"\\n✅ PyTorch version: {torch.__version__}\")\n",
+ "print(f\"✅ CUDA available: {torch.cuda.is_available()}\")\n",
+ "if torch.cuda.is_available():\n",
+ "    print(f\"✅ GPU: {torch.cuda.get_device_name(0)}\")\n",
+ "    print(f\"✅ VRAM: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB\")\n",
+ "else:\n",
+ "    raise RuntimeError(\"❌ No GPU! Go to Runtime → Change runtime type → T4 GPU\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Install all dependencies (takes ~2-3 minutes)\n",
+ "!pip install -q transformers trl peft datasets accelerate bitsandbytes huggingface_hub\n",
+ "print(\"\\n✅ All packages installed!\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 1: Login to HuggingFace\n",
+ "\n",
+ "🎓 **Why?** We need to:\n",
+ "1. Download Qwen3-1.7B from HuggingFace Hub\n",
+ "2. **Push our trained model** back to your HuggingFace account\n",
+ "\n",
+ "Get your token at: https://huggingface.co/settings/tokens (needs **Write** permission)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from huggingface_hub import notebook_login\n",
+ "notebook_login()  # Paste your HF token when prompted\n",
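+ "\n",
+ "# If the interactive widget doesn't render in your session, huggingface_hub's\n",
+ "# login() is a non-interactive fallback. Sketch only; the token is a placeholder:\n",
+ "# from huggingface_hub import login\n",
+ "# login(token=\"hf_...\")"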
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 2: Load Dataset\n",
+ "\n",
+ "🎓 **What's our data?** 16,520 conversations teaching the model to:\n",
+ "- Call tools using the MCP protocol (JSON-RPC format)\n",
+ "- Plan multi-step tool chains with dependencies\n",
+ "- Ask clarifying questions when info is missing\n",
+ "- Refuse dangerous requests\n",
+ "\n",
+ "Each example is a conversation: `[{role: system, content: ...}, {role: user, content: ...}, {role: assistant, content: ...}]`\n",
+ "\n",
+ "The SFTTrainer automatically detects this `messages` format and applies the model's chat template.\n",
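+ "\n",
+ "For reference, a single record has this shape (illustrative values, not an actual row from the dataset):\n",
+ "\n",
+ "```python\n",
+ "{\n",
+ "    \"messages\": [\n",
+ "        {\"role\": \"system\", \"content\": \"You are an MCP agent with access to tools: ...\"},\n",
+ "        {\"role\": \"user\", \"content\": \"Find repos about MCP\"},\n",
+ "        {\"role\": \"assistant\", \"content\": \"<JSON-RPC tool call plan>\"}\n",
+ "    ]\n",
+ "}\n",
+ "```"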
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from datasets import load_dataset\n",
+ "\n",
+ "dataset = load_dataset(\"muhammadtlha944/mcp-agent-training-data\")\n",
+ "\n",
+ "print(f\"📊 Train examples: {len(dataset['train']):,}\")\n",
+ "print(f\"📊 Validation examples: {len(dataset['validation']):,}\")\n",
+ "print(f\"📊 Columns: {dataset['train'].column_names}\")\n",
+ "\n",
+ "# Let's peek at one example\n",
+ "print(f\"\\n📝 Sample conversation (first 2 messages):\")\n",
+ "sample = dataset['train'][0]['messages']\n",
+ "for msg in sample[:2]:\n",
+ "    role = msg['role']\n",
+ "    content = msg['content'][:200] + '...' if len(msg['content']) > 200 else msg['content']\n",
+ "    print(f\"  [{role}]: {content}\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 3: Configure LoRA\n",
+ "\n",
+ "🎓 **LoRA (Low-Rank Adaptation) — The Key Idea:**\n",
+ "\n",
+ "Instead of updating all 2 billion parameters (AdamW alone would need ~16GB of VRAM for its optimizer states: 2B params × 8 bytes), we add tiny trainable matrices to each layer.\n",
+ "\n",
+ "Think of it like this:\n",
+ "- **Full fine-tuning** = Rewriting an entire textbook (expensive, slow)\n",
+ "- **LoRA** = Adding sticky notes to key pages (cheap, fast, nearly as effective)\n",
+ "\n",
+ "**Parameters explained:**\n",
+ "- `r=16` — Rank of the adapter matrices. Like resolution: higher = more detail but more memory. 16 is a reasonable sweet spot for our ~16K examples.\n",
+ "- `lora_alpha=32` — Scaling factor (rule of thumb: 2× rank). Controls how strongly LoRA affects the output.\n",
+ "- `target_modules=\"all-linear\"` — Apply LoRA to ALL linear layers, not just attention. The \"LoRA Without Regret\" study found this can match full fine-tuning quality.\n",
+ "- `lora_dropout=0.05` — 5% dropout prevents overfitting (randomly zeros out some adapter activations during training).\n",
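+ "\n",
+ "Concretely, LoRA freezes the pretrained weight `W` and learns a low-rank update: the layer computes `W + (alpha/r) * (B @ A)`, where `A` is `r × d_in` and `B` is `d_out × r`. A minimal sketch with toy shapes (not the actual PEFT internals):\n",
+ "\n",
+ "```python\n",
+ "import torch\n",
+ "\n",
+ "d_out, d_in, r, alpha = 2048, 2048, 16, 32\n",
+ "W = torch.randn(d_out, d_in)        # frozen pretrained weight\n",
+ "A = torch.randn(r, d_in) * 0.01     # trainable adapter, r x d_in\n",
+ "B = torch.zeros(d_out, r)           # trainable adapter, d_out x r (starts at 0)\n",
+ "\n",
+ "W_eff = W + (alpha / r) * (B @ A)   # what the adapted layer computes\n",
+ "\n",
+ "full, lora = W.numel(), A.numel() + B.numel()\n",
+ "print(f\"{lora:,} of {full:,} params trainable ({100 * lora / full:.1f}%)\")  # ~1.6%\n",
+ "```"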
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from peft import LoraConfig\n",
+ "\n",
+ "peft_config = LoraConfig(\n",
+ "    r=16,                        # Rank — 16 dimensions per adapter\n",
+ "    lora_alpha=32,               # Scaling factor — 2x rank\n",
+ "    lora_dropout=0.05,           # 5% dropout for regularization\n",
+ "    bias=\"none\",                 # No bias terms — saves memory, no quality loss\n",
+ "    task_type=\"CAUSAL_LM\",       # This is a language model (predicts next token)\n",
+ "    target_modules=\"all-linear\", # Apply to ALL linear layers\n",
+ ")\n",
+ "\n",
+ "print(\"✅ LoRA config ready!\")\n",
+ "print(f\"   Rank: {peft_config.r}\")\n",
+ "print(f\"   Alpha: {peft_config.lora_alpha}\")\n",
+ "print(f\"   Targets: {peft_config.target_modules}\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 4: Configure Training\n",
+ "\n",
+ "🎓 **Hyperparameters — The Recipe:**\n",
+ "\n",
+ "Training a model is like cooking. The hyperparameters are your recipe:\n",
+ "\n",
+ "| Parameter | Value | Why |\n",
+ "|-----------|-------|-----|\n",
+ "| **Learning rate** | 2e-4 | 10× higher than full fine-tuning because LoRA updates fewer params — each update needs more impact |\n",
+ "| **Batch size** | 4 × 4 = 16 effective | Process 4 examples at once, accumulate gradients 4 times before updating weights |\n",
+ "| **Epochs** | 3 | See the data 3 times. 1 = underfitting, 10 = overfitting, 3 = sweet spot |\n",
+ "| **Warmup** | 10% of steps | Start with a tiny learning rate and ramp up gradually. Prevents early instability |\n",
+ "| **LR schedule** | Cosine | After warmup, the learning rate decays along a cosine curve from its peak to near zero. Helps convergence |\n",
+ "| **Max seq length** | 2048 tokens | Covers our examples while fitting in the T4's 16GB VRAM |\n",
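+ "\n",
+ "To see the schedule concretely, here's a small sketch using `get_cosine_schedule_with_warmup` (the scheduler `transformers` builds for `lr_scheduler_type=\"cosine\"`; step counts are illustrative):\n",
+ "\n",
+ "```python\n",
+ "import torch\n",
+ "from transformers import get_cosine_schedule_with_warmup\n",
+ "\n",
+ "opt = torch.optim.AdamW([torch.nn.Parameter(torch.zeros(1))], lr=2e-4)\n",
+ "sched = get_cosine_schedule_with_warmup(opt, num_warmup_steps=300, num_training_steps=3000)\n",
+ "\n",
+ "for step in range(3000):\n",
+ "    if step in (0, 150, 300, 1500, 2900):  # sample a few points\n",
+ "        print(f\"step {step:>4}: lr = {sched.get_last_lr()[0]:.2e}\")\n",
+ "    opt.step()\n",
+ "    sched.step()\n",
+ "```\n",
+ "\n",
+ "You should see the LR climb linearly to 2e-4 during warmup, then glide back down toward zero."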
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from trl import SFTConfig\n",
+ "\n",
+ "training_args = SFTConfig(\n",
+ "    # === Output ===\n",
+ "    output_dir=\"./mcp-agent-checkpoints\",\n",
+ "\n",
+ "    # === Core hyperparameters ===\n",
+ "    num_train_epochs=3,\n",
+ "    per_device_train_batch_size=4,   # 4 examples per GPU step\n",
+ "    gradient_accumulation_steps=4,   # Accumulate 4 steps → effective batch = 16\n",
+ "    learning_rate=2e-4,              # 10x base LR for LoRA\n",
+ "    weight_decay=0.01,               # L2 regularization\n",
+ "    lr_scheduler_type=\"cosine\",      # Cosine decay\n",
+ "    warmup_ratio=0.1,                # 10% warmup\n",
+ "    max_grad_norm=1.0,               # Gradient clipping\n",
+ "    max_seq_length=2048,             # Max tokens per example\n",
+ "\n",
+ "    # === Memory optimization (critical for T4 16GB!) ===\n",
+ "    bf16=False,                      # T4 has no native bf16 support\n",
+ "    fp16=True,                       # Use fp16 instead — the T4 handles it well\n",
+ "    gradient_checkpointing=True,     # Trade compute for memory\n",
+ "    gradient_checkpointing_kwargs={\"use_reentrant\": False},\n",
+ "\n",
+ "    # === Logging ===\n",
+ "    logging_steps=10,\n",
+ "    logging_first_step=True,\n",
+ "    logging_strategy=\"steps\",\n",
+ "\n",
+ "    # === Evaluation ===\n",
+ "    eval_strategy=\"steps\",\n",
+ "    eval_steps=200,\n",
+ "    per_device_eval_batch_size=4,\n",
+ "\n",
+ "    # === Checkpointing ===\n",
+ "    save_strategy=\"steps\",\n",
+ "    save_steps=200,\n",
+ "    save_total_limit=2,              # Keep 2 checkpoints (save disk space)\n",
+ "    load_best_model_at_end=True,\n",
+ "    metric_for_best_model=\"eval_loss\",\n",
+ "\n",
+ "    # === Push to HuggingFace Hub ===\n",
+ "    push_to_hub=True,\n",
+ "    hub_model_id=\"muhammadtlha944/MCP-Agent-1.7B\",\n",
+ "    hub_strategy=\"end\",\n",
+ "\n",
+ "    # === Misc ===\n",
+ "    seed=42,\n",
+ "    dataloader_num_workers=2,\n",
+ "    optim=\"adamw_torch\",\n",
+ ")\n",
+ "\n",
+ "# Print training stats\n",
+ "steps_per_epoch = len(dataset['train']) // (4 * 4)  # train_size // effective_batch\n",
+ "total_steps = steps_per_epoch * 3\n",
+ "print(\"✅ Training config ready!\")\n",
+ "print(\"   Effective batch size: 16\")\n",
+ "print(f\"   Steps per epoch: {steps_per_epoch}\")\n",
+ "print(f\"   Total steps: {total_steps}\")\n",
+ "print(f\"   Warmup steps: {int(total_steps * 0.1)}\")\n",
+ "print(\"   Estimated time: ~2 hours on T4\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 5: Load Tokenizer\n",
+ "\n",
+ "🎓 **Tokenizer — Translating Words to Numbers:**\n",
+ "\n",
+ "AI models don't understand text — they work with numbers. The tokenizer converts:\n",
+ "- `\"Hello world\"` → `[9707, 1879]` (encoding)\n",
+ "- `[9707, 1879]` → `\"Hello world\"` (decoding)\n",
+ "\n",
+ "Qwen3 uses a **chat template** that wraps conversations in special tokens like `<|im_start|>user` and `<|im_end|>`. The SFTTrainer applies this automatically to our `messages` data.\n",
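+ "\n",
+ "You can render the template yourself once the tokenizer below is loaded (a quick sketch):\n",
+ "\n",
+ "```python\n",
+ "messages = [{\"role\": \"user\", \"content\": \"List files in src/\"}]\n",
+ "text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)\n",
+ "print(text)  # shows the <|im_start|>/<|im_end|> markers wrapped around the message\n",
+ "```"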
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from transformers import AutoTokenizer\n",
+ "\n",
+ "tokenizer = AutoTokenizer.from_pretrained(\n",
+ "    \"Qwen/Qwen3-1.7B\",\n",
+ "    trust_remote_code=True,\n",
+ ")\n",
+ "\n",
+ "print(\"✅ Tokenizer loaded!\")\n",
+ "print(f\"   Vocab size: {tokenizer.vocab_size:,}\")\n",
+ "\n",
+ "# Demo: see how tokenization works\n",
+ "demo_text = \"Call the GitHub search tool\"\n",
+ "tokens = tokenizer.encode(demo_text)\n",
+ "print(f\"\\n📝 Demo: '{demo_text}'\")\n",
+ "print(f\"   → Token IDs: {tokens}\")\n",
+ "print(f\"   → Tokens: {[tokenizer.decode([t]) for t in tokens]}\")\n",
+ "print(f\"   → {len(tokens)} tokens\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 6: Create Trainer & Start Training! 🚀\n",
+ "\n",
+ "🎓 **SFTTrainer does everything:**\n",
+ "1. Loads the 2B parameter model onto the GPU\n",
+ "2. Injects LoRA adapters into all linear layers (~40M trainable params out of 2B)\n",
+ "3. Tokenizes all conversations using the chat template\n",
+ "4. Runs the training loop for 3 epochs\n",
+ "5. Evaluates on the validation set every 200 steps\n",
+ "6. Saves checkpoints and picks the best one\n",
+ "7. Pushes the final model to HuggingFace Hub\n",
+ "\n",
+ "**What to watch:** The `loss` value should go DOWN over time. That means the model is learning. If eval loss starts climbing after falling, that's overfitting (the model is memorizing instead of learning).\n",
+ "\n",
+ "⏱️ **This cell takes ~2 hours. Don't close the tab!**\n",
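+ "\n",
+ "If the session drops but the runtime's disk survives, you can usually resume from the last checkpoint instead of starting over (a sketch using the standard Trainer API):\n",
+ "\n",
+ "```python\n",
+ "# Picks up from the newest checkpoint in output_dir (./mcp-agent-checkpoints)\n",
+ "train_result = trainer.train(resume_from_checkpoint=True)\n",
+ "```"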
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from trl import SFTTrainer\n",
+ "\n",
+ "print(\"🔧 Loading model and applying LoRA adapters...\")\n",
+ "print(\"   (This takes 2-3 minutes — downloading 2B parameters)\\n\")\n",
+ "\n",
+ "trainer = SFTTrainer(\n",
+ "    model=\"Qwen/Qwen3-1.7B\",\n",
+ "    args=training_args,\n",
+ "    train_dataset=dataset[\"train\"],\n",
+ "    eval_dataset=dataset[\"validation\"],\n",
+ "    peft_config=peft_config,\n",
+ "    processing_class=tokenizer,\n",
+ ")\n",
+ "\n",
+ "# Print parameter stats\n",
+ "trainable = sum(p.numel() for p in trainer.model.parameters() if p.requires_grad)\n",
+ "total = sum(p.numel() for p in trainer.model.parameters())\n",
+ "print(f\"\\n📊 Model loaded!\")\n",
+ "print(f\"   Total parameters: {total:,}\")\n",
+ "print(f\"   Trainable (LoRA): {trainable:,}\")\n",
+ "print(f\"   Trainable %: {100 * trainable / total:.2f}%\")\n",
+ "print(f\"   GPU memory used: {torch.cuda.memory_allocated() / 1e9:.1f} GB\")\n",
+ "print(f\"\\n🚀 Starting training...\\n\")\n",
+ "\n",
+ "# TRAIN!\n",
+ "train_result = trainer.train()\n",
+ "\n",
+ "print(f\"\\n✅ Training complete!\")\n",
+ "print(f\"   Final loss: {train_result.metrics.get('train_loss', 'N/A')}\")\n",
+ "print(f\"   Runtime: {train_result.metrics.get('train_runtime', 0)/3600:.1f} hours\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 7: Evaluate & Push to Hub\n",
+ "\n",
+ "🎓 **Evaluation:** We run the model on the validation set (826 examples it has NEVER seen during training) to measure real performance. Eval loss close to train loss = good generalization; eval loss much higher than train loss = overfitting.\n",
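+ "\n",
+ "A handy way to read a cross-entropy loss: its exponential is the perplexity, roughly how many tokens the model is 'choosing between' at each step. For example:\n",
+ "\n",
+ "```python\n",
+ "import math\n",
+ "\n",
+ "eval_loss = 0.85            # illustrative value; use the number printed below\n",
+ "print(math.exp(eval_loss))  # ~2.34, i.e. about 2.3 plausible next tokens\n",
+ "```"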
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Final evaluation\n",
+ "print(\"📊 Running final evaluation...\")\n",
+ "eval_metrics = trainer.evaluate()\n",
+ "print(f\"   Eval loss: {eval_metrics['eval_loss']:.4f}\")\n",
+ "\n",
+ "# Save metrics\n",
+ "trainer.log_metrics(\"train\", train_result.metrics)\n",
+ "trainer.save_metrics(\"train\", train_result.metrics)\n",
+ "trainer.log_metrics(\"eval\", eval_metrics)\n",
+ "trainer.save_metrics(\"eval\", eval_metrics)\n",
+ "\n",
+ "# Push to HuggingFace Hub\n",
+ "print(\"\\n🚀 Pushing model to HuggingFace Hub...\")\n",
+ "trainer.push_to_hub(\n",
+ "    commit_message=\"MCP-Agent-1.7B: LoRA fine-tuned Qwen3-1.7B for MCP tool calling\",\n",
+ "    tags=[\"mcp\", \"tool-calling\", \"function-calling\", \"agent\", \"qwen3\", \"lora\"],\n",
+ ")\n",
+ "\n",
+ "print(\"\\n\" + \"=\"*60)\n",
+ "print(\"🎉 MCP-Agent-1.7B is LIVE!\")\n",
+ "print(\"=\"*60)\n",
+ "print(\"📦 Model: https://huggingface.co/muhammadtlha944/MCP-Agent-1.7B\")\n",
+ "print(f\"📊 Train loss: {train_result.metrics.get('train_loss', float('nan')):.4f}\")\n",
+ "print(f\"📊 Eval loss: {eval_metrics['eval_loss']:.4f}\")\n",
+ "print(f\"⏱️ Training time: {train_result.metrics.get('train_runtime', 0)/3600:.1f} hours\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 8: Test Your Model! 🧪\n",
+ "\n",
+ "Let's see MCP-Agent-1.7B in action — give it a request and watch it plan tool calls!"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Quick test — see the model generate MCP tool calls\n",
+ "from transformers import pipeline\n",
+ "\n",
+ "print(\"🧪 Testing MCP-Agent-1.7B...\\n\")\n",
+ "\n",
+ "pipe = pipeline(\n",
+ "    \"text-generation\",\n",
+ "    model=trainer.model,\n",
+ "    tokenizer=tokenizer,\n",
+ "    max_new_tokens=512,\n",
+ "    do_sample=True,\n",
+ "    temperature=0.7,\n",
+ ")\n",
+ "\n",
+ "test_prompts = [\n",
+ "    # Test 1: Simple tool call\n",
+ "    {\n",
+ "        \"messages\": [\n",
+ "            {\"role\": \"system\", \"content\": \"You are an MCP agent with access to tools: github_search, read_file, shell_exec. Use JSON-RPC format for tool calls.\"},\n",
+ "            {\"role\": \"user\", \"content\": \"Find all Python files in the src/ directory that import pandas\"}\n",
+ "        ]\n",
+ "    },\n",
+ "    # Test 2: Multi-step planning\n",
+ "    {\n",
+ "        \"messages\": [\n",
+ "            {\"role\": \"system\", \"content\": \"You are an MCP agent with access to tools: github_search, read_file, shell_exec, sqlite_query. Plan multi-step tool chains when needed.\"},\n",
+ "            {\"role\": \"user\", \"content\": \"Clone the repo https://github.com/example/app, find all TODO comments, and create a summary report\"}\n",
+ "        ]\n",
+ "    },\n",
+ "    # Test 3: Clarification (should ask for missing info)\n",
+ "    {\n",
+ "        \"messages\": [\n",
+ "            {\"role\": \"system\", \"content\": \"You are an MCP agent. Ask for clarification when the request is ambiguous or missing critical information.\"},\n",
+ "            {\"role\": \"user\", \"content\": \"Delete the database\"}\n",
+ "        ]\n",
+ "    },\n",
+ "]\n",
+ "\n",
+ "for i, prompt in enumerate(test_prompts, 1):\n",
+ "    print(f\"{'='*60}\")\n",
+ "    print(f\"TEST {i}: {prompt['messages'][-1]['content']}\")\n",
+ "    print(f\"{'='*60}\")\n",
+ "    result = pipe(prompt['messages'])\n",
+ "    assistant_msg = result[0]['generated_text'][-1]['content']\n",
+ "    print(f\"\\n🤖 MCP-Agent Response:\\n{assistant_msg}\\n\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## 🎉 Congratulations!\n",
+ "\n",
+ "You just trained an AI model! Here's what you accomplished:\n",
+ "\n",
+ "- ✅ Fine-tuned a 2 billion parameter model using LoRA\n",
+ "- ✅ Trained on 16,520 MCP tool-calling examples\n",
+ "- ✅ Published your model to HuggingFace Hub\n",
+ "- ✅ Tested it on real MCP scenarios\n",
+ "\n",
+ "**Your model:** [muhammadtlha944/MCP-Agent-1.7B](https://huggingface.co/muhammadtlha944/MCP-Agent-1.7B)\n",
+ "\n",
+ "**Next steps:**\n",
+ "1. Try more test prompts above\n",
+ "2. Share on X/Twitter with #MCP-Agent\n",
+ "3. Build a Gradio demo for interactive testing\n",
+ "\n",
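+ "**Using your model later:** a sketch of reloading the adapter on top of the base model from any machine (assumes the push above succeeded):\n",
+ "\n",
+ "```python\n",
+ "from transformers import AutoModelForCausalLM, AutoTokenizer\n",
+ "from peft import PeftModel\n",
+ "\n",
+ "base = AutoModelForCausalLM.from_pretrained(\"Qwen/Qwen3-1.7B\", device_map=\"auto\")\n",
+ "model = PeftModel.from_pretrained(base, \"muhammadtlha944/MCP-Agent-1.7B\")\n",
+ "tokenizer = AutoTokenizer.from_pretrained(\"Qwen/Qwen3-1.7B\")\n",
+ "```\n",
+ "\n",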
+ "---\n",
+ "*Built by Muhammad Talha — Learning ML by building real projects*"
+ ]
+ }
+ ]
+ }