---
license: apache-2.0
base_model: distilbert-base-uncased
tags:
- text-classification
- intent-detection
- agent-routing
- mcp
- ai-agents
- distilbert
- tool-use
datasets:
- custom
language:
- en
metrics:
- accuracy
- f1
pipeline_tag: text-classification
library_name: transformers
---

# AgentIntentRouter

A fast, lightweight intent classifier for AI agent and MCP tool routing. Given a user message, it predicts which tool or capability the agent should invoke — in under 50ms on CPU. Built on DistilBERT (66M params), fine-tuned on ~11K diverse examples across 8 intent categories.

## Why This Exists

Every agent framework (LangChain, LangGraph, CrewAI, AutoGen) wastes an entire LLM call just to figure out *what the user wants*. That's 1-3 seconds and ~$0.01 per request — just for routing.

AgentIntentRouter replaces that first LLM call with a 66M-parameter classifier that runs in **~10ms on CPU** and **~2ms on GPU**. Use it as the first step in your agent pipeline to instantly route to the right tool.

## Intent Categories

| Label | Description | Example |
|-------|-------------|---------|
| `code_generation` | User wants code written, debugged, or refactored | "Write a Python function to parse CSV" |
| `web_search` | User wants to find information online | "What's the latest news on AI regulation" |
| `math_calculation` | User needs computation or conversion | "Calculate 15% of 4500" |
| `file_operation` | User wants to read, write, or manage files | "Read the config.json file" |
| `api_call` | User wants to interact with an external API | "Send a Slack message to the team" |
| `creative_writing` | User wants text composed or drafted | "Write a professional email to the client" |
| `data_analysis` | User wants data interpreted or compared | "Compare React vs Vue performance" |
| `general_chat` | Casual conversation, greetings, feedback | "Hey, how are you?" |
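When a message could plausibly fit two of the categories above, it helps to look at the full score distribution rather than only the top label. A sketch using the standard `transformers` pipeline option `top_k=None`, which returns a score for every label; the example message is my own, and exact scores will differ on your machine:

```python
from transformers import pipeline

# top_k=None makes the pipeline return scores for all labels,
# not just the single best one
router = pipeline(
    "text-classification",
    model="tripathyShaswata/AgentIntentRouter",
    top_k=None,
)

# A deliberately ambiguous message: api_call vs. code_generation
scores = router(["help me with the API code"])[0]
for entry in sorted(scores, key=lambda e: e["score"], reverse=True):
    print(f"{entry['label']:>18}: {entry['score']:.3f}")
```

Inspecting near-ties like this is also a cheap way to build intuition for where your confidence threshold should sit.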
## Quick Start

```python
from transformers import pipeline

router = pipeline("text-classification", model="tripathyShaswata/AgentIntentRouter")

# Single prediction
result = router("Write a Python function to sort a list")
print(result)
# [{'label': 'code_generation', 'score': 0.98}]

# Batch prediction
messages = [
    "Search for the latest AI papers",
    "What's 25% of 1200?",
    "Draft an email to my boss about the deadline",
    "Hello!",
]
results = router(messages)
for msg, res in zip(messages, results):
    print(f"{res['label']:>20} ({res['score']:.2f}) — {msg}")
```

## Use as Agent Router

```python
from transformers import pipeline

router = pipeline("text-classification", model="tripathyShaswata/AgentIntentRouter")

# Map each intent label to a handler you implement in your agent
TOOL_MAP = {
    "code_generation": handle_code_request,
    "web_search": handle_search,
    "math_calculation": handle_calculation,
    "file_operation": handle_file_ops,
    "api_call": handle_api_call,
    "creative_writing": handle_writing,
    "data_analysis": handle_analysis,
    "general_chat": handle_chat,
}

def route(user_message: str):
    intent = router(user_message)[0]
    if intent["score"] < 0.5:
        # Low confidence — fall back to an LLM for routing
        return fallback_llm_route(user_message)
    handler = TOOL_MAP[intent["label"]]
    return handler(user_message)
```

## Performance

- **Inference speed:** ~10ms on CPU, ~2ms on GPU
- **Model size:** ~260MB (DistilBERT-base)
- **Accuracy:** 100% on the held-out test set (synthetic; see note below)

### Evaluation Results

*Results on held-out test set (1,124 examples):*

| Metric | Score |
|--------|-------|
| Accuracy | 1.000 |
| F1 (weighted) | 1.000 |

*Per-class performance:*

| Intent | Precision | Recall | F1 | Support |
|--------|-----------|--------|-----|---------|
| code_generation | 1.000 | 1.000 | 1.000 | 130 |
| web_search | 1.000 | 1.000 | 1.000 | 151 |
| math_calculation | 1.000 | 1.000 | 1.000 | 153 |
| file_operation | 1.000 | 1.000 | 1.000 | 154 |
| api_call | 1.000 | 1.000 | 1.000 | 133 |
| creative_writing | 1.000 | 1.000 | 1.000 | 160 |
| data_analysis | 1.000 | 1.000 | 1.000 | 168 |
| general_chat | 1.000 | 1.000 | 1.000 | 75 |

> **Note:** These results are on synthetic test data drawn from the same distribution as the training data. Real-world performance will vary. Use the confidence score threshold to handle ambiguous inputs gracefully.

## Training Details

- **Base model:** distilbert-base-uncased
- **Training data:** 8,987 examples (synthetic, template-generated with natural language variation)
- **Validation:** 1,123 examples
- **Test:** 1,124 examples
- **Epochs:** 3 (with early stopping, patience=2)
- **Learning rate:** 2e-5
- **Batch size:** 32
- **Max sequence length:** 128
- **Training time:** ~100 seconds on NVIDIA RTX 4070
- **Loss:** 0.0015 (training) / 0.0017 (validation)

## Limitations

- Trained on English text only
- Template-generated training data may not cover all edge cases
- Ambiguous messages (e.g., "help me with the API code") may get lower confidence scores — use the confidence threshold to fall back to an LLM
- Not designed for multi-intent messages (e.g., "search for X and write code for Y")

## License

Apache 2.0 — use it however you want, commercial use included.

## Citation

If you use this model, a star on the repo is appreciated!
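## Appendix: A Sketch for Near-Ties

One pragmatic mitigation for the multi-intent limitation noted above, without a second model: when the top two scores are nearly tied, dispatch to both handlers instead of one. The `detect_intents` helper, the margin and floor values, and the hand-written scores below are all illustrative assumptions, not part of this repo; the score format mirrors the pipeline's list-of-dicts output.

```python
def detect_intents(scores, margin=0.15, floor=0.3):
    """Given pipeline-style [{'label': ..., 'score': ...}] entries,
    return the intent labels worth dispatching to."""
    ranked = sorted(scores, key=lambda s: s["score"], reverse=True)
    top = ranked[0]
    if top["score"] < floor:
        return []  # low confidence: caller falls back to an LLM router
    picked = [top["label"]]
    if len(ranked) > 1 and top["score"] - ranked[1]["score"] < margin:
        picked.append(ranked[1]["label"])  # near-tie: likely multi-intent
    return picked

# Made-up scores for "search for X and write code for Y"
fake = [
    {"label": "web_search", "score": 0.48},
    {"label": "code_generation", "score": 0.44},
    {"label": "general_chat", "score": 0.08},
]
print(detect_intents(fake))  # → ['web_search', 'code_generation']
```

Tune `margin` and `floor` on your own traffic; the right values depend on how costly a wrong dispatch is versus an extra LLM fallback call.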