# AgentIntentRouter
A fast, lightweight intent classifier for AI agent and MCP tool routing. Given a user message, it predicts which tool or capability the agent should invoke, in under 50ms on CPU.

Built on DistilBERT (66M parameters), fine-tuned on 11K+ diverse examples across 8 intent categories.
## Why This Exists
Every agent framework (LangChain, LangGraph, CrewAI, AutoGen) spends an entire LLM call just to figure out what the user wants. That's 1-3 seconds and ~$0.01 per request, just for routing.

AgentIntentRouter replaces that first LLM call with a 66M-parameter classifier that runs in ~10ms on CPU and ~2ms on GPU. Use it as the first step in your agent pipeline to instantly route to the right tool.
## Intent Categories
| Label | Description | Example |
|---|---|---|
| `code_generation` | User wants code written, debugged, or refactored | "Write a Python function to parse CSV" |
| `web_search` | User wants to find information online | "What's the latest news on AI regulation" |
| `math_calculation` | User needs computation or conversion | "Calculate 15% of 4500" |
| `file_operation` | User wants to read, write, or manage files | "Read the config.json file" |
| `api_call` | User wants to interact with an external API | "Send a Slack message to the team" |
| `creative_writing` | User wants text composed or drafted | "Write a professional email to the client" |
| `data_analysis` | User wants data interpreted or compared | "Compare React vs Vue performance" |
| `general_chat` | Casual conversation, greetings, feedback | "Hey, how are you?" |
## Quick Start
```python
from transformers import pipeline

router = pipeline("text-classification", model="tripathyShaswata/AgentIntentRouter")

# Single prediction
result = router("Write a Python function to sort a list")
print(result)
# [{'label': 'code_generation', 'score': 0.98}]

# Batch prediction
messages = [
    "Search for the latest AI papers",
    "What's 25% of 1200?",
    "Draft an email to my boss about the deadline",
    "Hello!",
]
results = router(messages)
for msg, res in zip(messages, results):
    print(f"{res['label']:>20} ({res['score']:.2f}) → {msg}")
```
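By default the pipeline returns only the top label. To inspect the full score distribution, pass `top_k=None` to the call, which is useful for confidence thresholding. The helper below operates on that output format; the scores shown are invented for illustration, not from a real run:

```python
# Illustrative shape of router("Fix this API call", top_k=None).
# These scores are made up for the example, not real model output.
all_scores = [
    {"label": "code_generation", "score": 0.55},
    {"label": "api_call", "score": 0.40},
    {"label": "general_chat", "score": 0.05},
]

def confident_label(scores, threshold=0.5):
    """Return the top label if it clears the threshold, else None."""
    top = max(scores, key=lambda s: s["score"])
    return top["label"] if top["score"] >= threshold else None

print(confident_label(all_scores))       # code_generation
print(confident_label(all_scores, 0.8))  # None
```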
## Use as Agent Router
```python
from transformers import pipeline

router = pipeline("text-classification", model="tripathyShaswata/AgentIntentRouter")

# Map each intent label to your own tool handler (the handle_* functions
# below are yours to implement).
TOOL_MAP = {
    "code_generation": handle_code_request,
    "web_search": handle_search,
    "math_calculation": handle_calculation,
    "file_operation": handle_file_ops,
    "api_call": handle_api_call,
    "creative_writing": handle_writing,
    "data_analysis": handle_analysis,
    "general_chat": handle_chat,
}

def route(user_message: str):
    intent = router(user_message)[0]
    if intent["score"] < 0.5:
        # Low confidence → fall back to LLM for routing
        return fallback_llm_route(user_message)
    handler = TOOL_MAP[intent["label"]]
    return handler(user_message)
```
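Wired up with stub handlers, the dispatch logic above behaves like this. This is a self-contained sketch that runs without downloading the model: `fake_router` stands in for the real pipeline, and the handlers and `fallback_llm_route` are placeholders for whatever tools and LLM-based fallback you already use.

```python
def fake_router(message):
    # Stubbed classifier: confident for code requests, unsure otherwise.
    if "function" in message:
        return [{"label": "code_generation", "score": 0.97}]
    return [{"label": "general_chat", "score": 0.31}]

TOOL_MAP = {
    "code_generation": lambda m: f"code handler got: {m}",
    "general_chat": lambda m: f"chat handler got: {m}",
}

def fallback_llm_route(message):
    # Placeholder for an LLM-based router.
    return f"LLM fallback got: {message}"

def route(user_message, classifier=fake_router, threshold=0.5):
    intent = classifier(user_message)[0]
    if intent["score"] < threshold:
        return fallback_llm_route(user_message)
    return TOOL_MAP[intent["label"]](user_message)

print(route("Write a function to sort a list"))  # code handler got: ...
print(route("hmm"))                              # LLM fallback got: hmm
```

Passing the classifier as a parameter also makes the routing logic easy to unit-test without network access.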
## Performance
- Inference speed: ~10ms on CPU, ~2ms on GPU
- Model size: ~260MB (DistilBERT-base)
- Accuracy: 100% on the held-out test set (synthetic; see note under Evaluation Results)
## Evaluation Results
Results on held-out test set (1,124 examples):
| Metric | Score |
|---|---|
| Accuracy | 1.000 |
| F1 (weighted) | 1.000 |
Per-class performance:
| Intent | Precision | Recall | F1 | Support |
|---|---|---|---|---|
| code_generation | 1.000 | 1.000 | 1.000 | 130 |
| web_search | 1.000 | 1.000 | 1.000 | 151 |
| math_calculation | 1.000 | 1.000 | 1.000 | 153 |
| file_operation | 1.000 | 1.000 | 1.000 | 154 |
| api_call | 1.000 | 1.000 | 1.000 | 133 |
| creative_writing | 1.000 | 1.000 | 1.000 | 160 |
| data_analysis | 1.000 | 1.000 | 1.000 | 168 |
| general_chat | 1.000 | 1.000 | 1.000 | 75 |
Note: These results are on synthetic test data from the same distribution as training. Real-world performance will vary. Use the confidence score threshold to handle ambiguous inputs gracefully.
## Training Details
- Base model: `distilbert-base-uncased`
- Training data: 8,987 examples (synthetic, template-generated with natural language variation)
- Validation: 1,123 examples
- Test: 1,124 examples
- Epochs: 3 (with early stopping, patience=2)
- Learning rate: 2e-5
- Batch size: 32
- Max sequence length: 128
- Training time: ~100 seconds on NVIDIA RTX 4070
- Loss: 0.0015 (training) / 0.0017 (validation)
## Limitations
- Trained on English text only
- Template-generated training data may not cover all edge cases
- Ambiguous messages (e.g., "help me with the API code") may get lower confidence scores; use the confidence threshold to fall back to an LLM
- Not designed for multi-intent messages (e.g., "search for X and write code for Y")
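One hedged workaround for multi-intent messages: request the full score distribution (`top_k=None`) and treat every label above a secondary threshold as a candidate intent. This is a sketch of the idea, not a supported feature of the model, and the scores below are invented for illustration:

```python
def candidate_intents(scores, threshold=0.3):
    """Return all labels whose score clears the threshold, highest first."""
    hits = [s for s in scores if s["score"] >= threshold]
    return [s["label"] for s in sorted(hits, key=lambda s: -s["score"])]

# Illustrative distribution for "search for X and write code for Y";
# these scores are made up, not real model output.
scores = [
    {"label": "web_search", "score": 0.48},
    {"label": "code_generation", "score": 0.44},
    {"label": "general_chat", "score": 0.08},
]
print(candidate_intents(scores))  # ['web_search', 'code_generation']
```

Expect this to be unreliable: the model was trained single-label, so multi-intent probability mass is split unpredictably.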
## License
Apache 2.0. Use it however you want, commercial use included.
## Citation
If you use this model, a star on the repo is appreciated!