AgentIntentRouter

A fast, lightweight intent classifier for AI agent and MCP tool routing. Given a user message, it predicts which tool or capability the agent should invoke, in under 50 ms on CPU.

Built on DistilBERT (66M params), fine-tuned on ~11K diverse examples across 8 intent categories.

Why This Exists

Every agent framework (LangChain, LangGraph, CrewAI, AutoGen) wastes an entire LLM call just to figure out what the user wants. That's 1-3 seconds and ~$0.01 per request, just for routing.

AgentIntentRouter replaces that first LLM call with a 66M-parameter classifier that runs in ~10 ms on CPU and ~2 ms on GPU. Use it as the first step in your agent pipeline to instantly route to the right tool.

Intent Categories

| Label | Description | Example |
|---|---|---|
| `code_generation` | User wants code written, debugged, or refactored | "Write a Python function to parse CSV" |
| `web_search` | User wants to find information online | "What's the latest news on AI regulation?" |
| `math_calculation` | User needs computation or conversion | "Calculate 15% of 4500" |
| `file_operation` | User wants to read, write, or manage files | "Read the config.json file" |
| `api_call` | User wants to interact with an external API | "Send a Slack message to the team" |
| `creative_writing` | User wants text composed or drafted | "Write a professional email to the client" |
| `data_analysis` | User wants data interpreted or compared | "Compare React vs Vue performance" |
| `general_chat` | Casual conversation, greetings, feedback | "Hey, how are you?" |

Quick Start

```python
from transformers import pipeline

router = pipeline("text-classification", model="tripathyShaswata/AgentIntentRouter")

# Single prediction
result = router("Write a Python function to sort a list")
print(result)
# [{'label': 'code_generation', 'score': 0.98}]

# Batch prediction
messages = [
    "Search for the latest AI papers",
    "What's 25% of 1200?",
    "Draft an email to my boss about the deadline",
    "Hello!",
]
results = router(messages)
for msg, res in zip(messages, results):
    print(f"  {res['label']:>20} ({res['score']:.2f}) - {msg}")
```

Use as Agent Router

```python
from transformers import pipeline

router = pipeline("text-classification", model="tripathyShaswata/AgentIntentRouter")

# Map each intent label to a handler. The handle_* functions and
# fallback_llm_route are your own application code.
TOOL_MAP = {
    "code_generation": handle_code_request,
    "web_search": handle_search,
    "math_calculation": handle_calculation,
    "file_operation": handle_file_ops,
    "api_call": handle_api_call,
    "creative_writing": handle_writing,
    "data_analysis": handle_analysis,
    "general_chat": handle_chat,
}

def route(user_message: str):
    intent = router(user_message)[0]

    if intent["score"] < 0.5:
        # Low confidence - fall back to LLM for routing
        return fallback_llm_route(user_message)

    handler = TOOL_MAP[intent["label"]]
    return handler(user_message)
```
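The threshold pattern above can be exercised without downloading the model by injecting a stub classifier. This is a sketch with hypothetical names (`make_router`, `stub_classify`); the real classifier is the `pipeline` call shown above.

```python
from typing import Callable, Dict

def make_router(classify, tool_map, fallback, threshold: float = 0.5):
    """Build a route() function that dispatches on predicted intent."""
    def route(message: str) -> str:
        pred = classify(message)
        if pred["score"] < threshold:
            return fallback(message)  # low confidence: defer to an LLM
        return tool_map[pred["label"]](message)
    return route

# Stub standing in for the real pipeline (fixed outputs for the demo)
def stub_classify(message: str) -> Dict[str, object]:
    if "sort" in message:
        return {"label": "code_generation", "score": 0.98}
    return {"label": "general_chat", "score": 0.30}

route = make_router(
    classify=stub_classify,
    tool_map={"code_generation": lambda m: "code_tool",
              "general_chat": lambda m: "chat_tool"},
    fallback=lambda m: "llm_fallback",
)
print(route("sort a list"))  # code_tool (score 0.98 >= 0.5)
print(route("hmm"))          # llm_fallback (score 0.30 < 0.5)
```

Injecting the classifier this way also makes the routing logic unit-testable independently of model downloads.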

Performance

  • Inference speed: ~10ms on CPU, ~2ms on GPU
  • Model size: ~260MB (DistilBERT-base, FP32 weights)
  • Accuracy: 100% on the held-out synthetic test set (see the note under Evaluation Results)
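To verify the latency numbers on your own hardware, a minimal timing harness like the following works; `time_calls` is an illustrative helper, and the lambda stands in for the real `router`.

```python
import time

def time_calls(fn, inputs, warmup=2):
    """Average wall-clock latency per call, in milliseconds."""
    for msg in inputs[:warmup]:
        fn(msg)  # warm up lazy initialization / caches
    start = time.perf_counter()
    for msg in inputs:
        fn(msg)
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / len(inputs)

# Swap the stub for the real pipeline: ms = time_calls(router, messages)
ms = time_calls(lambda m: m.lower(), ["Write a Python function"] * 100)
print(f"{ms:.3f} ms/call")
```

Averaging over many calls (with a warmup) matters because the first pipeline call pays one-time model-loading costs.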

Evaluation Results

Results on held-out test set (1,124 examples):

| Metric | Score |
|---|---|
| Accuracy | 1.000 |
| F1 (weighted) | 1.000 |

Per-class performance:

| Intent | Precision | Recall | F1 | Support |
|---|---|---|---|---|
| `code_generation` | 1.000 | 1.000 | 1.000 | 130 |
| `web_search` | 1.000 | 1.000 | 1.000 | 151 |
| `math_calculation` | 1.000 | 1.000 | 1.000 | 153 |
| `file_operation` | 1.000 | 1.000 | 1.000 | 154 |
| `api_call` | 1.000 | 1.000 | 1.000 | 133 |
| `creative_writing` | 1.000 | 1.000 | 1.000 | 160 |
| `data_analysis` | 1.000 | 1.000 | 1.000 | 168 |
| `general_chat` | 1.000 | 1.000 | 1.000 | 75 |

Note: These results are on synthetic test data from the same distribution as training. Real-world performance will vary. Use the confidence score threshold to handle ambiguous inputs gracefully.

Training Details

  • Base model: distilbert-base-uncased
  • Training data: 8,987 examples (synthetic, template-generated with natural language variation)
  • Validation: 1,123 examples
  • Test: 1,124 examples
  • Epochs: 3 (with early stopping, patience=2)
  • Learning rate: 2e-5
  • Batch size: 32
  • Max sequence length: 128
  • Training time: ~100 seconds on NVIDIA RTX 4070
  • Loss: 0.0015 (training) / 0.0017 (validation)

Limitations

  • Trained on English text only
  • Template-generated training data may not cover all edge cases
  • Ambiguous messages (e.g., "help me with the API code") may get lower confidence scores; use the confidence threshold to fall back to an LLM
  • Not designed for multi-intent messages (e.g., "search for X and write code for Y")
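For ambiguous inputs, looking at the margin between the top two scores is a stricter guard than the top score alone. This is a sketch: `is_ambiguous` and the 0.2 margin are illustrative, and it assumes you can get the full label distribution (e.g., via `top_k=None` on recent `transformers` pipelines) as a label-to-probability dict.

```python
def is_ambiguous(scores, margin=0.2):
    """Flag a prediction as ambiguous when the top two labels are close.

    `scores` maps each intent label to its probability.
    """
    top = sorted(scores.values(), reverse=True)
    return (top[0] - top[1]) < margin

# Clear-cut distribution: one label dominates
assert not is_ambiguous({"code_generation": 0.95, "api_call": 0.03, "general_chat": 0.02})
# Ambiguous distribution: two labels nearly tied, so defer to an LLM
assert is_ambiguous({"code_generation": 0.45, "api_call": 0.40, "general_chat": 0.15})
```

A message like "help me with the API code" plausibly splits mass between `code_generation` and `api_call`, which the margin check catches even when the top score clears the 0.5 threshold.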

License

Apache 2.0. Use it however you want, commercial use included.

Citation

If you use this model, a star on the repo is appreciated!
