AgentIntentRouter

A fast, lightweight intent classifier for AI agent and MCP tool routing. Given a user message, it predicts which tool or capability the agent should invoke, in under 50 ms on CPU.

Built on DistilBERT (66M params), fine-tuned on ~11K diverse examples across 8 intent categories.

Why This Exists

Every agent framework (LangChain, LangGraph, CrewAI, AutoGen) wastes an entire LLM call just to figure out what the user wants. That's 1-3 seconds and ~$0.01 per request, just for routing.

AgentIntentRouter replaces that first LLM call with a 66M-parameter classifier that runs in ~10 ms on CPU and ~2 ms on GPU. Use it as the first step in your agent pipeline to instantly route to the right tool.

Intent Categories

| Label | Description | Example |
|---|---|---|
| `code_generation` | User wants code written, debugged, or refactored | "Write a Python function to parse CSV" |
| `web_search` | User wants to find information online | "What's the latest news on AI regulation?" |
| `math_calculation` | User needs computation or conversion | "Calculate 15% of 4500" |
| `file_operation` | User wants to read, write, or manage files | "Read the config.json file" |
| `api_call` | User wants to interact with an external API | "Send a Slack message to the team" |
| `creative_writing` | User wants text composed or drafted | "Write a professional email to the client" |
| `data_analysis` | User wants data interpreted or compared | "Compare React vs Vue performance" |
| `general_chat` | Casual conversation, greetings, feedback | "Hey, how are you?" |

Quick Start

```python
from transformers import pipeline

router = pipeline("text-classification", model="tripathyShaswata/AgentIntentRouter")

# Single prediction
result = router("Write a Python function to sort a list")
print(result)
# [{'label': 'code_generation', 'score': 0.98}]

# Batch prediction
messages = [
    "Search for the latest AI papers",
    "What's 25% of 1200?",
    "Draft an email to my boss about the deadline",
    "Hello!",
]
results = router(messages)
for msg, res in zip(messages, results):
    print(f"  {res['label']:>20} ({res['score']:.2f}) - {msg}")
```

Use as Agent Router

```python
from transformers import pipeline

router = pipeline("text-classification", model="tripathyShaswata/AgentIntentRouter")

# Map each intent label to a handler. The handle_* functions and
# fallback_llm_route are your own application code.
TOOL_MAP = {
    "code_generation": handle_code_request,
    "web_search": handle_search,
    "math_calculation": handle_calculation,
    "file_operation": handle_file_ops,
    "api_call": handle_api_call,
    "creative_writing": handle_writing,
    "data_analysis": handle_analysis,
    "general_chat": handle_chat,
}

def route(user_message: str):
    intent = router(user_message)[0]

    if intent["score"] < 0.5:
        # Low confidence - fall back to LLM for routing
        return fallback_llm_route(user_message)

    handler = TOOL_MAP[intent["label"]]
    return handler(user_message)
```
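The threshold pattern above can be exercised without downloading the model by injecting a stub classifier. This is a sketch with hypothetical names (`make_router`, `stub_classify`); the real classifier is the `pipeline` call shown above.

```python
from typing import Callable, Dict

def make_router(classify, tool_map, fallback, threshold: float = 0.5):
    """Build a route() function that dispatches on predicted intent."""
    def route(message: str) -> str:
        pred = classify(message)
        if pred["score"] < threshold:
            return fallback(message)  # low confidence: defer to an LLM
        return tool_map[pred["label"]](message)
    return route

# Stub standing in for the real pipeline (fixed outputs for the demo)
def stub_classify(message: str) -> Dict[str, object]:
    if "sort" in message:
        return {"label": "code_generation", "score": 0.98}
    return {"label": "general_chat", "score": 0.30}

route = make_router(
    classify=stub_classify,
    tool_map={"code_generation": lambda m: "code_tool",
              "general_chat": lambda m: "chat_tool"},
    fallback=lambda m: "llm_fallback",
)
print(route("sort a list"))  # code_tool (score 0.98 >= 0.5)
print(route("hmm"))          # llm_fallback (score 0.30 < 0.5)
```

Injecting the classifier this way also makes the routing logic unit-testable independently of model downloads.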

Performance

  • Inference speed: ~10ms on CPU, ~2ms on GPU
  • Model size: ~260MB (DistilBERT-base, FP32 weights)
  • Accuracy: 100% on the held-out synthetic test set (see the note under Evaluation Results)
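To verify the latency numbers on your own hardware, a minimal timing harness like the following works; `time_calls` is an illustrative helper, and the lambda stands in for the real `router`.

```python
import time

def time_calls(fn, inputs, warmup=2):
    """Average wall-clock latency per call, in milliseconds."""
    for msg in inputs[:warmup]:
        fn(msg)  # warm up lazy initialization / caches
    start = time.perf_counter()
    for msg in inputs:
        fn(msg)
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / len(inputs)

# Swap the stub for the real pipeline: ms = time_calls(router, messages)
ms = time_calls(lambda m: m.lower(), ["Write a Python function"] * 100)
print(f"{ms:.3f} ms/call")
```

Averaging over many calls (with a warmup) matters because the first pipeline call pays one-time model-loading costs.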

Evaluation Results

Results on held-out test set (1,124 examples):

| Metric | Score |
|---|---|
| Accuracy | 1.000 |
| F1 (weighted) | 1.000 |

Per-class performance:

| Intent | Precision | Recall | F1 | Support |
|---|---|---|---|---|
| `code_generation` | 1.000 | 1.000 | 1.000 | 130 |
| `web_search` | 1.000 | 1.000 | 1.000 | 151 |
| `math_calculation` | 1.000 | 1.000 | 1.000 | 153 |
| `file_operation` | 1.000 | 1.000 | 1.000 | 154 |
| `api_call` | 1.000 | 1.000 | 1.000 | 133 |
| `creative_writing` | 1.000 | 1.000 | 1.000 | 160 |
| `data_analysis` | 1.000 | 1.000 | 1.000 | 168 |
| `general_chat` | 1.000 | 1.000 | 1.000 | 75 |

Note: These results are on synthetic test data from the same distribution as training. Real-world performance will vary. Use the confidence score threshold to handle ambiguous inputs gracefully.

Training Details

  • Base model: distilbert-base-uncased
  • Training data: 8,987 examples (synthetic, template-generated with natural language variation)
  • Validation: 1,123 examples
  • Test: 1,124 examples
  • Epochs: 3 (with early stopping, patience=2)
  • Learning rate: 2e-5
  • Batch size: 32
  • Max sequence length: 128
  • Training time: ~100 seconds on NVIDIA RTX 4070
  • Loss: 0.0015 (training) / 0.0017 (validation)

Limitations

  • Trained on English text only
  • Template-generated training data may not cover all edge cases
  • Ambiguous messages (e.g., "help me with the API code") may get lower confidence scores; use the confidence threshold to fall back to an LLM
  • Not designed for multi-intent messages (e.g., "search for X and write code for Y")
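For ambiguous inputs, looking at the margin between the top two scores is a stricter guard than the top score alone. This is a sketch: `is_ambiguous` and the 0.2 margin are illustrative, and it assumes you can get the full label distribution (e.g., via `top_k=None` on recent `transformers` pipelines) as a label-to-probability dict.

```python
def is_ambiguous(scores, margin=0.2):
    """Flag a prediction as ambiguous when the top two labels are close.

    `scores` maps each intent label to its probability.
    """
    top = sorted(scores.values(), reverse=True)
    return (top[0] - top[1]) < margin

# Clear-cut distribution: one label dominates
assert not is_ambiguous({"code_generation": 0.95, "api_call": 0.03, "general_chat": 0.02})
# Ambiguous distribution: two labels nearly tied, so defer to an LLM
assert is_ambiguous({"code_generation": 0.45, "api_call": 0.40, "general_chat": 0.15})
```

A message like "help me with the API code" plausibly splits mass between `code_generation` and `api_call`, which the margin check catches even when the top score clears the 0.5 threshold.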

License

Apache 2.0. Use it however you want, commercial use included.

Citation

If you use this model, a star on the repo is appreciated!
