---
license: apache-2.0
base_model: distilbert-base-uncased
tags:
- text-classification
- intent-detection
- agent-routing
- mcp
- ai-agents
- distilbert
- tool-use
datasets:
- custom
language:
- en
metrics:
- accuracy
- f1
pipeline_tag: text-classification
library_name: transformers
---

# AgentIntentRouter

A fast, lightweight intent classifier for AI agent and MCP tool routing. Given a user message, it predicts which tool or capability the agent should invoke — in under 50ms on CPU. Built on DistilBERT (66M params), fine-tuned on ~11K diverse examples across 8 intent categories.

## Why This Exists

Every agent framework (LangChain, LangGraph, CrewAI, AutoGen) wastes an entire LLM call just to figure out *what the user wants*. That's 1-3 seconds and ~$0.01 per request — just for routing.

AgentIntentRouter replaces that first LLM call with a 66M-parameter classifier that runs in **~10ms on CPU** and **~2ms on GPU**. Use it as the first step in your agent pipeline to instantly route to the right tool.

## Intent Categories

| Label | Description | Example |
|-------|-------------|---------|
| `code_generation` | User wants code written, debugged, or refactored | "Write a Python function to parse CSV" |
| `web_search` | User wants to find information online | "What's the latest news on AI regulation" |
| `math_calculation` | User needs computation or conversion | "Calculate 15% of 4500" |
| `file_operation` | User wants to read, write, or manage files | "Read the config.json file" |
| `api_call` | User wants to interact with an external API | "Send a Slack message to the team" |
| `creative_writing` | User wants text composed or drafted | "Write a professional email to the client" |
| `data_analysis` | User wants data interpreted or compared | "Compare React vs Vue performance" |
| `general_chat` | Casual conversation, greetings, feedback | "Hey, how are you?" |
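When a message could plausibly fit two of the categories above, it helps to look at the full score distribution rather than only the top label. A sketch using the standard `transformers` pipeline option `top_k=None`, which returns a score for every label; the example message is my own, and exact scores will differ on your machine:

```python
from transformers import pipeline

# top_k=None makes the pipeline return scores for all labels,
# not just the single best one
router = pipeline(
    "text-classification",
    model="tripathyShaswata/AgentIntentRouter",
    top_k=None,
)

# A deliberately ambiguous message: api_call vs. code_generation
scores = router(["help me with the API code"])[0]
for entry in sorted(scores, key=lambda e: e["score"], reverse=True):
    print(f"{entry['label']:>18}: {entry['score']:.3f}")
```

Inspecting near-ties like this is also a cheap way to build intuition for where your confidence threshold should sit.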
## Quick Start

```python
from transformers import pipeline

router = pipeline("text-classification", model="tripathyShaswata/AgentIntentRouter")

# Single prediction
result = router("Write a Python function to sort a list")
print(result)
# [{'label': 'code_generation', 'score': 0.98}]

# Batch prediction
messages = [
    "Search for the latest AI papers",
    "What's 25% of 1200?",
    "Draft an email to my boss about the deadline",
    "Hello!",
]
results = router(messages)
for msg, res in zip(messages, results):
    print(f"{res['label']:>20} ({res['score']:.2f}) — {msg}")
```

## Use as Agent Router

```python
from transformers import pipeline

router = pipeline("text-classification", model="tripathyShaswata/AgentIntentRouter")

# Map each intent label to a handler you implement in your agent
TOOL_MAP = {
    "code_generation": handle_code_request,
    "web_search": handle_search,
    "math_calculation": handle_calculation,
    "file_operation": handle_file_ops,
    "api_call": handle_api_call,
    "creative_writing": handle_writing,
    "data_analysis": handle_analysis,
    "general_chat": handle_chat,
}

def route(user_message: str):
    intent = router(user_message)[0]
    if intent["score"] < 0.5:
        # Low confidence — fall back to an LLM for routing
        return fallback_llm_route(user_message)
    handler = TOOL_MAP[intent["label"]]
    return handler(user_message)
```

## Performance

- **Inference speed:** ~10ms on CPU, ~2ms on GPU
- **Model size:** ~260MB (DistilBERT-base)
- **Accuracy:** 100% on the held-out test set (synthetic; see note below)

### Evaluation Results

*Results on held-out test set (1,124 examples):*

| Metric | Score |
|--------|-------|
| Accuracy | 1.000 |
| F1 (weighted) | 1.000 |

*Per-class performance:*

| Intent | Precision | Recall | F1 | Support |
|--------|-----------|--------|-----|---------|
| code_generation | 1.000 | 1.000 | 1.000 | 130 |
| web_search | 1.000 | 1.000 | 1.000 | 151 |
| math_calculation | 1.000 | 1.000 | 1.000 | 153 |
| file_operation | 1.000 | 1.000 | 1.000 | 154 |
| api_call | 1.000 | 1.000 | 1.000 | 133 |
| creative_writing | 1.000 | 1.000 | 1.000 | 160 |
| data_analysis | 1.000 | 1.000 | 1.000 | 168 |
| general_chat | 1.000 | 1.000 | 1.000 | 75 |

> **Note:** These results are on synthetic test data drawn from the same distribution as the training data. Real-world performance will vary. Use the confidence score threshold to handle ambiguous inputs gracefully.

## Training Details

- **Base model:** distilbert-base-uncased
- **Training data:** 8,987 examples (synthetic, template-generated with natural language variation)
- **Validation:** 1,123 examples
- **Test:** 1,124 examples
- **Epochs:** 3 (with early stopping, patience=2)
- **Learning rate:** 2e-5
- **Batch size:** 32
- **Max sequence length:** 128
- **Training time:** ~100 seconds on NVIDIA RTX 4070
- **Loss:** 0.0015 (training) / 0.0017 (validation)

## Limitations

- Trained on English text only
- Template-generated training data may not cover all edge cases
- Ambiguous messages (e.g., "help me with the API code") may get lower confidence scores — use the confidence threshold to fall back to an LLM
- Not designed for multi-intent messages (e.g., "search for X and write code for Y")

## License

Apache 2.0 — use it however you want, commercial use included.

## Citation

If you use this model, a star on the repo is appreciated!
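## Appendix: A Sketch for Near-Ties

One pragmatic mitigation for the multi-intent limitation noted above, without a second model: when the top two scores are nearly tied, dispatch to both handlers instead of one. The `detect_intents` helper, the margin and floor values, and the hand-written scores below are all illustrative assumptions, not part of this repo; the score format mirrors the pipeline's list-of-dicts output.

```python
def detect_intents(scores, margin=0.15, floor=0.3):
    """Given pipeline-style [{'label': ..., 'score': ...}] entries,
    return the intent labels worth dispatching to."""
    ranked = sorted(scores, key=lambda s: s["score"], reverse=True)
    top = ranked[0]
    if top["score"] < floor:
        return []  # low confidence: caller falls back to an LLM router
    picked = [top["label"]]
    if len(ranked) > 1 and top["score"] - ranked[1]["score"] < margin:
        picked.append(ranked[1]["label"])  # near-tie: likely multi-intent
    return picked

# Made-up scores for "search for X and write code for Y"
fake = [
    {"label": "web_search", "score": 0.48},
    {"label": "code_generation", "score": 0.44},
    {"label": "general_chat", "score": 0.08},
]
print(detect_intents(fake))  # → ['web_search', 'code_generation']
```

Tune `margin` and `floor` on your own traffic; the right values depend on how costly a wrong dispatch is versus an extra LLM fallback call.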