MiniLM-L12-Grape-Route: Semantic Router for SysAdmin Voice Commands
Model Description
Grape-Route is a specialized text classification model designed to act as a semantic router for a system administration voice assistant. It is fine-tuned from microsoft/Multilingual-MiniLM-L12-H384.
The model's primary goal is to interpret natural language commands (Spanish input with technical English terms) and route them to the appropriate technical subsystem. It has been specifically trained to be robust against phonetic errors and typos typical of Speech-to-Text (STT) engines like Vosk (e.g., interpreting "docar" as "docker", "pin" as "ping", or "ese ese ache" as "ssh").
Intended Use
This model is intended to be the first layer of an intent recognition pipeline. It takes a raw string (transcribed from voice) and returns a categorical label with a confidence score.
Supported Categories (Labels)
The model classifies inputs into eight distinct intents, most of them identified by wine-themed code names:
| Label | Domain | Description |
|---|---|---|
| malbec | Docker Management | Containers, images, volumes, logs (e.g., "run nginx", "stop db"). |
| syrah | Networking | Connectivity, ping, ports, IP, DNS (e.g., "check internet", "my ip"). |
| tempranillo | SysAdmin | System processes, users, services, resources (e.g., "kill process", "create user", "check ram"). |
| pinot | Search | File search, grep, find, localization (e.g., "find logs", "where is python"). |
| chardonnay | File Management | Local file manipulation (e.g., "create folder", "delete file", "list directory"). |
| cabernet | Remote Access | SSH connections, SCP transfers, tunneling (e.g., "connect to server", "send file to vps"). |
| gemma | General / Chat | General knowledge questions, trivia, chit-chat (e.g., "tell me a joke", "capital of France"). |
| null | Out of Domain | Irrelevant queries, personal questions, or noise (e.g., "order pizza", "call mom"). |
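A downstream application might consume these labels through a small lookup table. The sketch below is illustrative only: `LABEL_DOMAINS` and `route` are hypothetical names, not part of the model or its repository, and the returned string stands in for whatever the real subsystem call would be.

```python
from typing import Dict

# Label -> domain, copied from the table above.
LABEL_DOMAINS: Dict[str, str] = {
    "malbec": "Docker Management",
    "syrah": "Networking",
    "tempranillo": "SysAdmin",
    "pinot": "Search",
    "chardonnay": "File Management",
    "cabernet": "Remote Access",
    "gemma": "General / Chat",
    "null": "Out of Domain",
}


def route(label: str, text: str) -> str:
    """Describe where `text` would be sent for a predicted label.

    Hypothetical helper: a real assistant would call the subsystem
    instead of returning a string.
    """
    if label == "null":
        # Out-of-domain input is dropped rather than forwarded.
        return "ignored"
    domain = LABEL_DOMAINS.get(label)
    if domain is None:
        raise ValueError(f"unknown label: {label!r}")
    return f"[{domain}] {text}"


print(route("malbec", "levanta un contenedor de nginx"))
```

Keeping the mapping in one dictionary means a renamed or added label only needs to be changed in one place.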
Training Data
The model was trained on a custom dataset of approximately 1,500 or more samples, consisting of:
- Real CLI Commands: Translated from natural language (based on datasets like nl2bash).
- Synthetic Variations: Grammatical variations of common sysadmin requests.
- Adversarial STT Noise: The dataset was heavily augmented with phonetic corruptions to simulate Vosk errors in Spanish (e.g., "doquer", "pines", "rut", "suda").
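The phonetic augmentation described above could look roughly like the following. This is a minimal sketch, not the author's actual augmentation script: the `PHONETIC_VARIANTS` table and the `corrupt` function are hypothetical, and the table only contains the term-to-corruption pairs that this card states explicitly.

```python
import random

# Hypothetical substitution table. Only pairs confirmed in this card are listed;
# a real augmentation pipeline would presumably use a much larger table.
PHONETIC_VARIANTS = {
    "docker": ["docar"],
    "ping": ["pin"],
    "ssh": ["ese ese ache"],
}


def corrupt(text: str, rng: random.Random, p: float = 0.5) -> str:
    """Replace known technical terms with an STT-style corruption.

    Each matching word is replaced with probability `p`; all other
    words are kept unchanged.
    """
    out = []
    for word in text.split():
        variants = PHONETIC_VARIANTS.get(word.lower())
        if variants and rng.random() < p:
            out.append(rng.choice(variants))
        else:
            out.append(word)
    return " ".join(out)


rng = random.Random(0)  # fixed seed so the augmentation is reproducible
print(corrupt("haz un ping a google", rng, p=1.0))
```

Passing an explicit `random.Random` instance keeps the augmented dataset reproducible across runs.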
Performance and Reliability
Based on inference tests, the model exhibits the following behavior:
- High Confidence (>85%): Technical commands for Docker (malbec), Files (chardonnay), and SysAdmin (tempranillo) are detected with high precision, even with misspellings.
- Medium Confidence (~55-65%): Remote file transfers (SCP) expressed in long sentences may sometimes overlap with local file management. Implementing confirmation logic is recommended when confidence is below 75% for destructive or remote actions.
- OOD Rejection: Non-technical inputs are reliably classified as gemma (General) or null (Noise), typically with low confidence scores, allowing the system to safely ignore them.
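The confirmation logic recommended above might be sketched as follows. The 0.75 threshold comes from this card; everything else is an assumption made for illustration, including the `decide` function name, the action strings, and the choice of which labels count as destructive or remote.

```python
# Threshold taken from the recommendation in this card.
CONFIRM_THRESHOLD = 0.75

# Assumption: labels treated as destructive or remote. The card does not
# enumerate them; these are the two involved in the SCP overlap.
SENSITIVE_LABELS = {"chardonnay", "cabernet"}


def decide(label: str, score: float) -> str:
    """Map a (label, score) prediction to a hypothetical action name."""
    if label == "null":
        return "ignore"  # out-of-domain input or noise
    if label == "gemma":
        return "chat"  # hand over to a general-purpose responder
    if label in SENSITIVE_LABELS and score < CONFIRM_THRESHOLD:
        return "confirm"  # ask the user before acting
    return "execute"


print(decide("cabernet", 0.60))
print(decide("malbec", 0.92))
```

A stricter deployment could add more labels to `SENSITIVE_LABELS`, for example if stopping containers or killing processes should also require confirmation.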
How to Get Started
You can use this model with the Hugging Face pipeline API:
from transformers import pipeline

# Load the model
router = pipeline("text-classification", model="jrodriiguezg/minilm-l12-grape-route")

# Inference examples
commands = [
    "levanta un contenedor de nginx",     # Standard Docker command
    "haz un pin a google",                # Network command with STT noise ("pin" instead of "ping")
    "borra el archivo de configuracion",  # File management
    "cuentame un chiste",                 # General chat
]

for cmd in commands:
    result = router(cmd)
    print(f"Command: {cmd} -> {result}")
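For a single input string, the text-classification pipeline returns a list of dictionaries with `label` and `score` keys. A small helper can unpack that into a tuple. The name `top_prediction` is hypothetical, and the sketch assumes the model's configuration exposes the wine code names as label strings.

```python
from typing import Dict, List, Tuple, Union

Prediction = Dict[str, Union[str, float]]


def top_prediction(result: List[Prediction]) -> Tuple[str, float]:
    """Return (label, score) for the highest-scoring entry in a pipeline result."""
    best = max(result, key=lambda item: item["score"])
    return str(best["label"]), float(best["score"])


# Hand-written result in the pipeline's output shape.
# The score is made up for illustration, not a measured value.
sample = [{"label": "syrah", "score": 0.91}]
label, score = top_prediction(sample)
print(label, score)
```

Using `max` rather than indexing the first element keeps the helper correct if the pipeline is configured to return more than one label per input.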
Limitations
- Language: The model is optimized for Spanish inputs containing English technical jargon. It may not perform well on pure English sentences or other languages.
- Context Window: As a BERT-based model, it analyzes single sentences. It does not retain conversational context (history).
- SCP Ambiguity: Complex sentences requesting file transfers to remote servers ("move this file to the server") may occasionally be misclassified as local file management (chardonnay) instead of remote access (cabernet).