hybrid-intent-crossencoder-minilm-L12-v2

Cross-encoder reranker for enterprise intent detection in data loss prevention (DLP) and security use cases. Fine-tuned from cross-encoder/ms-marco-MiniLM-L12-v2 on a hybrid synthetic intent-detection dataset.

Input format

user_input [SEP] intent_description

Performance (held-out test set, threshold=0.3)

| Metric | Value |
|---|---|
| Recall | 0.9913 |
| Precision | 0.9522 |
| F1 | 0.9714 |
| AUC-ROC | 0.9978 |
| PR-AUC | 0.9982 |
| Best threshold (F1-optimal) | 0.8819 → F1=0.9820 |
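
The reported F1 follows from the precision and recall above, and an F1-optimal threshold like the one in the last row can be found by sweeping candidate thresholds over scored, labelled pairs. The sketch below illustrates both in plain Python. The helper names (`f1_score`, `prf_at_threshold`, `best_f1_threshold`) and the toy scores and labels are illustrative assumptions, not the evaluation code used for this model.

```python
from typing import Sequence, Tuple


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def prf_at_threshold(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float, float]:
    """Precision, recall and F1 when predicting 'match' for score >= threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_score(precision, recall)


def best_f1_threshold(
    scores: Sequence[float], labels: Sequence[int]
) -> Tuple[float, float]:
    """Try every distinct score as a threshold; return (threshold, F1) with the highest F1."""
    best = (0.0, -1.0)
    for t in sorted(set(scores)):
        f1 = prf_at_threshold(scores, labels, t)[2]
        if f1 > best[1]:
            best = (t, f1)
    return best


# Consistency check: F1 implied by the reported precision and recall at threshold 0.3.
print(round(f1_score(0.9522, 0.9913), 4))

# Toy data, made up for illustration only.
toy_scores = [0.05, 0.20, 0.35, 0.60, 0.90, 0.95]
toy_labels = [0, 0, 0, 1, 1, 1]
print(best_f1_threshold(toy_scores, toy_labels))
```

A threshold chosen this way should be selected on a validation split rather than the test set if the resulting F1 is to be reported as a held-out number.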

Training config

| Parameter | Value |
|---|---|
| Base model | cross-encoder/ms-marco-MiniLM-L12-v2 |
| Batch size | 64 |
| Grad accum steps | 1 |
| Effective batch | 64 |
| Learning rate | 2e-05 |
| Label smoothing | 0.05 |
| Warmup ratio | 0.06 |
| Max sequence length | 256 |
| Early stopping | recall@0.3 (patience=3) |
| Epochs trained | 10 |
| Training time | 5.6 min |
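
To make the early-stopping and batch-size rows concrete, the following is a minimal sketch of patience-based early stopping on recall at a fixed threshold, plus the effective-batch arithmetic. It is not taken from ablation_reranker_training.py: the `EarlyStopper` class, the `recall_at` helper, the strict-improvement rule, and the per-epoch recall values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class EarlyStopper:
    """Stop when the monitored metric has not improved for `patience` epochs."""
    patience: int = 3
    best: float = float("-inf")
    bad_epochs: int = 0

    def step(self, metric: float) -> bool:
        """Record one epoch's metric; return True if training should stop."""
        if metric > self.best:  # assumption: only a strict improvement resets patience
            self.best = metric
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def recall_at(scores: Sequence[float], labels: Sequence[int], threshold: float = 0.3) -> float:
    """Fraction of positive pairs whose score is at or above the threshold."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    if not positives:
        return 0.0
    return sum(s >= threshold for s in positives) / len(positives)


# Effective batch = per-step batch size x gradient accumulation steps.
batch_size, grad_accum_steps = 64, 1
effective_batch = batch_size * grad_accum_steps

# Hypothetical per-epoch validation recall@0.3 values (not real training logs).
recall_history = [0.90, 0.95, 0.97, 0.97, 0.96, 0.97]
stopper = EarlyStopper(patience=3)
stopped_at = None
for epoch, recall in enumerate(recall_history, start=1):
    if stopper.step(recall):
        stopped_at = epoch
        break

print(effective_batch, stopped_at)
```

Monitoring recall rather than loss favours checkpoints that miss fewer true matches at the low operating threshold, which fits a detection setting where false negatives are costly.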

Inference snippet

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id  = "aryasuneesh-quilr/hybrid-intent-crossencoder-minilm-L12-v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model     = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

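def score(user_input: str, intent_description: str) -> float:
    """Return P(match) for one (user_input, intent_description) pair."""
    # Both texts are joined into a single string with a literal " [SEP] " separator.
    pair = f"{user_input} [SEP] {intent_description}"
    # Truncate to the maximum sequence length used in training (256 tokens).
    enc  = tokenizer(pair, return_tensors="pt", truncation=True, max_length=256)
    with torch.no_grad():
        logits = model(**enc).logits
    # Softmax over the class dimension; index 1 is the "match" class.
    return torch.softmax(logits, dim=1)[0, 1].item()   # P(match)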

# Example
s = score(
    "Our AWS_SECRET_ACCESS_KEY was found in a public repo",
    "Identify exposure of authentication credentials or API keys"
)
print(f"Match probability: {s:.4f}")   # use threshold 0.8819 for best F1
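
Since the model is described as a reranker, one plausible usage pattern is to score a single input against several candidate intent descriptions, keep those at or above a threshold, and sort by score. The sketch below assumes that workflow. `rank_intents` is a hypothetical helper, and `fake_score` is a stand-in so the example runs without downloading the model; in practice the `score` function from the snippet above would be passed as `score_fn`.

```python
from typing import Callable, List, Sequence, Tuple


def rank_intents(
    user_input: str,
    intent_descriptions: Sequence[str],
    score_fn: Callable[[str, str], float],
    threshold: float = 0.8819,  # F1-optimal threshold reported for this model
) -> List[Tuple[str, float]]:
    """Score every intent, keep those at or above the threshold, highest score first."""
    scored = [(desc, score_fn(user_input, desc)) for desc in intent_descriptions]
    matches = [(desc, s) for desc, s in scored if s >= threshold]
    return sorted(matches, key=lambda item: item[1], reverse=True)


def fake_score(user_input: str, intent_description: str) -> float:
    """Stand-in scorer with made-up outputs; replace with the real `score` function."""
    return 0.97 if "credentials" in intent_description else 0.02


candidate_intents = [
    "Identify exposure of authentication credentials or API keys",
    "Detect sharing of customer personal data",
]
ranked = rank_intents(
    "Our AWS_SECRET_ACCESS_KEY was found in a public repo",
    candidate_intents,
    score_fn=fake_score,
)
for description, probability in ranked:
    print(f"{probability:.4f}  {description}")
```

Lowering `threshold` toward 0.3 trades precision for recall, in line with the two operating points listed under Performance.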

Files in this repo

| File | Description |
|---|---|
| model.safetensors | HF-native weights |
| best_model.pt | Raw PyTorch state_dict (for resuming training) |
| training_config.json | Full hyperparameter record |
| metrics/ | Per-epoch + test-set evaluation CSVs |

Generated 2026-02-24 10:20 UTC by ablation_reranker_training.py

Model size: 33.4M params (Safetensors, tensor type F32)