Email Phishing Detection Model (RoBERTa + LoRA)
Classifies email content as Safe or Phishing.
Built on FacebookAI/roberta-base with LoRA fine-tuning.
Labels
- LABEL_0: Safe
- LABEL_1: Phishing
Model Details
- Base model: FacebookAI/roberta-base
- Task: Binary text classification
- LoRA parameters: r=16, alpha=32, dropout=0.1
- LoRA target modules: query, value
- Max input length: 128 tokens
- Weights file in this project: roberta_lora_phishing_detector.pt
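As a rough illustration of how the LoRA settings above map onto code, here is a minimal sketch that assumes the Hugging Face peft library was used (the card does not say which LoRA implementation was used). The function name build_lora_model is hypothetical, and the sketch creates fresh, untrained adapters; it does not load roberta_lora_phishing_detector.pt.

```python
# Values copied from the Model Details list. The keyword names follow peft's
# LoraConfig, which is an assumption: the card does not name the LoRA library.
LORA_HYPERPARAMS = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.1,
    "target_modules": ["query", "value"],
}

# Standard LoRA scales its low-rank update by alpha / r.
LORA_SCALING = LORA_HYPERPARAMS["lora_alpha"] / LORA_HYPERPARAMS["r"]


def build_lora_model(base_model_name="FacebookAI/roberta-base"):
    """Wrap the base classifier with untrained LoRA adapters (hypothetical helper)."""
    # Imported lazily so the constants above are usable without these packages.
    from peft import LoraConfig, TaskType, get_peft_model
    from transformers import AutoModelForSequenceClassification

    base = AutoModelForSequenceClassification.from_pretrained(
        base_model_name, num_labels=2
    )
    config = LoraConfig(task_type=TaskType.SEQ_CLS, **LORA_HYPERPARAMS)
    return get_peft_model(base, config)
```

With r=16 and alpha=32 the adapter update is scaled by a factor of 2.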
How Inference Works (Model-Level)
- Build input text from email subject + body.
- Tokenize with RoBERTa tokenizer (padding="max_length", truncation=True, max_length=128).
- Run sequence classification forward pass.
- Apply softmax to logits.
- Return the predicted label, its confidence, and the per-class probabilities.
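The steps above can be sketched in plain Python for the parts that do not need the model itself. This is only an illustration: the helper names are hypothetical, joining subject and body with a newline is an assumption (the card does not specify the separator), and the tokenizer and forward pass are represented only by the keyword arguments and a list of logits.

```python
import math

# Label ids as listed under Labels.
ID2LABEL = {0: "Safe", 1: "Phishing"}

# Tokenizer arguments from the card; with transformers these would be passed as
# tokenizer(text, return_tensors="pt", **TOKENIZER_KWARGS).
TOKENIZER_KWARGS = {"padding": "max_length", "truncation": True, "max_length": 128}


def build_input_text(subject, body):
    """Combine subject and body (assumption: newline-separated)."""
    return f"{subject}\n{body}".strip()


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_to_prediction(logits):
    """Turn raw classifier logits into (label, confidence, class probabilities)."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    by_label = {ID2LABEL[i]: p for i, p in enumerate(probs)}
    return ID2LABEL[best], probs[best], by_label
```

For example, logits_to_prediction([-1.0, 2.0]) selects Phishing, since the second logit is larger.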
How Aegis DLP Uses This Model (System-Level)
In Aegis DLP, this model is one signal in a multi-factor phishing decision pipeline:
- AI body analysis: 40%
- URL analysis against trusted domains: 25%
- Attachment analysis (YARA): 15%
- Content keyword heuristics: 10%
- Sender trust / suspicious TLD signal: 10%
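A minimal sketch of how the weights above could combine into a single score, assuming each factor produces a score in [0, 1] and the combination is a plain weighted sum. The factor keys and the function name are hypothetical, not identifiers from the Aegis DLP codebase.

```python
# Weights from the list above; the dictionary keys are illustrative names.
WEIGHTS = {
    "ai_body": 0.40,
    "url": 0.25,
    "attachment": 0.15,
    "keywords": 0.10,
    "sender": 0.10,
}


def weighted_score(factor_scores):
    """Weighted sum of per-factor scores (assumed to lie in [0, 1])."""
    missing = set(WEIGHTS) - set(factor_scores)
    if missing:
        raise ValueError(f"missing factor scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * factor_scores[name] for name in WEIGHTS)
```

Because the weights sum to 1, the result stays within the range of the individual factor scores.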
Decision policy used in the DLP app:
- If any factor >= 0.85 -> force Phishing (single-factor escalation)
- Else if weighted score >= 0.90 -> Safe
- Else if weighted score >= 0.35 -> Phishing
- Else -> Safe with needs_review=true
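The decision policy above can be written as a short function that applies the rules in the order listed. This is a sketch, not the app's actual code: the function name and return shape are hypothetical, and returning needs_review=False for the first three branches is an assumption, since the card only states the flag for the final branch.

```python
def decide(factor_scores, weighted):
    """Return (label, needs_review), applying the listed rules in order."""
    # Single-factor escalation: any one strong signal forces Phishing.
    if any(score >= 0.85 for score in factor_scores.values()):
        return "Phishing", False
    # Reproduced as listed. If `weighted` is a weighted average of the factor
    # scores (weights summing to 1), a value >= 0.90 implies some factor is
    # >= 0.90, so the escalation rule above fires first in that case.
    if weighted >= 0.90:
        return "Safe", False
    if weighted >= 0.35:
        return "Phishing", False
    # Low combined score: Safe, but flagged for review.
    return "Safe", True
```

For instance, factor scores of 0.5 and 0.4 with a weighted score of 0.45 fall through to the third rule and yield Phishing.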
Additional app rules around this model:
- Trusted sender/domain whitelist can short-circuit to Safe
- Non-English content is labeled Unknown and routed to review in the app flow
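One way these two rules might wrap the scoring pipeline is sketched below. Everything here is an assumption made for illustration: the function name, the result dictionary shape, the order of the two checks, and the idea that language detection happens elsewhere and arrives as a boolean.

```python
def apply_app_rules(sender_domain, trusted_domains, is_english, run_pipeline):
    """Apply the whitelist and language rules before the scoring pipeline.

    run_pipeline is a zero-argument callable that performs the multi-factor
    decision and returns its result (hypothetical interface).
    """
    # Trusted sender/domain short-circuits to Safe (check order is assumed).
    if sender_domain.lower() in trusted_domains:
        return {"label": "Safe", "needs_review": False}
    # Non-English content is not scored; it is labeled Unknown and sent to review.
    if not is_english:
        return {"label": "Unknown", "needs_review": True}
    return run_pipeline()
```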
Usage (transformers pipeline)
from transformers import pipeline
clf = pipeline("text-classification", model="YOUR_ORG/YOUR_MODEL")
print(clf("Urgent: Verify your account now to avoid suspension."))
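The text-classification pipeline returns a list of dictionaries with label and score keys. If the hosted model emits the raw ids LABEL_0 and LABEL_1 (an assumption based on the Labels section), a small helper such as the hypothetical readable below can map them to the names used in this card.

```python
# Mapping taken from the Labels section of this card.
LABEL_NAMES = {"LABEL_0": "Safe", "LABEL_1": "Phishing"}


def readable(results):
    """Replace raw label ids in pipeline output with human-readable names."""
    return [
        {"label": LABEL_NAMES.get(item["label"], item["label"]), "score": item["score"]}
        for item in results
    ]
```

Labels that are already readable, or otherwise unknown, pass through unchanged.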
Usage in This DLP Codebase
from modules.body_classifier import predict_body_label
label, confidence, probs = predict_body_label(
    "Urgent: Verify your account now to avoid suspension."
)
print(label, confidence, probs)
Intended Use
- Email phishing risk triage
- SOC/security analyst support workflows
- Security-aware email filtering pipelines
Limitations
- Primarily designed for English email content.
- Final DLP outcome is multi-factor and not based on this model alone.
- Confidence scores are model outputs and may not be calibrated risk probabilities.
Responsible Use
Use this model to assist review workflows, not as the sole basis for high-impact security or compliance decisions.