ReframeBot-Guardrail-DistilBERT
A 3-class text classifier fine-tuned from distilbert-base-uncased that
routes conversation turns to one of three task modes:
| Label | Meaning |
|---|---|
| TASK_1 | CBT / academic stress: engage with Socratic questioning |
| TASK_2 | Crisis / self-harm signal: redirect to emergency hotlines |
| TASK_3 | Out-of-scope: validate feeling, pivot back to academics |
This model is the guardrail component of the ReframeBot system. Its output is combined with a dual-signal crisis detector (regex + cosine similarity) before the final routing decision is made.
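The combination step might look roughly like the sketch below. It is an illustration only: the regex patterns, the 0.75 threshold, the embedding inputs, and the "escalate if any signal fires" rule are all assumptions, not the actual ReframeBot implementation, which may combine the signals differently.

```python
import math
import re

# Hypothetical patterns and threshold, chosen only for illustration.
CRISIS_PATTERNS = [
    re.compile(r"\b(kill myself|end my life|hurt myself)\b", re.IGNORECASE),
]
SIMILARITY_THRESHOLD = 0.75


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def regex_signal(text):
    """True if any hypothetical crisis pattern matches the text."""
    return any(pattern.search(text) for pattern in CRISIS_PATTERNS)


def route(label, text, turn_embedding, crisis_embeddings):
    """Assumed rule: route to TASK_2 if the classifier or either detector fires."""
    max_similarity = max(
        (cosine_similarity(turn_embedding, ref) for ref in crisis_embeddings),
        default=0.0,
    )
    semantic_signal = max_similarity >= SIMILARITY_THRESHOLD
    if label == "TASK_2" or regex_signal(text) or semantic_signal:
        return "TASK_2"
    return label
```

Here `turn_embedding` and `crisis_embeddings` stand in for sentence embeddings of the user turn and of reference crisis phrases. How the real system computes them is not specified in this card.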
Usage
```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="Nhatminh1234/ReframeBot-Guardrail-DistilBERT",
)

classifier("I'm really stressed about my finals next week")
# [{'label': 'TASK_1', 'score': 0.97}]

classifier("What's a good recipe for pasta?")
# [{'label': 'TASK_3', 'score': 0.94}]
```
Training Details
| Hyperparameter | Value |
|---|---|
| Base model | distilbert-base-uncased |
| Number of labels | 3 (TASK_1, TASK_2, TASK_3) |
| Learning rate | 2e-6 |
| Batch size | 16 |
| Max epochs | 20 (early stopping, patience=3) |
| Weight decay | 0.01 |
| Max token length | 128 |
| Best model criterion | macro F1 |
| Hardware | NVIDIA RTX 5070 (laptop, 8 GB VRAM) |
Dataset: 1,674 labelled samples (80/20 train/val split). Includes hard negatives — benign metaphors that superficially resemble crisis language (e.g., "dying of embarrassment after that presentation").
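The early-stopping rule in the table (patience=3, best model chosen by macro F1, at most 20 epochs) can be sketched in plain Python as follows. This is a minimal illustration of the selection logic, not the training script; the function name is hypothetical, and treating a tie as "no improvement" is an assumption.

```python
def best_epoch_with_early_stopping(macro_f1_per_epoch, patience=3, max_epochs=20):
    """Return (best_epoch, best_f1, last_epoch_run) for a sequence of validation scores.

    Training stops once `patience` consecutive epochs fail to beat the best
    macro F1 seen so far, or after `max_epochs`, whichever comes first.
    """
    best_f1 = float("-inf")
    best_epoch = None
    last_epoch_run = 0
    epochs_without_improvement = 0

    for epoch, f1 in enumerate(macro_f1_per_epoch[:max_epochs], start=1):
        last_epoch_run = epoch
        if f1 > best_f1:
            # New best checkpoint: remember it and reset the patience counter.
            best_f1, best_epoch = f1, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break

    return best_epoch, best_f1, last_epoch_run
```

Given a list of per-epoch validation macro F1 scores, the function reports which epoch's checkpoint would be kept and when training would halt.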
Evaluation
Per-class results on the validation split (335 samples):
| Class | Precision | Recall | F1 | Support |
|---|---|---|---|---|
| TASK_1 | 0.99 | 1.00 | 1.00 | 107 |
| TASK_2 | 0.98 | 1.00 | 0.99 | 91 |
| TASK_3 | 1.00 | 0.98 | 0.99 | 137 |
| macro avg | 0.99 | 0.99 | 0.99 | 335 |
On a separate, harder held-out test set that includes boundary cases not present in the training distribution, accuracy is 91.1%.
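For reference, the macro-average row in the per-class table is the unweighted mean over the three classes. The short sketch below recomputes macro F1 from the precision and recall values printed above. Because those inputs are already rounded to two decimals, individual per-class F1 values may differ from the table in the last digit, but the macro average still rounds to 0.99.

```python
# Precision and recall per class, as printed in the table (rounded to two decimals).
per_class = {
    "TASK_1": (0.99, 1.00),
    "TASK_2": (0.98, 1.00),
    "TASK_3": (1.00, 0.98),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


f1_by_class = {label: f1_score(p, r) for label, (p, r) in per_class.items()}

# Macro average: every class counts equally, regardless of its support.
macro_f1 = sum(f1_by_class.values()) / len(f1_by_class)
print(round(macro_f1, 2))
```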
Intended Use
Designed as a routing component in the ReframeBot system. The TASK_2 output alone is not sufficient for crisis intervention — the full system also applies a regex + semantic similarity layer before acting on a crisis signal.
Project
GitHub: ReframeBot