---
language:
- en
license: apache-2.0
base_model: distilbert/distilbert-base-uncased
tags:
- text-classification
- guardrail
- safety
- cbt
- mental-health
- academic-stress
pipeline_tag: text-classification
---

# ReframeBot-Guardrail-DistilBERT

A 3-class text classifier fine-tuned from `distilbert-base-uncased` that routes conversation turns to one of three task modes:

| Label | Meaning |
|---|---|
| `TASK_1` | CBT / academic stress: engage with Socratic questioning |
| `TASK_2` | Crisis / self-harm signal: redirect to emergency hotlines |
| `TASK_3` | Out-of-scope: validate feeling, pivot back to academics |

This model is the guardrail component of the ReframeBot system. Its output is combined with a dual-signal crisis detector (regex + cosine similarity) before the final routing decision is made.

## Usage

```python
from transformers import pipeline

# Load the fine-tuned guardrail classifier from the Hugging Face Hub.
classifier = pipeline(
    "text-classification",
    model="Nhatminh1234/ReframeBot-Guardrail-DistilBERT",
)

# Academic-stress turn: routed to the CBT mode.
classifier("I'm really stressed about my finals next week")
# [{'label': 'TASK_1', 'score': 0.97}]

# Off-topic turn: routed to the out-of-scope mode.
classifier("What's a good recipe for pasta?")
# [{'label': 'TASK_3', 'score': 0.94}]
```

## Training Details

| Hyperparameter | Value |
|---|---|
| Base model | distilbert-base-uncased |
| Number of labels | 3 (TASK_1, TASK_2, TASK_3) |
| Learning rate | 2e-6 |
| Batch size | 16 |
| Max epochs | 20 (early stopping, patience=3) |
| Weight decay | 0.01 |
| Max token length | 128 |
| Best model criterion | macro F1 |
| Hardware | NVIDIA RTX 5070 (laptop, 8 GB VRAM) |

**Dataset:** 1,674 labelled samples (80/20 train/val split). It includes hard negatives: benign metaphors that superficially resemble crisis language (e.g., "dying of embarrassment after that presentation").

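
The early-stopping setup listed in the hyperparameter table (at most 20 epochs, patience of 3, best checkpoint chosen by macro F1) can be illustrated with a small, framework-free sketch. This is not the project's training code: the function name `early_stopping_epoch` is hypothetical, the scores are made-up illustrative values, and the rule shown (stop after `patience` consecutive epochs without a strict improvement) is an assumption about how the patience setting behaves.

```python
def early_stopping_epoch(f1_per_epoch, patience=3, max_epochs=20):
    """Return (best_epoch, stop_epoch), both 1-indexed.

    Hypothetical helper: training stops once the validation macro F1
    has failed to strictly improve for `patience` consecutive epochs,
    or when `max_epochs` is reached, whichever comes first.
    """
    best_f1 = float("-inf")
    best_epoch = 0
    stop_epoch = 0
    epochs_without_improvement = 0

    for epoch, f1 in enumerate(f1_per_epoch[:max_epochs], start=1):
        stop_epoch = epoch
        if f1 > best_f1:
            # New best checkpoint: remember it and reset the counter.
            best_f1, best_epoch = f1, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break

    return best_epoch, stop_epoch


# Illustrative (made-up) validation macro F1 values, one per epoch.
scores = [0.80, 0.90, 0.95, 0.94, 0.95, 0.93, 0.99]
print(early_stopping_epoch(scores))  # (3, 6)
```

In this example the best checkpoint is epoch 3, and training halts at epoch 6 after three epochs without a strict improvement, so the later score of 0.99 is never reached. That is the usual trade-off of a small patience value.
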
## Evaluation

Per-class results on the validation split (335 samples):

| Class | Precision | Recall | F1 | Support |
|---|---|---|---|---|
| TASK_1 | 0.99 | 1.00 | 1.00 | 107 |
| TASK_2 | 0.98 | 1.00 | 0.99 | 91 |
| TASK_3 | 1.00 | 0.98 | 0.99 | 137 |
| **macro avg** | **0.99** | **0.99** | **0.99** | **335** |

Accuracy on a separate, harder held-out test set: **91.1%** (includes boundary cases not present in the training distribution).

## Intended Use

Designed as a routing component in the ReframeBot system. The TASK_2 output alone is not sufficient for crisis intervention; the full system also applies a regex + semantic similarity layer before acting on a crisis signal.

## Project

GitHub: [ReframeBot](https://github.com/minhnghiem32131024429/ReframeBot)
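
## Routing Sketch

The combination described under Intended Use, where the classifier label is checked against a regex + similarity layer, might look roughly like the sketch below. Everything specific here is an assumption made for illustration: the pattern list, the reference phrases, the `0.6` threshold, the function names, and the policy that either detector signal escalates a turn to `TASK_2`. The actual combination rule is not specified in this card, and a bag-of-words cosine stands in for whatever embedding model the full system uses.

```python
import math
import re
from collections import Counter

# Hypothetical crisis patterns and reference phrases (not the project's lists).
CRISIS_PATTERNS = [
    re.compile(r"\b(kill myself|end my life|suicid\w*)\b", re.IGNORECASE),
]
CRISIS_REFERENCES = [
    "i want to end my life",
    "i do not want to be alive anymore",
]
SIMILARITY_THRESHOLD = 0.6  # assumed value, would need tuning


def bag_of_words(text):
    """Toy stand-in for a sentence embedding: lowercase token counts."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def route(text, classifier_label):
    """Return the final task label for one conversation turn.

    Assumed policy: a regex hit or a high similarity to a crisis
    reference escalates to TASK_2; otherwise the classifier label stands.
    """
    regex_hit = any(p.search(text) for p in CRISIS_PATTERNS)
    vector = bag_of_words(text)
    semantic_hit = any(
        cosine_similarity(vector, bag_of_words(ref)) >= SIMILARITY_THRESHOLD
        for ref in CRISIS_REFERENCES
    )
    return "TASK_2" if (regex_hit or semantic_hit) else classifier_label


print(route("What's a good recipe for pasta?", "TASK_3"))  # TASK_3
print(route("I do not want to be alive", "TASK_1"))        # TASK_2
```

In practice `classifier_label` would come from the `pipeline` call shown under Usage. A benign metaphor such as "dying of embarrassment after that presentation" triggers neither signal in this toy version, so the classifier label passes through unchanged.
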