Spaces: Esvanth/mindscan

Esvanth committed on Apr 20

Commit 3bfc784 (verified) · 1 Parent(s): 7d675ba

Add README.md

Files changed (1)

README.md +161 -5

@@ -1,10 +1,166 @@
 ---
-title: Mindscan
-emoji: 📉
-colorFrom: green
-colorTo: red
+title: MindScan
+emoji: 🧠
+colorFrom: indigo
+colorTo: purple
 sdk: docker
+app_port: 7860
 pinned: false
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+# MindScan — Mental Health Detection System
+### NCI H9DAI Research Project 2026 · MSc Artificial Intelligence
+A multi-model mental health text analysis system that runs 12 ML classifiers (4 model types trained on each of 3 datasets) on any input text in one pass and returns a depression type, a binary depression likelihood, and suicide risk scores.
+---
+## Project Structure
+```
+MindScan/
+├── app.py              Flask backend — start here
+├── predict.py          Prediction logic (all 12 models)
+├── requirements.txt    Python dependencies
+├── README.md           This file
+├── templates/
+│   └── index.html      UI (served by Flask at localhost:5000)
+├── models/
+│   ├── classical/      Download from Google Drive (see below)
+│   └── transformers/   Download from Google Drive (see below)
+└── notebooks/
+    ├── DA_Notebook_One.ipynb   Classical model training
+    └── DA_2_Notebook.ipynb     XLM-RoBERTa + comparison
+```
+---
+## GitHub Link
+https://github.com/Amod069/MindScan
+## Setup
+### 1. Download model files from Google Drive
+Download `MindScan_Models/` from Google Drive:
+https://drive.google.com/drive/folders/16jfsPUcdekDWqtk4evTjQHQO2YoKJdpQ?usp=sharing
+Then place the contents like this:
+```
+models/
+├── classical/
+│   ├── le_d1.pkl, le_d2.pkl, le_d3.pkl
+│   ├── tfidf_d1.pkl, tfidf_d2.pkl, tfidf_d3.pkl
+│   ├── logistic_regression_d1.pkl, _d2.pkl, _d3.pkl
+│   ├── svm_d1.pkl, _d2.pkl, _d3.pkl
+│   └── xgboost_d1.pkl, _d2.pkl, _d3.pkl
+└── transformers/
+    ├── xlmr_d1_final/
+    ├── xlmr_d2_final/
+    └── xlmr_d3_final/
+```
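Before starting the server, it can help to confirm that the downloaded files landed where the layout above expects them. The following is a minimal sketch, not part of the repository: the function names are hypothetical, and it assumes the `_d2.pkl` / `_d3.pkl` shorthand in the tree expands to the full prefix (for example `svm_d2.pkl`).

```python
from pathlib import Path

# Names taken from the directory tree in the README; the expansion of the
# "_d2.pkl" shorthand to "<prefix>_d2.pkl" is an assumption.
DATASETS = ("d1", "d2", "d3")
CLASSICAL_PREFIXES = ("le", "tfidf", "logistic_regression", "svm", "xgboost")


def expected_model_paths(root="models"):
    """Return every path the README layout lists, relative to `root`."""
    root = Path(root)
    paths = [
        root / "classical" / f"{prefix}_{ds}.pkl"
        for prefix in CLASSICAL_PREFIXES
        for ds in DATASETS
    ]
    # One fine-tuned XLM-RoBERTa directory per dataset.
    paths += [root / "transformers" / f"xlmr_{ds}_final" for ds in DATASETS]
    return paths


def missing_model_paths(root="models"):
    """Return the expected paths that do not exist on disk."""
    return [p for p in expected_model_paths(root) if not p.exists()]


if __name__ == "__main__":
    missing = missing_model_paths()
    if missing:
        print("Missing model files:")
        for path in missing:
            print(f"  {path}")
    else:
        print("All expected model files are present.")
```

Run it from the project root; any path it prints still needs to be copied from the Google Drive folder.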
+### 2. Create Python environment
+```bash
+python -m venv venv
+# Mac/Linux
+source venv/bin/activate
+# Windows
+venv\Scripts\activate
+```
+### 3. Install dependencies
+```bash
+pip install -r requirements.txt
+```
+### 4. Run the server
+```bash
+python app.py
+```
+### 5. Open the UI
+```
+http://localhost:5000
+```
+**Note:** The first startup takes about 30 seconds while the XLM-RoBERTa models load into memory.
+---
+## The 3 Datasets
+| | Dataset | Source | Size | Task |
+|---|---|---|---|---|
+| D1 | Nusrat et al. (2024) | Zenodo 14233292 | 14,983 tweets | 6-class depression type |
+| D2 | albertobellardini | Kaggle | 10,314 tweets | Binary depression |
+| D3 | nikhileswarkomati | Kaggle | 50,000 Reddit posts | Binary suicide risk |
+## The 4 Models (per dataset = 12 total)
+1. **Logistic Regression** — simple linear baseline
+2. **SVM (LinearSVC)** — classical NLP gold standard
+3. **XGBoost** — gradient boosting
+4. **XLM-RoBERTa** — transformer, contextual embeddings
+*Note: Random Forest is excluded from deployment (646 MB model files; worst performer on D1 and D3).*
+---
+## Real Results
+| Dataset | Best Model | Macro F1 | Cohen's Kappa |
+|---|---|---|---|
+| D1 Depression Type | **SVM** | 0.9269 | 0.9072 |
+| D2 Binary Depression | **XLM-RoBERTa** | 0.9993 | 0.9986 |
+| D3 Suicide Risk | **XLM-RoBERTa** | 0.9810 | 0.9620 |
+**Key finding:** SVM outperforms XLM-RoBERTa on 6-class psychiatric classification (D1). All models exceed the Tumaliuan et al. (2024) benchmark of F1=0.81.
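For readers unfamiliar with the two metrics in the table: macro F1 is the unweighted mean of the per-class F1 scores, and Cohen's kappa is the observed agreement corrected for the agreement expected by chance. The sketch below computes both in plain Python on a made-up four-sample example. It is only an illustration of the definitions; the notebooks may well use library routines such as scikit-learn's `f1_score(average="macro")` and `cohen_kappa_score` instead, and the labels `"a"` and `"b"` are invented.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def cohen_kappa(y_true, y_pred):
    """(observed agreement - chance agreement) / (1 - chance agreement).

    Undefined (division by zero) when chance agreement equals 1, i.e. when
    both lists contain a single identical label.
    """
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    expected = sum(true_counts[k] * pred_counts[k] for k in true_counts) / (n * n)
    return (observed - expected) / (1 - expected)


# Toy data, not taken from any of the three datasets.
y_true = ["a", "a", "b", "b"]
y_pred = ["a", "b", "b", "b"]
print(round(macro_f1(y_true, y_pred), 4))   # 0.7333
print(cohen_kappa(y_true, y_pred))          # 0.5
```

Macro averaging gives every class equal weight regardless of its frequency, which matters for an imbalanced multi-class task such as D1.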
+---
+## API
+**POST /predict**
+```json
+// Request
+{ "text": "your text here" }
+// Response
+{
+  "dataset1": {
+    "task": "Depression Type (6 Classes)",
+    "models": {
+      "Logistic Regression": { "label": "postpartum", "confidence": 0.958 },
+      "SVM":                  { "label": "postpartum", "confidence": 0.828 },
+      "XGBoost":              { "label": "postpartum", "confidence": 0.999 },
+      "XLM-RoBERTa":         { "label": "postpartum", "confidence": 0.997 }
+    },
+    "winner_model": "XGBoost",
+    "winner_prediction": "postpartum",
+    "winner_confidence": 0.999,
+    "class_probs": { "postpartum": 0.997, "bipolar": 0.001, ... }
+  },
+  "dataset2": { ... },
+  "dataset3": { ... },
+  "risk_flag": false,
+  "suicide_votes": "0/4 models flagged suicide risk",
+  "processing_time_ms": 2341
+}
+```
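A client will usually want a compact summary of each `datasetN` block rather than the full payload. The sketch below works on the `dataset1` values from the sample response above (with `class_probs` left out because the sample elides it). `summarize_dataset` is a hypothetical helper, not part of `predict.py`. It also reports which model has the highest confidence; in the sample this coincides with `winner_model`, but whether the server picks the winner that way is an assumption.

```python
def summarize_dataset(block):
    """Summarize one `datasetN` block of a /predict response."""
    models = block["models"]
    labels = {result["label"] for result in models.values()}
    # Assumption: comparing raw confidences across models is meaningful here.
    top_model = max(models, key=lambda name: models[name]["confidence"])
    return {
        "task": block["task"],
        "winner": block["winner_model"],
        "prediction": block["winner_prediction"],
        "unanimous": len(labels) == 1,  # True when all models agree on the label
        "highest_confidence_model": top_model,
    }


# Values copied from the sample response in this README.
SAMPLE_DATASET1 = {
    "task": "Depression Type (6 Classes)",
    "models": {
        "Logistic Regression": {"label": "postpartum", "confidence": 0.958},
        "SVM": {"label": "postpartum", "confidence": 0.828},
        "XGBoost": {"label": "postpartum", "confidence": 0.999},
        "XLM-RoBERTa": {"label": "postpartum", "confidence": 0.997},
    },
    "winner_model": "XGBoost",
    "winner_prediction": "postpartum",
    "winner_confidence": 0.999,
}

summary = summarize_dataset(SAMPLE_DATASET1)
print(summary["prediction"], summary["unanimous"])
```

A disagreement between models (`unanimous` is `False`) is a reasonable signal to show all four predictions in the UI instead of only the winner.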
+**GET /health**
+```json
+{ "status": "ok", "models_ready": true }
+```
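Because the models take a while to load on first startup, a script that calls the API can poll `/health` before sending text to `/predict`. The following is a minimal standard-library sketch under stated assumptions: the helper names are hypothetical, the base URL is the local address given in this README, and it assumes the server either refuses connections or reports `"models_ready": false` while loading, which the README does not confirm.

```python
import json
import time
import urllib.error
import urllib.request


def fetch_health(base_url):
    """GET /health and return the parsed JSON body, or None if unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=5) as resp:
            return json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        return None


def wait_until_ready(base_url, timeout_s=120.0, interval_s=2.0, fetch=fetch_health):
    """Poll /health until `models_ready` is true; return False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        body = fetch(base_url)
        if body and body.get("status") == "ok" and body.get("models_ready"):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


def predict(base_url, text):
    """POST {"text": ...} to /predict and return the parsed JSON response."""
    request = urllib.request.Request(
        f"{base_url}/predict",
        data=json.dumps({"text": text}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as resp:
        return json.load(resp)
```

Typical use is `wait_until_ready("http://localhost:5000")` followed by `predict("http://localhost:5000", "your text here")`. The `fetch` parameter exists so the polling loop can be exercised without a running server.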
+---
+## Disclaimer
+This system is a research prototype built for academic coursework. It is **not** a clinical tool and must never be used for actual medical diagnosis or mental health assessment. All datasets are from publicly available sources for research purposes only.
+---
+*NCI H9DAI · Data Analytics for Artificial Intelligence · 2026*