docs(readme): document MRI 2D, clinical RAG, EEG stub; bump test count to 330+2

Adds Day-10 (fusion engine) and Day-11 (external assets integration) rows to
the status table, plus three new feature sections:
- 'MRI Deep-Learning Backends' — volumetric_onnx default vs resnet18_2d
(4-class Alzheimer's). Streamlit auto-adapts the form; switch via
MRI_MODEL_KIND env without restarting workers.
- 'Clinical Corpus (TF-IDF, Turkish + English)' — 14-PDF index with
TR->EN query expansion (egzersiz/beslenme/unutkanlik/...). Agent calls
retrieve_context(corpus="clinical"); CLI smoke at scripts/clinical_rag_smoke.py.
- 'EEG Pretrained Classifier (stub-able for demo)' — POST /predict/eeg loads
any sklearn predict_proba joblib. Default stub flows into fusion as the
eeg modality with zero code changes when the real artifact arrives.
Updates Quick Start and Executive Summary test counts to 330+2 and lists
the new env-var demo lifelines (MRI_MODEL_PATH_2D, EEG_CLF_ARTIFACT,
CLINICAL_RAG_INDEX_PATH).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@@ -25,7 +25,7 @@ short_description: Living decision system for BBB, EEG, and MRI clinical ML

**4.** Adapt-Over-Time is built in: each FastAPI worker keeps a rolling 100-prediction window; the trailing median is z-scored against the train-time confidence distribution and surfaced both in the API response and the UI ("trailing-100 confidence median is +1.42σ from training distribution — mild distribution shift").

-**5.** Current verification:

## Status

@@ -40,6 +40,8 @@ short_description: Living decision system for BBB, EEG, and MRI clinical ML
| 7 | Final 5% (Drift, Traceability & Agents) | Per-worker drift z-score + MLflow provenance badge + `POST /explain/bbb` (LLM + template fallback) + AI Assistant tab | Shipped |
| 8 | Grand Finale (Multi-Modal Agents, Track 5 & Public Deploy) | Multi-modal explainers + experiments + deploy surface | Shipped |
| 9 | Agent/RAG hardening + MRI DL decision layer | Guarded orchestration + `POST /predict/mri` ONNX surface | Shipped — 242 passed, 2 skipped |

### Fusion Engine

@@ -54,6 +56,60 @@ heuristic — adjust there. **BBB is intentionally NOT a fusion modality**:
it is a researcher-side concern (drug permeability) and stays decoupled
from disease classification.

## Quick Start

**Prerequisite:** Python 3.10–3.12. The pinned `requirements.txt` has no cp313+ wheels;

@@ -63,7 +119,7 @@ from disease classification.
# 1. Create venv and install
python3.12 -m venv .venv312 && source .venv312/bin/activate && pip install -r requirements.txt

-# 2. Verify — current full suite:
pytest -v

# 3. Smoke run with the bundled 6-row fixture


**4.** Adapt-Over-Time is built in: each FastAPI worker keeps a rolling 100-prediction window; the trailing median is z-scored against the train-time confidence distribution and surfaced both in the API response and the UI ("trailing-100 confidence median is +1.42σ from training distribution — mild distribution shift").

+**5.** Current verification: 330 passed, 2 skipped. Demo lifelines (`NEUROBRIDGE_DISABLE_MLFLOW=1`, `NEUROBRIDGE_DISABLE_LLM=1`, `BBB_MODEL_PATH`, `MRI_MODEL_PATH`, `MRI_MODEL_PATH_2D`, `EEG_CLF_ARTIFACT`, `CLINICAL_RAG_INDEX_PATH`) keep the system usable when MLflow, OpenRouter, or model artifacts are unavailable.

## Status

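
The rolling-window drift check described in item 4 can be sketched roughly as follows. This is an illustrative sketch, not the project's code: the `DriftMonitor` class, its method names, and the choice to z-score the trailing median as `(median - train_mean) / train_std` are assumptions made here for clarity.

```python
from collections import deque
from statistics import median


class DriftMonitor:
    """Rolling-window drift check (hypothetical names, per-process state)."""

    def __init__(self, train_mean: float, train_std: float, window: int = 100):
        if train_std <= 0:
            raise ValueError("train_std must be positive")
        self.train_mean = train_mean
        self.train_std = train_std
        # deque with maxlen drops the oldest confidence automatically
        self.window = deque(maxlen=window)

    def observe(self, confidence: float) -> float:
        """Record one prediction confidence and return the current z-score."""
        self.window.append(confidence)
        # Assumption: z-score of the trailing median against train-time stats
        return (median(self.window) - self.train_mean) / self.train_std


monitor = DriftMonitor(train_mean=0.80, train_std=0.05)
for conf in [0.90] * 100:
    z = monitor.observe(conf)

summary = (
    f"trailing-{len(monitor.window)} confidence median is "
    f"{z:+.2f} sigma from training distribution"
)
print(summary)
```

Because the state lives in the process, each worker would report its own z-score, which is consistent with the "per-worker" wording in the status table.
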
| 7 | Final 5% (Drift, Traceability & Agents) | Per-worker drift z-score + MLflow provenance badge + `POST /explain/bbb` (LLM + template fallback) + AI Assistant tab | Shipped |
| 8 | Grand Finale (Multi-Modal Agents, Track 5 & Public Deploy) | Multi-modal explainers + experiments + deploy surface | Shipped |
| 9 | Agent/RAG hardening + MRI DL decision layer | Guarded orchestration + `POST /predict/mri` ONNX surface | Shipped — 242 passed, 2 skipped |
+| 10 | Multi-modal fusion engine | `POST /fusion/predict` + `run_fusion` agent tool — MRI + EEG + clinical scores → per-disease confidence with attribution | Shipped — 295 passed, 1 skipped |
+| 11 | External assets integration | 2D resnet18 MRI Alzheimer's path · TF-IDF clinical RAG with TR query expansion · stub-able EEG pretrained classifier | Shipped — 330 passed, 2 skipped |

### Fusion Engine

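
The Day-10 row describes fusing MRI, EEG, and clinical scores into a per-disease confidence with attribution. The diff does not show how the engine combines them, so the following is only one plausible shape: a weighted average in which each modality's weighted share doubles as its attribution. The `fuse` function, the weights, and the input layout are all hypothetical.

```python
def fuse(scores: dict, weights: dict) -> dict:
    """Combine per-modality disease scores into one confidence per disease.

    scores maps modality -> {disease: probability}. A modality that has no
    score for a disease is skipped and the remaining weights are renormalised.
    """
    diseases = sorted({d for per_mod in scores.values() for d in per_mod})
    fused = {}
    for disease in diseases:
        contributions = {
            modality: weights[modality] * per_mod[disease]
            for modality, per_mod in scores.items()
            if disease in per_mod
        }
        total_weight = sum(weights[m] for m in contributions)
        fused[disease] = {
            "confidence": sum(contributions.values()) / total_weight,
            # Attribution: each modality's weighted share of the fused score
            "attribution": {m: c / total_weight for m, c in contributions.items()},
        }
    return fused


# Hypothetical weights and scores; BBB is deliberately absent as a modality.
weights = {"mri": 0.5, "eeg": 0.3, "clinical": 0.2}
scores = {
    "mri": {"alzheimers": 0.8},
    "eeg": {"alzheimers": 0.6},
    "clinical": {"alzheimers": 0.4},
}
fused = fuse(scores, weights)
```

With this layout the attribution values for a disease sum to its fused confidence, which makes the per-modality breakdown easy to display next to the score.
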
it is a researcher-side concern (drug permeability) and stays decoupled
from disease classification.

+### MRI Deep-Learning Backends
+
+The MRI prediction route supports two backends, selected via env at request time:
+
+- `MRI_MODEL_KIND=volumetric_onnx` (default). Loads an ONNX volumetric model
+  from `MRI_MODEL_PATH` (default `data/processed/mri_model.onnx`). Input:
+  `.nii` / `.nii.gz`. Two-class output by default (`control`, `abnormal`).
+- `MRI_MODEL_KIND=resnet18_2d`. Loads a PyTorch state_dict from
+  `MRI_MODEL_PATH_2D` (default `data/processed/mri_dl_2d/best_model.pt`).
+  Input: 2D image (`.png` / `.jpg`). 4-class Alzheimer's classifier:
+  `MildDemented`, `ModerateDemented`, `NonDemented`, `VeryMildDemented`.
+  Trainer's BEST_PARAMS bake in: `image_size=160`, ImageNet normalisation,
+  resnet18 backbone with a 4-class head.
+
+The Streamlit `Predict` tab auto-adapts its form to the active backend.
+Switch backends without restarting workers — env is read on each request.
+
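
Reading the backend from the environment on every request, as the section above describes, might look like the sketch below. The environment variable names, default paths, file extensions, and labels are taken from the README text; the `BACKENDS` table and `resolve_mri_backend` function are hypothetical and do not load any model.

```python
import os

# Values quoted from the README section; the table layout is an assumption.
BACKENDS = {
    "volumetric_onnx": {
        "path_env": "MRI_MODEL_PATH",
        "default_path": "data/processed/mri_model.onnx",
        "extensions": (".nii", ".nii.gz"),
        "labels": ("control", "abnormal"),
    },
    "resnet18_2d": {
        "path_env": "MRI_MODEL_PATH_2D",
        "default_path": "data/processed/mri_dl_2d/best_model.pt",
        "extensions": (".png", ".jpg"),
        "labels": (
            "MildDemented",
            "ModerateDemented",
            "NonDemented",
            "VeryMildDemented",
        ),
    },
}


def resolve_mri_backend(environ=None) -> dict:
    """Look up the active backend from the environment on every call."""
    env = os.environ if environ is None else environ
    kind = env.get("MRI_MODEL_KIND", "volumetric_onnx")
    if kind not in BACKENDS:
        raise ValueError(f"unknown MRI_MODEL_KIND: {kind!r}")
    spec = BACKENDS[kind]
    return {
        "kind": kind,
        "model_path": env.get(spec["path_env"], spec["default_path"]),
        "extensions": spec["extensions"],
        "labels": spec["labels"],
    }


default_backend = resolve_mri_backend({})
alzheimers_backend = resolve_mri_backend({"MRI_MODEL_KIND": "resnet18_2d"})
```

Calling the resolver inside the request handler, rather than once at import time, is what would let a changed `MRI_MODEL_KIND` take effect without a worker restart.
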
+### Clinical Corpus (TF-IDF, Turkish + English)
+
+A second RAG index covers 14 peer-reviewed PDFs (Alzheimer's, Parkinson's,
+lifestyle, nutrition, exercise) using TF-IDF + sklearn. Source PDFs at
+`data/external_rag/clinical_pdfs/` (gitignored — copy from the team
+shared drive); pre-built index at `data/external_rag/index/rag_index.pkl`.
+
+Agent invocation:
+
+```python
+retrieve_context(query="egzersiz Alzheimer feedback", corpus="clinical", k=5)
+```
+
+Local CLI smoke:
+
+```bash
+python scripts/clinical_rag_smoke.py "egzersiz Alzheimer feedback"
+```
+
+The Turkish keywords `alzheimer`, `parkinson`, `egzersiz`, `beslenme`,
+`tani`, `tedavi`, `risk`, `unutkanlik`, `titreme`, `demans` auto-expand
+to English equivalents so Turkish queries hit English chunks.
+
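
To illustrate how Turkish query expansion lets a Turkish query match English chunks, here is a small standard-library sketch. It is not the project's retriever: the README says the real index uses sklearn, whereas this version computes TF-IDF by hand, and the `TR_TO_EN` map, `expand_query`, and `retrieve` are hypothetical names with translations assumed here.

```python
import math
from collections import Counter

# Hypothetical TR -> EN map; the project's real expansion table may differ.
TR_TO_EN = {
    "egzersiz": "exercise",
    "beslenme": "nutrition",
    "tani": "diagnosis",
    "tedavi": "treatment",
    "unutkanlik": "forgetfulness",
    "titreme": "tremor",
    "demans": "dementia",
}


def expand_query(query: str) -> list:
    """Tokenise and append the English equivalent of each known Turkish keyword."""
    tokens = query.lower().split()
    return tokens + [TR_TO_EN[t] for t in tokens if t in TR_TO_EN]


def retrieve(query: str, chunks: list, k: int = 5) -> list:
    """Rank chunks by TF-IDF cosine similarity to the expanded query."""
    docs = [Counter(chunk.lower().split()) for chunk in chunks]
    n = len(docs)
    df = Counter(term for doc in docs for term in doc)  # document frequency
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in df}  # smoothed IDF

    def vectorise(counts):
        # Terms unseen in the corpus carry no weight
        return {t: tf * idf[t] for t, tf in counts.items() if t in idf}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    q_vec = vectorise(Counter(expand_query(query)))
    scored = [(cosine(q_vec, vectorise(d)), c) for d, c in zip(docs, chunks)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


chunks = [
    "regular exercise slows cognitive decline in alzheimer patients",
    "dietary nutrition guidelines for parkinson disease",
]
top = retrieve("egzersiz alzheimer", chunks, k=1)
```

Without the expansion step, `egzersiz` would contribute nothing to the query vector, since that token never appears in the English chunks.
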
+### EEG Pretrained Classifier (stub-able for demo)
+
+`POST /predict/eeg` runs an sklearn-style classifier (any `predict_proba`
+interface) on a feature vector and returns probability + attribution. The
+artifact loads from `data/processed/eeg_clf.joblib` (override via
+`EEG_CLF_ARTIFACT`). Default labels are `(control, alzheimers)` — override
+via `EEG_CLF_LABELS=label0,label1,...`.
+
+For the hackathon demo a synthetic stub
+(`tests/fixtures/build_dummy_eeg_clf.py`) is acceptable — drop the real
+`.joblib` at the artifact path to swap in production weights with **zero
+code changes**. The fusion engine consumes this prediction as the `eeg`
+modality automatically.
+
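
The stub-then-swap idea works because the endpoint only depends on the `predict_proba` interface. The sketch below shows that contract with a standard-library stand-in; `DummyEEGClassifier`, `eeg_labels`, `predict_eeg`, and the response shape are hypothetical, and attribution is omitted. In a real deployment the object would presumably come from `joblib.load` on the artifact path instead.

```python
import os


class DummyEEGClassifier:
    """Stand-in exposing the sklearn-style predict_proba interface."""

    def predict_proba(self, rows):
        # Fixed two-class probabilities, one output row per input row
        return [[0.3, 0.7] for _ in rows]


def eeg_labels(environ=None) -> tuple:
    """Read EEG_CLF_LABELS, falling back to the documented default pair."""
    env = os.environ if environ is None else environ
    raw = env.get("EEG_CLF_LABELS", "control,alzheimers")
    return tuple(label.strip() for label in raw.split(",") if label.strip())


def predict_eeg(clf, features, environ=None) -> dict:
    """Score one feature vector and pair each probability with its label."""
    labels = eeg_labels(environ)
    probs = list(clf.predict_proba([features])[0])
    if len(probs) != len(labels):
        raise ValueError("classifier output width does not match EEG_CLF_LABELS")
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": labels[best], "probabilities": dict(zip(labels, probs))}


result = predict_eeg(DummyEEGClassifier(), [0.1, 0.2, 0.3], environ={})
```

Any object with a compatible `predict_proba` can replace the dummy, which is why dropping a real artifact in place would need no changes to the calling code.
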
## Quick Start

**Prerequisite:** Python 3.10–3.12. The pinned `requirements.txt` has no cp313+ wheels;

# 1. Create venv and install
python3.12 -m venv .venv312 && source .venv312/bin/activate && pip install -r requirements.txt

+# 2. Verify — current full suite: 330 passed, 2 skipped
pytest -v

# 3. Smoke run with the bundled 6-row fixture