---
title: NeuroBridge Enterprise
emoji: 🧠
colorFrom: blue
colorTo: indigo
sdk: docker
app_file: src/frontend/app.py
app_port: 7860
pinned: false
license: mit
short_description: Living decision system for BBB, EEG, and MRI clinical ML
---

# NeuroBridge Enterprise

> **Trust-engineered clinical-ML platform for neuroscience labs and health systems.**

## Executive Summary

**1.** Multi-site clinical ML pipelines fail in production because they assume clean data, single-site distributions, and black-box trust, all of which break in real labs. NeuroBridge Enterprise is the *living decision system* that closes those three gaps end-to-end across BBB drug-screening, EEG signal-cleaning, and MRI multi-site harmonization.

**2.** Three production pipelines (RDKit + Morgan, MNE+ICA, neuroHarmonize ComBat) sit behind one FastAPI surface and one Streamlit dashboard, with decision layers on top: a Random Forest BBB classifier today and an MRI image ONNX inference surface ready for an externally-trained volumetric deep-learning model. The agent surface can route a user request to exactly one pipeline tool, retrieve FAISS-backed context, and synthesize a cited answer.

**3.** Robustness is demoed live: a curated edge-case dropdown probes invalid SMILES, OOD molecules, and boundary inputs. The system never crashes and always degrades gracefully (HTTP 400 → recoverable warning, low confidence + lower drift score, calibration caption hedge).

**4.** Adapt-Over-Time is built in: each FastAPI worker keeps a rolling 100-prediction window; the trailing median is z-scored against the train-time confidence distribution and surfaced both in the API response and the UI ("trailing-100 confidence median is +1.42σ from training distribution - mild distribution shift").

**5.** Current verification: 330 passed, 2 skipped.
Demo lifelines (`NEUROBRIDGE_DISABLE_MLFLOW=1`, `NEUROBRIDGE_DISABLE_LLM=1`, `BBB_MODEL_PATH`, `MRI_MODEL_PATH`, `MRI_MODEL_PATH_2D`, `EEG_CLF_ARTIFACT`, `CLINICAL_RAG_INDEX_PATH`) keep the system usable when MLflow, OpenRouter, or model artifacts are unavailable.

## Status

| Day | Modality | Pipeline | Status |
|-----|----------|----------|--------|
| 1 | Tabular (BBB / molecules) | [`bbb_pipeline.py`](src/pipelines/bbb_pipeline.py) | Shipped |
| 2 | Signal (EEG) | [`eeg_pipeline.py`](src/pipelines/eeg_pipeline.py) | Shipped |
| 3 | Image (MRI / fMRI) | [`mri_pipeline.py`](src/pipelines/mri_pipeline.py) | Shipped |
| 4 | API + MLOps + Frontend | FastAPI + MLflow + Streamlit + Docker | Shipped |
| 5 | Decision Layer (Model + XAI + Interactive UI) | [`bbb_model.py`](src/models/bbb_model.py): RandomForest + SHAP + `POST /predict/bbb` | Shipped |
| 6 | Final Polish & Demo Features (Edge cases + Calibration + ComBat viz) | Calibration metadata + edge-case probes + `POST /pipeline/mri/diagnostics` | Shipped |
| 7 | Final 5% (Drift, Traceability & Agents) | Per-worker drift z-score + MLflow provenance badge + `POST /explain/bbb` (LLM + template fallback) + AI Assistant tab | Shipped |
| 8 | Grand Finale (Multi-Modal Agents, Track 5 & Public Deploy) | Multi-modal explainers + experiments + deploy surface | Shipped |
| 9 | Agent/RAG hardening + MRI DL decision layer | Guarded orchestration + `POST /predict/mri` ONNX surface | Shipped (242 passed, 2 skipped) |
| 10 | Multi-modal fusion engine | `POST /fusion/predict` + `run_fusion` agent tool: MRI + EEG + clinical scores → per-disease confidence with attribution | Shipped (295 passed, 1 skipped) |
| 11 | External assets integration | 2D resnet18 MRI Alzheimer's path · TF-IDF clinical RAG with TR query expansion · stub-able EEG pretrained classifier | Shipped (330 passed, 2 skipped) |
| 12 | DCE-MRI BBB bridge + drug-dose adjuster | `POST /predict/bbb_permeability_map` (heuristic_proxy or dce_onnx) + `POST /research/drug_dose_adjustment` + Researcher Streamlit tab + `compute_bbb_leakage_score` & `adjust_drug_dose` agent tools | Shipped |

### Fusion Engine

`POST /fusion/predict` (and the agent tool `run_fusion`) combines whichever of MRI, EEG, and clinical-test scores (MMSE, MoCA, UPDRS, gait, age) the doctor has uploaded into a per-disease confidence (Alzheimer's, Parkinson's, other) with full attribution showing how much each modality contributed. Missing modalities are skipped, not imputed: the engine renormalises onto whichever inputs are present, so absence naturally lowers confidence rather than silently inflating it. Weights live in `src/fusion/weights.py` and are heuristic; adjust them there. **BBB is intentionally NOT a fusion modality**: it is a researcher-side concern (drug permeability) and stays decoupled from disease classification.

### MRI Deep-Learning Backends

The MRI prediction route supports two backends, selected via env at request time:

- `MRI_MODEL_KIND=volumetric_onnx` (default). Loads an ONNX volumetric model from `MRI_MODEL_PATH` (default `data/processed/mri_model.onnx`). Input: `.nii` / `.nii.gz`. Two-class output by default (`control`, `abnormal`).
- `MRI_MODEL_KIND=resnet18_2d`. Loads a PyTorch state_dict from `MRI_MODEL_PATH_2D` (default `data/processed/mri_dl_2d/best_model.pt`). Input: 2D image (`.png` / `.jpg`). 4-class Alzheimer's classifier: `MildDemented`, `ModerateDemented`, `NonDemented`, `VeryMildDemented`. Trainer's BEST_PARAMS bake in: `image_size=160`, ImageNet normalisation, resnet18 backbone with a 4-class head.

The Streamlit `Predict` tab auto-adapts its form to the active backend. Switch backends without restarting workers; the environment is read on each request.

### Clinical Corpus (TF-IDF, Turkish + English)

A second RAG index covers 14 peer-reviewed PDFs (Alzheimer's, Parkinson's, lifestyle, nutrition, exercise) using TF-IDF + sklearn.
Source PDFs live at `data/external_rag/clinical_pdfs/` (gitignored; copy from the team shared drive); the pre-built index is at `data/external_rag/index/rag_index.pkl`.

Agent invocation:

```python
retrieve_context(query="egzersiz Alzheimer feedback", corpus="clinical", k=5)
```

Local CLI smoke:

```bash
python scripts/clinical_rag_smoke.py "egzersiz Alzheimer feedback"
```

The Turkish keywords `alzheimer`, `parkinson`, `egzersiz`, `beslenme`, `tani`, `tedavi`, `risk`, `unutkanlik`, `titreme`, `demans` auto-expand to English equivalents so Turkish queries hit English chunks.

### DCE-MRI BBB Bridge + Drug-Dose Adjuster (Researcher persona)

Clinical fact: Dynamic Contrast-Enhanced (DCE) MRI measures BBB leakage by tracking gadolinium contrast washout. A leaky BBB lets drugs cross into the brain at unsafe levels, so concentrations need revising. This is the **only legitimate place where BBB and MRI couple** in the platform, and only in the Researcher lane. The fusion engine's "BBB is NOT a diagnostic modality" rule is preserved.

**`POST /predict/bbb_permeability_map`** has two modes:

- `heuristic_proxy` (default, demo-ready): reuses the 2D resnet18 Alzheimer's classifier; score = `1 - P(NonDemented)`. Anchored in the published correlation between disease severity and BBB breakdown.
- `dce_onnx` (real DCE artifact, swap-in later): loads an ONNX model trained on 4D DCE-MRI data, emits a Ktrans map normalised to `[0, 1]`. Drop the artifact at `data/processed/bbb_permeability_dce.onnx` (or set `BBB_PERMEABILITY_DCE_PATH`).
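As a rough illustration of the Researcher-lane arithmetic in this section, the sketch below chains the `heuristic_proxy` score (`1 - P(NonDemented)`) into the dose rule tabulated under `POST /research/drug_dose_adjustment`. The function names, argument names, and return keys are hypothetical and are not taken from the repository; only the thresholds, multipliers, and the fixed rationale sentence come from this README.

```python
from typing import Dict, Optional

LEAKY_THRESHOLD = 0.20  # scores below this are treated as an intact BBB


def leakage_score_from_probs(class_probs: Dict[str, float]) -> float:
    """heuristic_proxy score: 1 - P(NonDemented) (hypothetical helper name)."""
    return 1.0 - class_probs["NonDemented"]


def adjust_dose(
    baseline_dose: float,
    bbb_score: float,
    drug_permeable: Optional[bool],
) -> Dict[str, object]:
    """Apply the README dose table; drug_permeable=None means 'unknown'."""
    if not 0.0 <= bbb_score <= 1.0:
        raise ValueError("bbb_score must lie in [0, 1]")

    if bbb_score < LEAKY_THRESHOLD:
        # Intact barrier: keep 100% of baseline regardless of the drug.
        fraction = 1.0
    elif drug_permeable is None or drug_permeable:
        # Unknown permeability is treated as permeable (safer assumption).
        fraction = max(0.30, 1.0 - 0.7 * bbb_score)
    else:
        fraction = max(0.60, 1.0 - 0.4 * bbb_score)

    return {
        "fraction": fraction,
        "revised_dose": baseline_dose * fraction,
        "rationale": "Research suggestion, not medical advice.",
    }


if __name__ == "__main__":
    # Illustrative probabilities only; not real model output.
    score = leakage_score_from_probs({"NonDemented": 0.4, "MildDemented": 0.6})
    print(adjust_dose(baseline_dose=100.0, bbb_score=score, drug_permeable=None))
```

Keeping the rule a pure function of `(baseline, score, permeability)` makes each row of the table easy to unit-test in isolation, which matches the "pure-function logic" framing of the endpoint.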
**`POST /research/drug_dose_adjustment`** is pure-function logic:

| BBB score | Drug BBB-permeable | Recommended dose |
|---|---|---|
| < 0.20 (intact) | any | 100% of baseline (low risk) |
| ≥ 0.20 (leaky) | yes | `max(30%, 1 − 0.7·score)` of baseline (moderate / high risk) |
| ≥ 0.20 (leaky) | no | `max(60%, 1 − 0.4·score)` of baseline (moderate risk) |
| ≥ 0.20 (leaky) | unknown | treated as permeable (safer assumption) |

When `smiles` is supplied, the BBB classifier auto-resolves the drug's permeability, which closes the researcher loop end-to-end. The rationale always includes the sentence "Research suggestion, not medical advice."

The Streamlit `Researcher` tab combines both into a single 2-column flow: the left side picks an MRI image and runs the leakage scorer; the right side takes a SMILES + baseline dose and computes a revised dose with risk badge and rationale card.

Agent tools (orchestrator-callable):

- `compute_bbb_leakage_score` wraps `/predict/bbb_permeability_map`.
- `adjust_drug_dose` wraps `/research/drug_dose_adjustment`.

### EEG Pretrained Classifier (stub-able for demo)

`POST /predict/eeg` runs an sklearn-style classifier (any `predict_proba` interface) on a feature vector and returns probability + attribution. The artifact loads from `data/processed/eeg_clf.joblib` (override via `EEG_CLF_ARTIFACT`). Default labels are `(control, alzheimers)`; override via `EEG_CLF_LABELS=label0,label1,...`.

For the hackathon demo a synthetic stub (`tests/fixtures/build_dummy_eeg_clf.py`) is acceptable. Drop the real `.joblib` at the artifact path to swap in production weights with **zero code changes**. The fusion engine consumes this prediction as the `eeg` modality automatically.

## Quick Start

**Prerequisite:** Python 3.10–3.12. The pinned `requirements.txt` has no cp313+ wheels; `.python-version` pins to 3.12.

```bash
# 1. Create venv and install
python3.12 -m venv .venv312 && source .venv312/bin/activate && pip install -r requirements.txt

# 2. Verify (current full suite: 330 passed, 2 skipped)
pytest -v

# 3. Smoke run with the bundled 6-row fixture
mkdir -p data/raw && cp tests/fixtures/bbbp_sample.csv data/raw/bbbp.csv
python -m src.pipelines.bbb_pipeline

# 4. Inspect the output at data/processed/bbbp_features.parquet
python -c "import pandas as pd; df = pd.read_parquet('data/processed/bbbp_features.parquet'); print(df.shape, df.dtypes.head())"
```

Result lives at `data/processed/bbbp_features.parquet`.

```bash
# Smoke-test the EEG pipeline with the bundled fixture (5 ch synthetic .fif)
mkdir -p data/raw
cp tests/fixtures/eeg_sample.fif data/raw/eeg.fif
python -m src.pipelines.eeg_pipeline
```

Result lives at `data/processed/eeg_features.parquet`.

```bash
# Smoke-test the MRI pipeline with the bundled fixture (6 subjects × 2 sites)
mkdir -p data/raw/mri
cp tests/fixtures/mri_sample/* data/raw/mri/
python -m src.pipelines.mri_pipeline
```

Result lives at `data/processed/mri_features.parquet` (48 ROI features per subject, ComBat-harmonized across sites).

> **Real BBBP data:** not bundled (gitignored). Download from
> [Kaggle](https://www.kaggle.com/datasets/priyanagda/bbbp) or
> [MoleculeNet](https://moleculenet.org/datasets-1); place as `data/raw/bbbp.csv`.

### Train the downstream BBB model (one-time)

```bash
python -m src.pipelines.bbb_pipeline   # produces data/processed/bbbp_features.parquet
python -m src.models.bbb_model         # produces data/processed/bbb_model.joblib
```

Then `POST /predict/bbb` (and the Streamlit BBB tab) become live. Try:

```bash
curl -s -X POST http://localhost:8000/predict/bbb \
  -H 'Content-Type: application/json' \
  -d '{"smiles": "CCO", "top_k": 5}' | python3 -m json.tool
```

### Add the MRI image deep-learning model

MRI deep-learning training happens outside this repository.
Export the trained volumetric model to ONNX and place it at:

```text
data/processed/mri_model.onnx
```

The runtime contract is:

- Input file: one `.nii` / `.nii.gz` MRI volume.
- Preprocess: trilinear resize to `target_shape` (default `[64, 64, 64]`), z-score normalization over non-zero voxels, then tensor shape `[1, 1, D, H, W]`.
- ONNX output: one class vector `[1, C]`, either logits or probabilities.
- Override artifact path with `MRI_MODEL_PATH=/path/to/model.onnx`.

Try the endpoint after adding the artifact:

```bash
curl -s -X POST http://localhost:8000/predict/mri \
  -H 'Content-Type: application/json' \
  -d '{
    "input_path": "tests/fixtures/mri_sample/subject_0.nii.gz",
    "target_shape": [64, 64, 64],
    "label_names": ["control", "abnormal"]
  }' | python3 -m json.tool
```

If the ONNX artifact is missing, the endpoint returns HTTP 503 with a remediation hint instead of crashing.

### Run the full stack with Docker

```bash
docker compose up
```

Then browse to:

- **FastAPI Swagger**
- **Streamlit dashboard**: `streamlit run src/frontend/app.py` (port 8501; not in compose by default)
- **MLflow UI**

Live-demo robustness: if the MLflow service is unreachable, set `NEUROBRIDGE_DISABLE_MLFLOW=1` to make the pipelines run without tracking. The container startup script also protects local demos with a mounted `./data` directory: if the host volume is empty, it seeds fixture data, trains the BBB model artifact, and builds the RAG FAISS index before launching the app.

## Runtime Configuration

| Variable | Purpose |
|---|---|
| `BBB_MODEL_PATH` | Override the BBB joblib artifact path (`data/processed/bbb_model.joblib`). |
| `MRI_MODEL_PATH` | Override the MRI ONNX artifact path (`data/processed/mri_model.onnx`). |
| `OPENROUTER_API_KEY` | Enables LLM explainer and orchestrator agent calls through OpenRouter. |
| `OPENROUTER_FREE_MODELS` | Optional comma-separated fallback chain for the explainer. |
| `NEUROBRIDGE_AGENT_MODEL` | OpenRouter model id for `/agent/run`. |
| `NEUROBRIDGE_DISABLE_LLM=1` | Forces deterministic template explanations. |
| `NEUROBRIDGE_DISABLE_MLFLOW=1` | Skips MLflow tracking/lookups when the tracking service is unavailable. |

## Repository Layout

```text
.
├── AGENTS.md                # Project contract (vision, layout, code & data rules); read first
├── README.md                # this file
├── requirements.txt         # Pinned deps; Python 3.10–3.12 only
├── .python-version          # 3.12
├── pytest.ini
├── data/
│   ├── raw/                 # vendor inputs (CSV / EDF / NIfTI); gitignored
│   └── processed/           # Parquet outputs from pipelines; gitignored
├── docs/superpowers/plans/  # Per-day implementation plans
├── src/
│   ├── core/                # logger, deterministic storage, MLflow tracking
│   ├── pipelines/
│   │   ├── bbb_pipeline.py  # Day-1 pipeline (4 public funcs + CLI entry)
│   │   ├── eeg_pipeline.py  # Day-2 pipeline (6 public funcs + CLI entry)
│   │   └── mri_pipeline.py  # Day-3 pipeline (5 public funcs + CLI entry)
│   ├── models/
│   │   ├── bbb_model.py     # RandomForest BBB classifier + SHAP
│   │   └── mri_model.py     # External ONNX MRI inference surface
│   ├── rag/                 # fastembed + FAISS ingest/retrieve layer
│   ├── agents/              # OpenRouter orchestrator + guarded routing + tools
│   ├── llm/                 # LLM/template explanation surface
│   ├── api/                 # FastAPI routes + schemas
│   └── frontend/            # Streamlit dashboard
└── tests/
    ├── core/, pipelines/, models/, rag/, agents/
    └── fixtures/            # bbbp_sample.csv, eeg_sample.fif, mri_sample/ + build_*.py
```

## BBB Pipeline (Day 1)

| Function | Purpose |
|----------|---------|
| `is_valid_smiles(smiles)` | Returns `True` iff the input is a non-empty SMILES that RDKit can parse. Handles `None`, `NaN`, and garbage strings. |
| `compute_morgan_fingerprint(smiles, n_bits, radius)` | Returns a `(n_bits,)` `uint8` numpy array using the modern `MorganGenerator` API. |
| `extract_features_from_dataframe(df, smiles_col, n_bits, radius)` | Drops invalid rows (logged WARNING with truncated index list), expands fingerprints into `fp_0..fp_{n-1}` columns, preserves metadata. Returns a model-ready `pd.DataFrame`. |
| `run_pipeline(input_path, output_path, smiles_col, n_bits, radius)` | End-to-end CSV → Parquet orchestrator. Idempotent; raises on missing input or directory output. |

All four functions log via `src.core.logger.get_logger(__name__)` per AGENTS.md §3 and satisfy the §4 Data Readiness contract (5 invariants: schema validity, domain validity, determinism, traceability, idempotence).

## EEG Pipeline (Day 2)

| Function | Purpose |
|---|---|
| `is_valid_epoch(epoch)` | Returns True iff the input is a finite, numeric, non-empty 2-D array. Rejects NaN/inf, non-numeric dtypes, lists/scalars. |
| `bandpass_filter(raw, l_freq, h_freq)` | Non-mutating MNE bandpass (default 1–40 Hz). Raises ValueError on inverted frequency range. |
| `remove_artifacts_with_ica(raw, eog_ch_name, n_components, random_state)` | Seeded ICA + correlation-based EOG component rejection. Skips gracefully (no-op + WARNING) on missing/typo EOG channel or NaN-contaminated data. |
| `compute_features_from_epoch(epoch, sfreq)` | Per-channel PSD bands (delta/theta/alpha/beta/gamma) + 5 statistical moments (mean/std/var/skew/kurtosis). Constant-channel safe (NaN-cleaned). |
| `extract_features_from_recording(raw, epoch_duration_s, eog_ch_name, n_components, random_state)` | Chains filter → ICA → epoching → feature extraction. Drops invalid epochs (logged WARNING with truncated index list). Returns 2-D `pd.DataFrame` with deterministic `feat__psd_` and `feat__` columns. |
| `run_pipeline(input_path, output_path, ...)` | End-to-end FIF/EDF → Parquet orchestrator. Idempotent; raises on missing input or directory output. |

The pipeline is seeded (`random_state=97`) and produces byte-identical Parquet output for the same input, satisfying the §4 Determinism contract. Output is float64, preserved through the Parquet round-trip.

## MRI Pipeline (Day 3)

| Function | Purpose |
|---|---|
| `is_valid_volume(volume)` | Returns True iff input is a finite, numeric, non-empty 3-D ndarray. Rejects NaN/inf, non-numeric dtypes, lists/scalars. |
| `mask_brain(volume, intensity_threshold)` | Two-step brain mask: intensity threshold (default = volume mean) + 6-connectivity morphological opening to drop isolated noise voxels. WARNs if mask is empty. |
| `extract_features_from_volume(volume, mask, n_roi_axes)` | Partitions the masked volume into `prod(n_roi_axes)` axis-aligned octants (default 2×2×2 = 8) and emits 6 stats per ROI: mean / std / p10 / p50 / p90 / voxel_count. Empty ROIs → 0.0 (no NaN). Single source of truth via `_ROI_STATS_FUNCS`. |
| `harmonize_combat(features, sites, feature_cols)` | Wraps `neuroHarmonize.harmonizationLearn` with `np.round(14)` defensive determinism boundary. Removes site-level domain shift on the named columns. Raises if <2 sites or empty `feature_cols` or row/site length mismatch. |
| `run_pipeline(input_dir, sites_csv, output_path, ...)` | End-to-end NIfTI directory → ComBat-harmonized Parquet orchestrator. Drops invalid volumes with logged WARNING. Splits feature columns on a `_MIN_VAR_THRESHOLD = 1e-8` variance floor (constant columns bypass ComBat to avoid NaN). Idempotent; raises on missing input or directory output. |

Output schema: one row per surviving subject with columns `subject_id, site, feat_roi{i}_` (8 ROIs × 6 stats = 48 features). All `feat_*` are float64 (preserved through the Parquet round-trip).

## MRI Image Model

`src/models/mri_model.py` is intentionally separate from `mri_pipeline.py`. The pipeline remains the deterministic ComBat feature-preparation surface.
The image model is a decision layer for externally-trained volumetric DL models:

| Function | Purpose |
|---|---|
| `load(path)` | Loads an ONNX artifact with `onnxruntime` CPU execution. |
| `load_nifti_volume(path)` | Reads one `.nii` / `.nii.gz` volume as `float32`. |
| `preprocess_volume(volume, target_shape)` | Validates 3-D finite data, resizes, z-scores, returns `[1, 1, D, H, W]`. |
| `predict_nifti(model, input_path, target_shape, label_names)` | Runs preprocessing + ONNX inference and returns label, confidence, probabilities. |

Public API: `POST /predict/mri`. Streamlit exposes it in the Image tab under "MRI Image Model". The trained artifact is not committed; put it in `data/processed/mri_model.onnx` or set `MRI_MODEL_PATH`.

## Storage Format

Pipeline outputs are written as Parquet files using the `pyarrow` engine with snappy compression. This preserves dtypes (`uint8` fingerprint columns stay `uint8` instead of widening to `int64` as CSV would do) and yields ~10× smaller files than CSV, which matters for the `float64` EEG features Day 2 produces. See AGENTS.md §6.

## Testing & TDD

All pipeline functions and the shared logger were built TDD-first across Days 1–3 (RED → GREEN → REFACTOR). Each task ended in a green commit; review-and-fix loops landed as separate commits with `fix:` / `refactor:` prefixes.

Run `pytest -v` at any time. The Day 9 verification on Windows/Python 3.11 reported `242 passed, 2 skipped`; the current full suite reports `330 passed, 2 skipped`.

## Roadmap

- **Day 2 (shipped):** `eeg_pipeline.py`: bandpass + MNE ICA artifact removal + PSD + statistical features → Parquet.
- **Day 3 (shipped):** `mri_pipeline.py`: NIfTI volume loading, brain masking, ROI feature extraction, ComBat harmonization (`neuroHarmonize`) for site-level domain shift → Parquet.
- **Day 4 (shipped):** FastAPI surface in `src/api/` (POST `/pipeline/{bbb,eeg,mri}` + `/health`), MLflow experiment tracking via `src.core.tracking` (see AGENTS.md §7), Streamlit dashboard at `src/frontend/app.py`, and Docker / `docker-compose.yml` for the api + MLflow stack.
- **Day 5 (shipped):** Decision layer in `src/models/bbb_model.py`: RandomForest BBB classifier on Morgan fingerprints, SHAP top-k explanations, `POST /predict/bbb` endpoint, interactive Streamlit BBB tab with SMILES input + decision card + SHAP bar chart, and trainer CLI (`python -m src.models.bbb_model`). See AGENTS.md §8.
- **Day 6 (shipped):** Final polish & demo features: calibration metadata bins on the BBB classifier (precision-at-confidence in `BBBPredictResponse.calibration`), edge-case dropdown in the Streamlit BBB tab (5 curated robustness probes), trust caption on the decision card, and `POST /pipeline/mri/diagnostics` returning Pre/Post ComBat long-format data + site-gap KPIs visualized as a faceted altair KDE in the MRI tab. See AGENTS.md §8 (calibration) + §9 (demo features).
- **Post-Day-8 hardening (shipped):** Orchestrator workflow guard enforces pipeline → RAG → synthesis even when the LLM skips tool calls; Docker startup guard rebuilds missing demo artifacts behind a mounted `data/`; Windows-safe MLflow test URI; MRI ONNX image decision layer at `POST /predict/mri` (242 passed, 2 skipped).
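The Day 6 entry above mentions precision-at-confidence calibration bins. One plausible reading, sketched here under stated assumptions, is "among held-out predictions with confidence at or above a threshold, what fraction were correct, and how many were there". The function name, thresholds, and output keys are illustrative guesses and do not describe the actual `BBBPredictResponse.calibration` schema.

```python
from typing import Dict, List, Optional, Sequence


def precision_at_confidence(
    confidences: Sequence[float],
    correct: Sequence[bool],
    thresholds: Sequence[float] = (0.5, 0.6, 0.7, 0.8, 0.9),
) -> List[Dict[str, Optional[float]]]:
    """For each threshold, report n and precision over predictions >= threshold.

    Hypothetical helper: `correct[i]` is True when held-out prediction i
    matched its ground-truth label.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")

    bins: List[Dict[str, Optional[float]]] = []
    for threshold in thresholds:
        hits = [ok for conf, ok in zip(confidences, correct) if conf >= threshold]
        n = len(hits)
        # No predictions at this confidence level: precision is undefined.
        precision = (sum(hits) / n) if n else None
        bins.append({"threshold": threshold, "n": n, "precision": precision})
    return bins


if __name__ == "__main__":
    # Toy held-out set, purely illustrative.
    held_out_conf = [0.95, 0.85, 0.82, 0.60]
    held_out_ok = [True, True, False, True]
    for row in precision_at_confidence(held_out_conf, held_out_ok):
        print(row)
```

Reporting `n` next to each precision value lets a UI caption say how much evidence backs the claim, in the spirit of the "n=18" receipt shown in the demo script.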
## Where to Look

- **Project rules (mandatory reading for any agent):** [`AGENTS.md`](AGENTS.md)
- **Day-1 plan (full TDD task breakdown):** [`docs/superpowers/plans/2026-04-29-neurobridge-day1-bootstrap-bbb-pipeline.md`](docs/superpowers/plans/2026-04-29-neurobridge-day1-bootstrap-bbb-pipeline.md)
- **Day-2 plan (full TDD task breakdown):** [`docs/superpowers/plans/2026-04-30-day2-eeg-mne-ica-pipeline.md`](docs/superpowers/plans/2026-04-30-day2-eeg-mne-ica-pipeline.md)
- **Logger contract:** [`src/core/logger.py`](src/core/logger.py) + [`tests/core/test_logger.py`](tests/core/test_logger.py)
- **BBB pipeline:** [`src/pipelines/bbb_pipeline.py`](src/pipelines/bbb_pipeline.py) + [`tests/pipelines/test_bbb_pipeline.py`](tests/pipelines/test_bbb_pipeline.py)
- **EEG pipeline:** [`src/pipelines/eeg_pipeline.py`](src/pipelines/eeg_pipeline.py) + [`tests/pipelines/test_eeg_pipeline.py`](tests/pipelines/test_eeg_pipeline.py)
- **Day-3 plan (full TDD task breakdown):** [`docs/superpowers/plans/2026-05-01-day3-mri-combat-pipeline.md`](docs/superpowers/plans/2026-05-01-day3-mri-combat-pipeline.md)
- **MRI pipeline:** [`src/pipelines/mri_pipeline.py`](src/pipelines/mri_pipeline.py) + [`tests/pipelines/test_mri_pipeline.py`](tests/pipelines/test_mri_pipeline.py)
- **Day-4 plan (full TDD task breakdown):** [`docs/superpowers/plans/2026-05-02-day4-api-mlops-frontend.md`](docs/superpowers/plans/2026-05-02-day4-api-mlops-frontend.md)
- **Shared core helpers:** [`src/core/determinism.py`](src/core/determinism.py), [`src/core/storage.py`](src/core/storage.py), [`src/core/tracking.py`](src/core/tracking.py)
- **FastAPI surface:** [`src/api/main.py`](src/api/main.py), [`src/api/routes.py`](src/api/routes.py), [`src/api/schemas.py`](src/api/schemas.py)
- **Streamlit dashboard:** [`src/frontend/app.py`](src/frontend/app.py)
- **Container stack:** [`Dockerfile`](Dockerfile), [`docker-compose.yml`](docker-compose.yml)
- **Day-4 tests:** [`tests/api/`](tests/api/),
[`tests/frontend/`](tests/frontend/), [`tests/pipelines/test_cross_pipeline_smoke.py`](tests/pipelines/test_cross_pipeline_smoke.py)
- **Day-5 plan (full TDD task breakdown):** [`docs/superpowers/plans/2026-05-03-day5-downstream-model-xai-interactive.md`](docs/superpowers/plans/2026-05-03-day5-downstream-model-xai-interactive.md)
- **BBB downstream model (classifier + SHAP explainer + trainer CLI):** [`src/models/bbb_model.py`](src/models/bbb_model.py) + [`tests/models/test_bbb_model.py`](tests/models/test_bbb_model.py)
- **MRI image DL decision layer:** [`src/models/mri_model.py`](src/models/mri_model.py) + [`tests/models/test_mri_model.py`](tests/models/test_mri_model.py); `POST /predict/mri` consumes an externally-trained ONNX artifact at `data/processed/mri_model.onnx` (`MRI_MODEL_PATH` override).
- **Day-6 plan (full TDD task breakdown):** [`docs/superpowers/plans/2026-05-04-day6-final-polish-demo-features.md`](docs/superpowers/plans/2026-05-04-day6-final-polish-demo-features.md)
- **MRI ComBat diagnostics surface (pre/post site-gap KPIs):** `POST /pipeline/mri/diagnostics`; see [`src/api/routes.py`](src/api/routes.py) + [`src/pipelines/mri_pipeline.py`](src/pipelines/mri_pipeline.py)
- **Day-7 design spec:** [`docs/superpowers/specs/2026-05-05-day7-drift-traceability-agents-design.md`](docs/superpowers/specs/2026-05-05-day7-drift-traceability-agents-design.md)
- **Day-7 plan (full TDD task breakdown):** [`docs/superpowers/plans/2026-05-05-day7-drift-traceability-agents.md`](docs/superpowers/plans/2026-05-05-day7-drift-traceability-agents.md)
- **New surface:** `POST /explain/bbb`: natural-language rationale (LLM + deterministic fallback)
- **New surface:** `drift_z` / `rolling_n` / `provenance` fields in `POST /predict/bbb` response
- **Day-8 plan (full TDD task breakdown):** [`docs/superpowers/plans/2026-05-06-day8-grand-finale.md`](docs/superpowers/plans/2026-05-06-day8-grand-finale.md)
- **New surfaces:** `POST /explain/eeg`, `POST /explain/mri`,
`GET /experiments/runs`, `POST /experiments/diff`
- **New deploy artifacts:** `Dockerfile.hf`, `supervisord.conf`
- **LLM hardening (post-Day 8):** real OpenRouter LLM is now the default in deployed Spaces; `Dockerfile`/`Dockerfile.hf` no longer hard-code `NEUROBRIDGE_DISABLE_LLM=1`. Free-tier fallback chain (10 models, smartest → smallest) in [`src/llm/explainer.py`](src/llm/explainer.py), 401/400 status classification, and language-matching / intent-split prompt. Diagnostic endpoint `GET /diag/openrouter` ([`src/api/main.py`](src/api/main.py)) + Streamlit sidebar "🔧 Diagnose LLM" button. Live verification helper: [`scripts/diagnose_openrouter.py`](scripts/diagnose_openrouter.py).
- **Orchestrator agent (Task 13):** [`src/agents/orchestrator.py`](src/agents/orchestrator.py), [`src/agents/routing.py`](src/agents/routing.py), [`src/agents/tools.py`](src/agents/tools.py), [`src/agents/prompts.py`](src/agents/prompts.py). Guarded workflow enforces one pipeline tool, then `retrieve_context`, then final synthesis.
- **RAG layer:** [`src/rag/`](src/rag/): chunker, embedder (fastembed), FAISS store, retriever, ingest CLI
- **Agent endpoint:** `POST /agent/run` (orchestrator + RAG); diagnostic at `GET /diag/agent`
- **Streamlit Agent tab:** "🤖 Agent" tab in [`src/frontend/app.py`](src/frontend/app.py): input box + optional MRI `sites_csv` + decision-trace expander.
- **RAG knowledge base:** drop `.md`/`.pdf` into [`data/knowledge_base/`](data/knowledge_base/); see its README

## Day 7: Demo Recipe

Pre-flight (one terminal):

```bash
# Start API. With OPENROUTER_API_KEY set in your shell or .env,
# /explain/* hits the real LLM via the free-tier fallback chain
# (10 models, smartest → smallest; see AGENTS.md §11). Without
# a key, falls back to the deterministic template.
BBB_MODEL_PATH=data/processed/bbb_model.joblib \
  uvicorn src.api.main:app --port 8000

# Force the deterministic template path (no network, fully reproducible):
#   NEUROBRIDGE_DISABLE_LLM=1 BBB_MODEL_PATH=... uvicorn ...
```

Predict + explain (other terminal):

```bash
# 1) Predict: body now carries drift_z, rolling_n, provenance
curl -s -X POST http://localhost:8000/predict/bbb \
  -H "Content-Type: application/json" \
  -d '{"smiles": "CCO", "top_k": 5}' | jq

# 2) Explain: feed the predict response back as the explain payload.
#    user_question drives the prompt: question language is mirrored
#    (Turkish question → Turkish answer), and the model answers the
#    question directly instead of returning a canned paper summary.
curl -s -X POST http://localhost:8000/explain/bbb \
  -H "Content-Type: application/json" \
  -d '{
    "smiles": "CCO",
    "label": 1,
    "label_text": "permeable",
    "confidence": 0.82,
    "top_features": [
      {"feature": "fp_341", "shap_value": 0.045},
      {"feature": "fp_902", "shap_value": -0.031}
    ],
    "drift_z": 0.42,
    "user_question": "Why permeable?"
  }' | jq
# With a valid key: expect "source": "llm" + a model id from the chain.
# Without: expect "source": "template" + "model": null.

# 3) Diagnose OpenRouter reachability from inside the running API
#    (key presence, chain head, 8-token probe). Surfaced in Streamlit
#    as the sidebar "🔧 Diagnose LLM" button.
curl -s http://localhost:8000/diag/openrouter | jq
```

Streamlit demo: `streamlit run src/frontend/app.py` → BBB tab → Predict → AI Assistant tab → ask a preset question.

Drift demo: refresh the BBB tab and predict 10+ times in a row; the drift caption transitions from "warming up" to a numeric z-score.

## Demo Scripts

### 90-Second Jury Tour

Choreography for the live demo. Click order matters; every claim has a numeric receipt visible on screen.
| t | Tab | Action | Talking point |
|---|---|---|---|
| 0:00 | (open) | `streamlit run src/frontend/app.py` already launched | "This is NeuroBridge Enterprise: three modalities behind one decision system." |
| 0:05 | **BBB** | Pick "Custom input" → enter `CCO` → click Predict | Show label + 82% confidence progress bar. |
| 0:15 | (same) | Read calibration caption | "Predictions ≥80% confident are correct 92% of the time on held-out data (n=18)." |
| 0:22 | (same) | Read drift caption | "Trailing-100 confidence median is +0.42σ from train, within expected range." |
| 0:30 | (same) | Read provenance badge | "MLflow run `abc123`, Model v1, n=1640 examples: full audit trail." |
| 0:35 | (same) | Switch to "Massive OOD: cyclosporine-like macrocycle" → Predict | "Cyclosporine has 11 residues, ~1.2 kDa, way outside training distribution." |
| 0:45 | (same) | Read confidence + drift | "System knows what it doesn't know: confidence drops, drift signal flags it." |
| 0:55 | **AI Assistant** | Pick preset "Why was this molecule predicted as permeable?" → Ask | "LLM rationale uses SHAP attributions + drift context, with an auditable source label." |
| 1:10 | **MRI** | Click "Run ComBat diagnostics" | Show 3-metric strip: Pre 5.0 → Post 0.0015 → 3290× reduction. |
| 1:20 | (same) | Point to faceted KDE | "Each color is a hospital. Pre-ComBat panels diverge; Post panels converge." |
| 1:30 | **Experiments** | Switch tabs, show MLflow runs table | "Every train run is logged; pick any two for a metric/param diff." |

### 30-Second Drift Detection Show

Standalone demo of the "Adapt Over Time" capability.

| t | Action | What jury sees |
|---|---|---|
| 0:00 | Open BBB tab. | Drift caption shows "warming up (0/10 predictions buffered)". |
| 0:05 | Hit Predict 10× rapidly with the same SMILES (`CCO`). | After predict #10, drift caption switches to a numeric z-score. |
| 0:18 | Switch to "Cyclosporine OOD" → predict 3× more. | Drift z-score rises in magnitude; if `\|z\|≥1`, caption shows "mild distribution shift"; if `\|z\|≥2`, "significant shift, retrain recommended". |
| 0:30 | Conclude. | "The system is online-aware: it doesn't just predict, it tells you when its own predictions are drifting from the world it was trained on." |
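
The drift behaviour shown in this demo (rolling window of 100 confidences, a 10-prediction warm-up, and captions at `|z| ≥ 1` and `|z| ≥ 2`) can be sketched roughly as follows. This is an illustrative reconstruction, not the project's code: the class name, constructor arguments, and the exact z-score formula (trailing median minus train-time mean, divided by train-time standard deviation) are assumptions; only `drift_z`, `rolling_n`, the window sizes, and the caption thresholds come from this README.

```python
from collections import deque
from statistics import median
from typing import Deque, Dict, Optional, Union


class DriftMonitor:
    """Hypothetical per-worker rolling drift tracker (names are illustrative)."""

    def __init__(self, train_mean: float, train_std: float,
                 window: int = 100, min_samples: int = 10) -> None:
        if train_std <= 0:
            raise ValueError("train_std must be positive")
        self.train_mean = train_mean
        self.train_std = train_std
        self.min_samples = min_samples
        # deque(maxlen=...) silently evicts the oldest confidence when full.
        self._buffer: Deque[float] = deque(maxlen=window)

    def update(self, confidence: float) -> Dict[str, Union[int, float, str, None]]:
        """Record one prediction confidence and return the drift summary."""
        self._buffer.append(confidence)
        rolling_n = len(self._buffer)

        if rolling_n < self.min_samples:
            return {"rolling_n": rolling_n, "drift_z": None,
                    "status": "warming up"}

        # Assumed formula: z-score of the trailing median against train stats.
        drift_z: Optional[float] = (
            (median(self._buffer) - self.train_mean) / self.train_std
        )
        if abs(drift_z) >= 2:
            status = "significant shift, retrain recommended"
        elif abs(drift_z) >= 1:
            status = "mild distribution shift"
        else:
            status = "within expected range"
        return {"rolling_n": rolling_n, "drift_z": drift_z, "status": status}


if __name__ == "__main__":
    monitor = DriftMonitor(train_mean=0.80, train_std=0.10)
    for _ in range(10):
        summary = monitor.update(0.82)   # in-distribution confidences
    print(summary["status"])
    for _ in range(11):
        summary = monitor.update(0.50)   # low-confidence OOD predictions
    print(summary["status"])
```

Using the median rather than the mean keeps a handful of outlier predictions from flipping the caption, which is why the demo needs several out-of-distribution predictions before the status changes.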