mekosotto Claude Opus 4.7 (1M context) committed on
Commit 270f76f · 1 Parent(s): ac781dd

docs(plan): external assets integration (MRI DL 2D + TF-IDF RAG + OASIS tabular)


Roadmap + three sub-plans for integrating user-supplied external assets:

1. MRI DL 2D — pretrained resnet18 4-class Alzheimer's classifier from
user's training run (BEST_PARAMS: image_size=160, lr=3.75e-4,
weight_decay=1.96e-4, dropout=0.31). Adds src/models/mri_dl_2d.py
parallel to volumetric ONNX path, dispatched via MRI_MODEL_KIND env.
2. TF-IDF clinical RAG — 14 medical PDFs (Alzheimer/Parkinson/lifestyle)
with Turkish+English query expansion. Wraps user's pre-built sklearn
TF-IDF index as src/rag/clinical/. Existing FAISS RAG kept.
3. OASIS tabular classifier — sklearn RF on OASIS longitudinal biomarkers
(MMSE/eTIV/nWBV/ASF/...). NOTE: user described notebook as 'EEG model'
but it is OASIS tabular. Plan flags this prominently with branch 3a
(default: integrate as fusion modality) vs 3b (await real EEG model).

All three plans flag prerequisite blockers (artifact transfer, dataset
acquisition) and preserve independence guarantees from the clinical
platform roadmap. Each ends with subagent-driven-development handoff.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

docs/superpowers/plans/2026-05-02-external-assets-integration-roadmap.md ADDED
@@ -0,0 +1,98 @@
1
+ # External Assets Integration — Roadmap
2
+
3
+ > **For agentic workers:** Index of three sub-plans for integrating the user's external assets. Each sub-plan executes via `superpowers:subagent-driven-development` (recommended) or `superpowers:executing-plans`.
4
+
5
+ **Vision.** The user has supplied three external assets that should replace or extend our current placeholders:
6
+
7
+ | Asset | What it is | Replaces / extends |
8
+ |---|---|---|
9
+ | Pretrained MRI 2D classifier | PyTorch resnet18 trained on Kaggle's 4-class Alzheimer's MRI dataset (`MildDemented` / `ModerateDemented` / `NonDemented` / `VeryMildDemented`) | The dummy ONNX model in `tests/fixtures/build_dummy_mri_onnx.py`; the placeholder behaviour in `src/models/mri_model.py` |
10
+ | TF-IDF RAG corpus | 14 medical PDFs (Alzheimer + Parkinson + lifestyle/nutrition/exercise) with a pre-built TF-IDF index and Turkish query expansion | The existing FAISS+fastembed RAG in `src/rag/` (or runs alongside it) |
11
+ | OASIS tabular classifier (ipynb) | sklearn ensemble on OASIS longitudinal biomarkers (MMSE, eTIV, nWBV, ASF, …) | **Not an EEG model** — see sub-plan #3 for two routing options |
12
+
13
+ ---
14
+
15
+ ## Sub-projects
16
+
17
+ | # | Sub-plan file | Owner concern | Depends on | Demo on its own? |
18
+ |---|---|---|---|---|
19
+ | 1 | `2026-05-02-mri-dl-2d-integration.md` | Real MRI deep-learning model in production path | — (parallel to fusion) | yes (Streamlit + curl) |
20
+ | 2 | `2026-05-02-tfidf-rag-integration.md` | Lifestyle / clinical-paper RAG with Turkish support | — | yes (CLI + agent tool) |
21
+ | 3 | `2026-05-02-oasis-tabular-fusion-integration.md` | Tabular OASIS classifier as a fusion-engine feature **OR** wait for a real EEG model | fusion engine (#1 of clinical-platform-roadmap) | yes (POST /fusion/predict) |
22
+
23
+ ---
24
+
25
+ ## Build order
26
+
27
+ ```
28
+ ┌────────────────────────────┐ ┌────────────────────────────┐
29
+ │ #1 MRI DL 2D integration │ │ #2 TF-IDF RAG integration │
30
+ │ (independent) │ │ (independent) │
31
+ └─────────────┬──────────────┘ └────────────┬───────────────┘
32
+ │ │
33
+ └──────────────┬───────────────────┘
34
+
35
+ ┌──────▼─────────┐
36
+ │ #3 OASIS │
37
+ │ classifier as │
38
+ │ fusion feature │
39
+ └────────────────┘
40
+ ```
41
+
42
+ #1 and #2 can be built in parallel (different files). #3 should follow once both are stable so the demo flows end-to-end.
43
+
44
+ ---
45
+
46
+ ## Open prerequisites (user must resolve)
47
+
48
+ These are **not** dev gaps — they are inputs we need from outside this codebase. Each sub-plan calls them out explicitly in its preamble, but they are also listed here so everything is in one place.
49
+
50
+ ### A. MRI checkpoint file is not on this machine
51
+
52
+ The user said the artifact lives at `outputs\checkpoints\best_model.pt` (Windows-style path). `find /Users/mertgungor` returns no `best_model.pt`. Sub-plan #1 cannot start until the file is at `data/processed/mri_dl_2d/best_model.pt` (gitignored — never commit a model binary). Confirm class index order matches the trainer:
53
+
54
+ ```python
55
+ CLASS_TO_IDX = {
56
+ "MildDemented": 0,
57
+ "ModerateDemented": 1,
58
+ "NonDemented": 2,
59
+ "VeryMildDemented": 3,
60
+ }
61
+ ```
62
+
63
+ If the trainer used a different ordering (`ImageFolder` alphabetises by default), the labels we surface will be wrong. Sub-plan #1 ships a sanity test that catches this.
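
For reference, the `CLASS_TO_IDX` map above is exactly the ordering `torchvision.datasets.ImageFolder` produces by default (class directory names sorted alphabetically), which can be reproduced with plain `sorted()`:

```python
# ImageFolder assigns indices by sorting class directory names; the trainer's
# map above matches this alphabetical order, shown here without torchvision.
classes = ["NonDemented", "MildDemented", "VeryMildDemented", "ModerateDemented"]
class_to_idx = {name: idx for idx, name in enumerate(sorted(classes))}

assert class_to_idx == {
    "MildDemented": 0,
    "ModerateDemented": 1,
    "NonDemented": 2,
    "VeryMildDemented": 3,
}
```

So a mismatch would only arise if the trainer explicitly overrode the default ordering.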
64
+
65
+ ### B. The "EEG ipynb" is OASIS tabular, not EEG
66
+
67
+ `/Users/mertgungor/Downloads/rag/detecting-early-alzheimer-s (1).ipynb` trains an sklearn ensemble (LogReg / SVM / DT / RF / AdaBoost) on the OASIS longitudinal MRI **tabular** dataset (`oasis_longitudinal.csv` — MMSE, eTIV, nWBV, ASF, EDUC, SES, …). It contains zero EEG signal processing and saves no model artifact.
68
+
69
+ Sub-plan #3 has **two branches**:
70
+
71
+ - **Branch 3a (default).** Treat the OASIS biomarker model as a clinical-tests extension to the fusion engine (already accepts MMSE etc. as features — this just adds eTIV/nWBV/ASF and re-runs the trained sklearn model in-process).
72
+ - **Branch 3b.** If the user has a real EEG model elsewhere (a checkpoint file that consumes raw FIF / EDF data and emits class probabilities), the user must point us to it and we re-scope sub-plan #3 around that artifact.
73
+
74
+ The user must pick the branch before sub-plan #3 starts.
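
A minimal sketch of what Branch 3a implies, with a stand-in model fitted on synthetic rows (the `oasis_probability` name and the feature subset are illustrative, not the fusion engine's real API; the real artifact would be the notebook's trained ensemble):

```python
# Branch 3a sketch: run the OASIS sklearn model in-process and expose its
# probability as one extra fusion feature. Everything below is a stand-in.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

FEATURES = ["MMSE", "eTIV", "nWBV", "ASF"]  # subset of OASIS longitudinal columns

# Synthetic training data standing in for oasis_longitudinal.csv.
rng = np.random.RandomState(0)
X = rng.rand(64, len(FEATURES))
y = (X[:, 0] < 0.5).astype(int)  # pretend low first-column score => demented
clf = RandomForestClassifier(n_estimators=16, random_state=0).fit(X, y)

def oasis_probability(biomarkers: dict[str, float]) -> float:
    """Return P(demented) from tabular biomarkers, ready to feed the fusion engine."""
    row = np.array([[biomarkers[f] for f in FEATURES]])
    return float(clf.predict_proba(row)[0, 1])

p = oasis_probability({"MMSE": 0.2, "eTIV": 0.5, "nWBV": 0.7, "ASF": 0.9})
```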
75
+
76
+ ### C. RAG corpus location
77
+
78
+ The new RAG lives at `/Users/mertgungor/Downloads/rag/`. It must be copied into the repo at `data/external_rag/` (or a symlink — but symlinks break in Docker). The pre-built `index/rag_index.pkl` is 12.9 MB — gitignore the binary, commit only the source PDFs (or, for hackathon speed, gitignore both and document the manual copy step). Sub-plan #2 commits the wrapper code and a small fixture copy of one PDF for tests; the full corpus stays out of git.
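
The copy-in step might look like this (destination path per this section; the source location and the exact ignore pattern are assumptions to adjust):

```shell
# Hypothetical corpus copy-in. SRC is where the user's corpus lives; adjust.
SRC="$HOME/Downloads/rag"
DST="data/external_rag"

mkdir -p "$DST"
cp -R "$SRC/." "$DST/" 2>/dev/null || true   # no-op if SRC is absent

# Keep the 12.9 MB pickle out of git regardless of which PDF policy is chosen.
cat >> .gitignore <<'EOF'
data/external_rag/index/rag_index.pkl
EOF
```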
79
+
80
+ ---
81
+
82
+ ## Decoupling guarantees (carry forward from clinical-platform-roadmap.md)
83
+
84
+ Independence rules from the existing roadmap apply unchanged. Specifically:
85
+
86
+ - **MRI DL 2D model is a swap-in for the existing 3D ONNX path.** The `src/models/mri_model.py` API surface stays stable; the new module sits alongside as `src/models/mri_dl_2d.py` and is selected via env var (`MRI_MODEL_KIND=resnet18_2d` vs. `volumetric_onnx`). Pipelines that don't load it must continue to work.
87
+ - **TF-IDF RAG and FAISS RAG run side-by-side.** The agent tool `retrieve_context` is widened to accept a `corpus` parameter (`"clinical"` for new TF-IDF, `"reference"` for existing FAISS). Existing tests stay green.
88
+ - **BBB stays decoupled.** Same rule from the fusion plan: no sub-plan here introduces a BBB↔MRI hard dependency.
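
The widened `retrieve_context` from the second bullet could be sketched as follows (the `_tfidf_search` / `_faiss_search` helpers are stand-ins for the real backends in `src/rag/`, not existing functions):

```python
# Sketch of the corpus-dispatching agent tool. Backends are stand-ins.
from typing import Callable

def _tfidf_search(query: str) -> list[str]:   # stand-in for src/rag/clinical
    return [f"[clinical] {query}"]

def _faiss_search(query: str) -> list[str]:   # stand-in for the existing FAISS RAG
    return [f"[reference] {query}"]

_BACKENDS: dict[str, Callable[[str], list[str]]] = {
    "clinical": _tfidf_search,    # new TF-IDF corpus (14 PDFs, TR+EN expansion)
    "reference": _faiss_search,   # existing FAISS+fastembed corpus
}

def retrieve_context(query: str, corpus: str = "reference") -> list[str]:
    """Agent tool: route a query to one named corpus; default keeps old behaviour."""
    if corpus not in _BACKENDS:
        raise ValueError(f"unknown corpus {corpus!r}; expected {sorted(_BACKENDS)}")
    return _BACKENDS[corpus](query)
```

Defaulting `corpus="reference"` is what keeps the existing tests green: callers that never pass the new parameter hit the FAISS path unchanged.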
89
+
90
+ ---
91
+
92
+ ## "When am I done?" gates (apply to every sub-plan)
93
+
94
+ 1. All TDD tasks committed.
95
+ 2. Full test suite passes (current baseline: 295 passed, 1 skipped).
96
+ 3. Feature reachable end-to-end: Streamlit UI **OR** curl `/predict/mri` / `/agent/run` / `/fusion/predict` works.
97
+ 4. README has a one-paragraph note describing the new asset and how to swap it out.
98
+ 5. Final code-reviewer subagent verdict: "Ready to merge".
docs/superpowers/plans/2026-05-02-mri-dl-2d-integration.md ADDED
@@ -0,0 +1,625 @@
1
+ # MRI DL 2D Integration Plan
2
+
3
+ > **For agentic workers:** Execute via `superpowers:subagent-driven-development` (recommended) or `superpowers:executing-plans`. TDD throughout — failing test → minimal impl → passing test → commit.
4
+
5
+ **Goal.** Wire the user's pretrained PyTorch resnet18 (2D image, 4-class Alzheimer's: `MildDemented` / `ModerateDemented` / `NonDemented` / `VeryMildDemented`) into the production decision layer alongside the existing volumetric ONNX path. The model produces a probability vector that flows naturally into the fusion engine.
6
+
7
+ **Architecture.** Add `src/models/mri_dl_2d.py` parallel to the existing `src/models/mri_model.py`. A small selector picks between paths based on `MRI_MODEL_KIND` env var (`resnet18_2d` or `volumetric_onnx`). The 2D model loads a `state_dict` `.pt` checkpoint, applies the resnet18 preprocessing contract (resize 160, ImageNet normalisation), and emits `MRIPredictResponse` in the same shape the existing surface produces — so the API and frontend need no behavioural change.
8
+
9
+ **Tech stack.** Python 3.11, PyTorch (CPU), torchvision, Pillow. PyTorch is not currently in `requirements.txt` — Task 0 adds it. No new web dependencies.
10
+
11
+ **Trainer's hyper-parameters (for reference, not all relevant at inference):**
12
+
13
+ ```python
14
+ BEST_PARAMS = {
15
+ "image_size": 160,
16
+ "model_name": "resnet18",
17
+ "optimizer": "adamw", # only relevant for training
18
+ "lr": 0.000375191537539265, # only relevant for training
19
+ "weight_decay": 0.000196410142442417,
20
+ "dropout": 0.31154239434523634, # inert at inference (eval() disables dropout); only matters if it changed the head's state_dict keys
21
+ "batch_size": 128, # not used at inference (we infer one image at a time)
22
+ "epochs": 10, # training-only
23
+ }
24
+
25
+ CLASS_TO_IDX = {
26
+ "MildDemented": 0,
27
+ "ModerateDemented": 1,
28
+ "NonDemented": 2,
29
+ "VeryMildDemented": 3,
30
+ }
31
+ ```
32
+
33
+ ---
34
+
35
+ ## Prerequisite (controller blocker)
36
+
37
+ The artifact `best_model.pt` is **not** present on this filesystem. Before any task starts:
38
+
39
+ 1. Copy the file from the trainer machine to `data/processed/mri_dl_2d/best_model.pt`.
40
+ 2. Confirm with `python -c "import torch; sd = torch.load('data/processed/mri_dl_2d/best_model.pt', map_location='cpu'); print(type(sd), list(sd.keys())[:5] if isinstance(sd, dict) else sd)"`. Two possible structures:
41
+ - **`state_dict` only** (most common): `dict[str, Tensor]`. Task 1 builds the resnet18 architecture and `load_state_dict`s.
42
+ - **Full model** (`torch.save(model, ...)`): a pickled `nn.Module`. Task 1 just calls `torch.load(...)`.
43
+ - The plan defaults to **state_dict** (more portable). If the file turns out to be a full model, Task 1 has a fallback branch.
44
+ 3. Add the artifact path to `.gitignore` if it isn't already covered (`data/processed/` should already be ignored — verify).
45
+
46
+ If step 2 fails, **stop and surface to the user** — the trainer either produced a different artifact or saved with an unexpected structure.
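
An inspection helper for step 2 could classify the artifact before committing to a load path (the `checkpoint_kind` name is illustrative; it mirrors the two structures described above, demonstrated on a throwaway state_dict):

```python
# Hypothetical helper: classify a .pt artifact as state_dict vs full module.
import os
import tempfile

import torch
import torch.nn as nn

def checkpoint_kind(path: str) -> str:
    """Return 'full_model', 'state_dict', or 'unknown' for a .pt artifact."""
    obj = torch.load(path, map_location="cpu", weights_only=False)
    if isinstance(obj, nn.Module):
        return "full_model"
    if isinstance(obj, dict) and obj and all(
        isinstance(v, torch.Tensor) for v in obj.values()
    ):
        return "state_dict"
    return "unknown"  # e.g. a trainer dict like {"model": ..., "optimizer": ...}

# Demo on a throwaway state_dict so the helper is exercised end-to-end.
tmp = os.path.join(tempfile.mkdtemp(), "demo.pt")
torch.save(nn.Linear(4, 2).state_dict(), tmp)
kind = checkpoint_kind(tmp)
```

If this reports `unknown`, that is the "stop and surface to the user" case.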
47
+
48
+ ---
49
+
50
+ ## File structure
51
+
52
+ | Path | Responsibility |
53
+ |---|---|
54
+ | Modify `requirements.txt` | add `torch`, `torchvision`, `pillow` (CPU wheels are fine) |
55
+ | Create `src/models/mri_dl_2d.py` | resnet18 4-class loader + preprocessing + `predict_image()` |
56
+ | Create `src/models/mri_selector.py` | tiny dispatcher: `load_default()` / `predict_default()` based on `MRI_MODEL_KIND` env |
57
+ | Modify `src/api/routes.py` | `predict_mri` chooses between volumetric and 2D paths via the selector |
58
+ | Modify `src/api/schemas.py` | `MRIPredictRequest.input_path` now accepts `.png/.jpg/.nii*` (already string) — no schema change beyond a docstring tweak |
59
+ | Create `tests/fixtures/build_dummy_resnet18_2d.py` | helper that constructs a randomly-initialised 4-class resnet18 and saves a state_dict to a tmp path so tests don't need the real artifact |
60
+ | Create `tests/models/test_mri_dl_2d.py` | unit tests for the new module |
61
+ | Create `tests/api/test_mri_2d_route.py` | integration test through `POST /predict/mri` |
62
+ | Modify `README.md` | document the env var and the artifact location |
63
+
64
+ ---
65
+
66
+ ## Tasks
67
+
68
+ ### Task 0: Dependencies
69
+
70
+ **Files:**
71
+ - Modify: `requirements.txt`
72
+
73
+ - [ ] **Step 1:** open `requirements.txt`, append (CPU wheels — torch is large, target ~200 MB):
74
+
75
+ ```
76
+ torch>=2.2,<3.0
77
+ torchvision>=0.17,<1.0
78
+ pillow>=10.0,<12.0
79
+ ```
80
+
81
+ - [ ] **Step 2:** install: `pip install torch torchvision pillow`. Verify import: `python -c "import torch, torchvision; print(torch.__version__, torchvision.__version__)"`. Expect a version line, no error.
82
+
83
+ - [ ] **Step 3:** run `pytest -q` — expect the existing 295+1 baseline. No regressions before any code changes.
84
+
85
+ - [ ] **Step 4:** commit: `git commit -m "deps: add torch/torchvision/pillow for MRI DL 2D"`.
86
+
87
+ ---
88
+
89
+ ### Task 1: 2D model loader + preprocessing
90
+
91
+ **Files:**
92
+ - Create: `src/models/mri_dl_2d.py`
93
+ - Create: `tests/fixtures/build_dummy_resnet18_2d.py`
94
+ - Create: `tests/models/test_mri_dl_2d.py`
95
+
96
+ - [ ] **Step 1: Write the dummy-checkpoint fixture (so tests don't need the real .pt).**
97
+
98
+ `tests/fixtures/build_dummy_resnet18_2d.py`:
99
+
100
+ ```python
101
+ """Build a randomly-initialised 4-class resnet18 state_dict for tests."""
102
+ from __future__ import annotations
103
+
104
+ from pathlib import Path
105
+
106
+ import torch
107
+ from torchvision import models
108
+
109
+
110
+ def build(path: Path) -> Path:
111
+ """Save a state_dict at `path` and return the path. Idempotent."""
112
+ path = Path(path)
113
+ if path.exists():
114
+ return path
115
+ path.parent.mkdir(parents=True, exist_ok=True)
116
+ model = models.resnet18(weights=None)
117
+ model.fc = torch.nn.Linear(model.fc.in_features, 4)
118
+ torch.save(model.state_dict(), str(path))
119
+ return path
120
+ ```
121
+
122
+ - [ ] **Step 2: Write the failing test.**
123
+
124
+ `tests/models/test_mri_dl_2d.py`:
125
+
126
+ ```python
127
+ """Tests for src.models.mri_dl_2d — pretrained 4-class Alzheimer's resnet18."""
128
+ from __future__ import annotations
129
+
130
+ from pathlib import Path
131
+
132
+ import numpy as np
133
+ import pytest
134
+ import torch
135
+ from PIL import Image
136
+
137
+ from src.models import mri_dl_2d
138
+ from tests.fixtures.build_dummy_resnet18_2d import build as build_dummy_2d
139
+
140
+
141
+ def _png(path: Path, size: tuple[int, int] = (200, 200)) -> Path:
142
+ arr = (np.random.RandomState(0).rand(size[1], size[0], 3) * 255).astype(np.uint8)
143
+ Image.fromarray(arr, mode="RGB").save(str(path))
144
+ return path
145
+
146
+
147
+ class TestMRIDL2D:
148
+ def test_class_to_idx_matches_trainer(self) -> None:
149
+ assert mri_dl_2d.CLASS_TO_IDX == {
150
+ "MildDemented": 0,
151
+ "ModerateDemented": 1,
152
+ "NonDemented": 2,
153
+ "VeryMildDemented": 3,
154
+ }
155
+
156
+ def test_idx_to_class_is_consistent(self) -> None:
157
+ for name, idx in mri_dl_2d.CLASS_TO_IDX.items():
158
+ assert mri_dl_2d.IDX_TO_CLASS[idx] == name
159
+
160
+ def test_load_missing_artifact_raises(self, tmp_path: Path) -> None:
161
+ with pytest.raises(FileNotFoundError, match="MRI 2D checkpoint not found"):
162
+ mri_dl_2d.load(tmp_path / "nope.pt")
163
+
164
+ def test_predict_image_returns_full_probs(self, tmp_path: Path) -> None:
165
+ ckpt = build_dummy_2d(tmp_path / "best.pt")
166
+ model = mri_dl_2d.load(ckpt)
167
+ img = _png(tmp_path / "scan.png")
168
+
169
+ result = mri_dl_2d.predict_image(model, img)
170
+
171
+ assert set(result) == {"label", "label_text", "confidence", "probabilities"}
172
+ assert result["label"] in {0, 1, 2, 3}
173
+ assert result["label_text"] in mri_dl_2d.CLASS_TO_IDX
174
+ assert 0.0 <= result["confidence"] <= 1.0
175
+ probs = result["probabilities"]
176
+ assert len(probs) == 4
177
+ assert abs(sum(p["probability"] for p in probs) - 1.0) < 1e-5
178
+ # Each probability item exposes the trainer's class label, not "class_N".
179
+ assert {p["label_text"] for p in probs} == set(mri_dl_2d.CLASS_TO_IDX)
180
+
181
+ def test_predict_works_for_grayscale_input(self, tmp_path: Path) -> None:
182
+ ckpt = build_dummy_2d(tmp_path / "best.pt")
183
+ model = mri_dl_2d.load(ckpt)
184
+ # Single-channel grayscale, common for MRI slice exports.
185
+ gray = (np.random.RandomState(1).rand(180, 180) * 255).astype(np.uint8)
186
+ path = tmp_path / "gray.png"
187
+ Image.fromarray(gray, mode="L").save(str(path))
188
+
189
+ result = mri_dl_2d.predict_image(model, path)
190
+ assert 0.0 <= result["confidence"] <= 1.0
191
+ ```
192
+
193
+ Run: `pytest tests/models/test_mri_dl_2d.py -v` → expect ImportError on `src.models.mri_dl_2d`.
194
+
195
+ - [ ] **Step 3: Minimal implementation.**
196
+
197
+ `src/models/mri_dl_2d.py`:
198
+
199
+ ```python
200
+ """Pretrained 2D MRI Alzheimer's classifier (resnet18, 4 classes).
201
+
202
+ Decision-layer bridge for an externally-trained PyTorch checkpoint. Loads
203
+ either a state_dict (default) or a full pickled model, applies the trainer's
204
+ preprocessing (resize image_size=160, ImageNet normalisation), and emits the
205
+ same dict shape as src.models.mri_model.predict_with_proba so downstream
206
+ code paths don't care which backend produced the prediction.
207
+ """
208
+ from __future__ import annotations
209
+
210
+ from pathlib import Path
211
+ from typing import Any
212
+
213
+ import numpy as np
214
+ import torch
215
+ import torch.nn as nn
216
+ from PIL import Image
217
+ from torchvision import models, transforms
218
+
219
+ from src.core.logger import get_logger
220
+
221
+ logger = get_logger(__name__)
222
+
223
+ CLASS_TO_IDX: dict[str, int] = {
224
+ "MildDemented": 0,
225
+ "ModerateDemented": 1,
226
+ "NonDemented": 2,
227
+ "VeryMildDemented": 3,
228
+ }
229
+ IDX_TO_CLASS: dict[int, str] = {v: k for k, v in CLASS_TO_IDX.items()}
230
+
231
+ DEFAULT_IMAGE_SIZE = 160
232
+ _IMAGENET_MEAN = (0.485, 0.456, 0.406)
233
+ _IMAGENET_STD = (0.229, 0.224, 0.225)
234
+
235
+ # torchvision transform reused for every prediction. Constructed once at
236
+ # import time — no per-call allocation.
237
+ _TRANSFORM = transforms.Compose([
238
+ transforms.Resize((DEFAULT_IMAGE_SIZE, DEFAULT_IMAGE_SIZE)),
239
+ transforms.ToTensor(),
240
+ transforms.Normalize(mean=_IMAGENET_MEAN, std=_IMAGENET_STD),
241
+ ])
242
+
243
+
244
+ def _build_resnet18_4class() -> nn.Module:
245
+ model = models.resnet18(weights=None)
246
+ model.fc = nn.Linear(model.fc.in_features, len(CLASS_TO_IDX))
247
+ return model
248
+
249
+
250
+ def load(path: Path) -> nn.Module:
251
+ """Load checkpoint. Supports state_dict (preferred) or full pickled model."""
252
+ path = Path(path)
253
+ if not path.exists():
254
+ raise FileNotFoundError(f"MRI 2D checkpoint not found: {path}")
255
+ obj = torch.load(str(path), map_location="cpu", weights_only=False)
256
+ if isinstance(obj, nn.Module):
257
+ model = obj
258
+ else:
259
+ model = _build_resnet18_4class()
260
+ # Strip 'module.' prefix if the trainer used DataParallel / DDP.
261
+ clean = {k.removeprefix("module."): v for k, v in obj.items()}
262
+ model.load_state_dict(clean, strict=True)
263
+ model.eval()
264
+ return model
265
+
266
+
267
+ def predict_image(model: nn.Module, image_path: Path) -> dict[str, Any]:
268
+ """Run inference on one image. Output shape mirrors mri_model.predict_with_proba."""
269
+ image_path = Path(image_path)
270
+ if not image_path.exists():
271
+ raise FileNotFoundError(f"MRI image not found: {image_path}")
272
+ img = Image.open(str(image_path)).convert("RGB")
273
+ tensor = _TRANSFORM(img).unsqueeze(0) # (1, 3, 160, 160)
274
+
275
+ with torch.inference_mode():
276
+ logits = model(tensor)
277
+ probs = torch.softmax(logits, dim=1).squeeze(0).cpu().numpy()
278
+
279
+ label_idx = int(np.argmax(probs))
280
+ return {
281
+ "label": label_idx,
282
+ "label_text": IDX_TO_CLASS[label_idx],
283
+ "confidence": float(probs[label_idx]),
284
+ "probabilities": [
285
+ {"label": i, "label_text": IDX_TO_CLASS[i], "probability": float(p)}
286
+ for i, p in enumerate(probs)
287
+ ],
288
+ }
289
+ ```
290
+
291
+ Run: `pytest tests/models/test_mri_dl_2d.py -v` → expect 5 passed.
292
+
293
+ - [ ] **Step 4:** `pytest -q` → expect 295+1 baseline + 5 new = ~300 passed.
294
+
295
+ - [ ] **Step 5:** commit:
296
+
297
+ ```bash
298
+ git add src/models/mri_dl_2d.py tests/fixtures/build_dummy_resnet18_2d.py tests/models/test_mri_dl_2d.py
299
+ git commit -m "feat(models): add 2D resnet18 4-class Alzheimer's MRI inference module"
300
+ ```
301
+
302
+ ---
303
+
304
+ ### Task 2: Selector for 3D vs 2D
305
+
306
+ **Files:**
307
+ - Create: `src/models/mri_selector.py`
308
+ - Create: `tests/models/test_mri_selector.py`
309
+
310
+ - [ ] **Step 1: Failing test.**
311
+
312
+ `tests/models/test_mri_selector.py`:
313
+
314
+ ```python
315
+ """Tests for src.models.mri_selector — env-var-driven 2D / 3D dispatch."""
316
+ from __future__ import annotations
317
+
318
+ from pathlib import Path
319
+
320
+ import pytest
321
+
322
+ from src.models import mri_selector
323
+ from tests.fixtures.build_dummy_mri_onnx import build as build_dummy_3d
324
+ from tests.fixtures.build_dummy_resnet18_2d import build as build_dummy_2d
325
+
326
+
327
+ _FIXTURE_MRI = Path(__file__).resolve().parents[1] / "fixtures" / "mri_sample" / "subject_0.nii.gz"
328
+
329
+
330
+ class TestSelector:
331
+ def test_default_kind_is_volumetric(self, monkeypatch) -> None:
332
+ monkeypatch.delenv("MRI_MODEL_KIND", raising=False)
333
+ assert mri_selector.current_kind() == "volumetric_onnx"
334
+
335
+ def test_explicit_2d_selection(self, monkeypatch) -> None:
336
+ monkeypatch.setenv("MRI_MODEL_KIND", "resnet18_2d")
337
+ assert mri_selector.current_kind() == "resnet18_2d"
338
+
339
+ def test_unknown_kind_raises(self, monkeypatch) -> None:
340
+ monkeypatch.setenv("MRI_MODEL_KIND", "neural_net_supreme")
341
+ with pytest.raises(ValueError, match="unknown MRI_MODEL_KIND"):
342
+ mri_selector.current_kind()
343
+
344
+ def test_predict_routes_to_volumetric(self, monkeypatch, tmp_path) -> None:
345
+ monkeypatch.setenv("MRI_MODEL_KIND", "volumetric_onnx")
346
+ artifact = build_dummy_3d(tmp_path / "vol.onnx")
347
+ result = mri_selector.predict(
348
+ input_path=_FIXTURE_MRI,
349
+ checkpoint_path=artifact,
350
+ target_shape=(8, 8, 8),
351
+ label_names=("control", "abnormal"),
352
+ )
353
+ assert result["label_text"] in {"control", "abnormal"}
354
+
355
+ def test_predict_routes_to_2d(self, monkeypatch, tmp_path) -> None:
356
+ monkeypatch.setenv("MRI_MODEL_KIND", "resnet18_2d")
357
+ artifact = build_dummy_2d(tmp_path / "best.pt")
358
+ # Build a tiny PNG.
359
+ from PIL import Image
360
+ import numpy as np
361
+ img_path = tmp_path / "scan.png"
362
+ Image.fromarray((np.random.RandomState(0).rand(160, 160, 3) * 255).astype("uint8")).save(str(img_path))
363
+ result = mri_selector.predict(
364
+ input_path=img_path,
365
+ checkpoint_path=artifact,
366
+ )
367
+ assert result["label_text"] in mri_selector.label_names_for_kind("resnet18_2d")
368
+ ```
369
+
370
+ Run: `pytest tests/models/test_mri_selector.py -v` → ImportError.
371
+
372
+ - [ ] **Step 2: Minimal impl.**
373
+
374
+ `src/models/mri_selector.py`:
375
+
376
+ ```python
377
+ """Env-var-driven dispatch between volumetric ONNX and 2D resnet18 MRI models."""
378
+ from __future__ import annotations
379
+
380
+ import os
381
+ from pathlib import Path
382
+ from typing import Any
383
+
384
+ from src.core.logger import get_logger
385
+ from src.models import mri_dl_2d, mri_model
386
+
387
+ logger = get_logger(__name__)
388
+
389
+ VALID_KINDS = ("volumetric_onnx", "resnet18_2d")
390
+ _DEFAULT_KIND = "volumetric_onnx"
391
+
392
+
393
+ def current_kind() -> str:
394
+ kind = os.environ.get("MRI_MODEL_KIND", _DEFAULT_KIND)
395
+ if kind not in VALID_KINDS:
396
+ raise ValueError(f"unknown MRI_MODEL_KIND={kind!r}; expected one of {VALID_KINDS}")
397
+ return kind
398
+
399
+
400
+ def label_names_for_kind(kind: str) -> tuple[str, ...]:
401
+ if kind == "resnet18_2d":
402
+ return tuple(mri_dl_2d.IDX_TO_CLASS[i] for i in range(len(mri_dl_2d.CLASS_TO_IDX)))
403
+ return mri_model.DEFAULT_LABEL_NAMES
404
+
405
+
406
+ def predict(
407
+ input_path: Path,
408
+ checkpoint_path: Path,
409
+ target_shape: tuple[int, int, int] | None = None,
410
+ label_names: tuple[str, ...] | None = None,
411
+ ) -> dict[str, Any]:
412
+ """Run the active MRI model on one input. Returns the unified prediction dict."""
413
+ kind = current_kind()
414
+ logger.info("dispatching MRI prediction kind=%s input=%s", kind, input_path)
415
+ if kind == "resnet18_2d":
416
+ model = mri_dl_2d.load(checkpoint_path)
417
+ return mri_dl_2d.predict_image(model, input_path)
418
+ model = mri_model.load(checkpoint_path)
419
+ return mri_model.predict_nifti(
420
+ model,
421
+ input_path,
422
+ target_shape=target_shape or mri_model.DEFAULT_TARGET_SHAPE,
423
+ label_names=label_names,
424
+ )
425
+ ```
426
+
427
+ Run tests → 5 passed.
428
+
429
+ - [ ] **Step 3:** `pytest -q` → ~305 passed.
430
+
431
+ - [ ] **Step 4:** commit: `feat(models): selector dispatch for volumetric vs 2D MRI models`.
432
+
433
+ ---
434
+
435
+ ### Task 3: Wire into `POST /predict/mri`
436
+
437
+ **Files:**
438
+ - Modify: `src/api/routes.py`
439
+ - Modify: `src/api/schemas.py` (docstring only — `input_path` now optionally accepts a 2D image)
440
+ - Create: `tests/api/test_mri_2d_route.py`
441
+
442
+ - [ ] **Step 1: Failing test.**
443
+
444
+ `tests/api/test_mri_2d_route.py`:
445
+
446
+ ```python
447
+ """Integration: POST /predict/mri with MRI_MODEL_KIND=resnet18_2d."""
448
+ from __future__ import annotations
449
+
450
+ import os
451
+ from pathlib import Path
452
+
453
+ import numpy as np
454
+ import pytest
455
+ from fastapi.testclient import TestClient
456
+ from PIL import Image
457
+
458
+ from src.api.main import app
459
+ from tests.fixtures.build_dummy_resnet18_2d import build as build_dummy_2d
460
+
461
+
462
+ @pytest.fixture()
463
+ def client_2d(monkeypatch, tmp_path):
464
+ monkeypatch.setenv("MRI_MODEL_KIND", "resnet18_2d")
465
+ ckpt = build_dummy_2d(tmp_path / "best.pt")
466
+ monkeypatch.setenv("MRI_MODEL_PATH_2D", str(ckpt))
467
+ return TestClient(app)
468
+
469
+
470
+ def test_predict_mri_2d_happy_path(client_2d, tmp_path):
471
+ # Tiny RGB PNG.
472
+ img_path = tmp_path / "scan.png"
473
+ Image.fromarray((np.random.RandomState(0).rand(170, 170, 3) * 255).astype("uint8")).save(str(img_path))
474
+
475
+ r = client_2d.post("/predict/mri", json={"input_path": str(img_path)})
476
+ assert r.status_code == 200, r.text
477
+ data = r.json()
478
+ assert data["label_text"] in {
479
+ "MildDemented", "ModerateDemented", "NonDemented", "VeryMildDemented",
480
+ }
481
+ assert 0.0 <= data["confidence"] <= 1.0
482
+ assert len(data["probabilities"]) == 4
483
+ ```
484
+
485
+ Run → expect 500 (route hardcoded to volumetric path) or schema error.
486
+
487
+ - [ ] **Step 2: Modify the route handler.**
488
+
489
+ In `src/api/routes.py`, find `predict_mri` (around line 318). Change the body to dispatch via the selector:
490
+
491
+ ```python
492
+ @predict_router.post("/mri", response_model=MRIPredictResponse)
493
+ def predict_mri(req: MRIPredictRequest) -> MRIPredictResponse:
494
+ from src.models import mri_selector
495
+
496
+ kind = mri_selector.current_kind()
497
+ if kind == "resnet18_2d":
498
+ ckpt = Path(os.environ.get("MRI_MODEL_PATH_2D", "data/processed/mri_dl_2d/best_model.pt"))
499
+ result = mri_selector.predict(input_path=Path(req.input_path), checkpoint_path=ckpt)
500
+ model_path = str(ckpt)
501
+ else:
502
+ ckpt = _mri_model_path()
503
+ result = mri_selector.predict(
504
+ input_path=Path(req.input_path),
505
+ checkpoint_path=ckpt,
506
+ target_shape=tuple(req.target_shape),
507
+ label_names=tuple(req.label_names) if req.label_names else None,
508
+ )
509
+ model_path = str(ckpt)
510
+
511
+ return MRIPredictResponse(
512
+ **result,
513
+ input_path=str(req.input_path),
514
+ model_path=model_path,
515
+ )
516
+ ```
517
+
518
+ You'll need to add `import os` and `from pathlib import Path` if not already present at the top of the file (Path likely already is). Keep the existing `_mri_model_path()` helper.
519
+
520
+ - [ ] **Step 3:** Update `src/api/schemas.py`. The class `MRIPredictRequest.input_path` description currently says "Path to one .nii or .nii.gz MRI volume". Change to:
521
+
522
+ ```python
523
+ input_path: str = Field(..., description="Path to MRI input. With MRI_MODEL_KIND=volumetric_onnx (default), expects a .nii/.nii.gz volume. With MRI_MODEL_KIND=resnet18_2d, expects a 2D image (.png/.jpg).")
524
+ ```
525
+
526
+ - [ ] **Step 4:** `pytest tests/api/test_mri_2d_route.py -v` → expect 1 passed.
527
+
528
+ - [ ] **Step 5:** `pytest -q` → expect no regressions vs the prior baseline + 1 new.
529
+
530
+ - [ ] **Step 6:** commit: `feat(api): dispatch /predict/mri via MRI_MODEL_KIND env var`.
531
+
532
+ ---
533
+
534
+ ### Task 4: Sanity check on real artifact (one-shot, runs only when artifact is present)
535
+
536
+ **Files:**
537
+ - Create: `tests/models/test_mri_dl_2d_real.py`
538
+
539
+ This test is opt-in via artifact presence: it runs only if `data/processed/mri_dl_2d/best_model.pt` exists. It catches load-structure surprises (unexpected state_dict keys, wrong head shape); because the input is random noise, verifying the class index order additionally needs a sample with a known label.

- [ ] **Step 1: Test.**

```python
"""Real-artifact sanity test. Skipped unless the checkpoint is present."""
from __future__ import annotations

from pathlib import Path

import numpy as np
import pytest
from PIL import Image

from src.models import mri_dl_2d


REAL_CKPT = Path("data/processed/mri_dl_2d/best_model.pt")


@pytest.mark.skipif(not REAL_CKPT.exists(), reason="real MRI checkpoint not present")
def test_real_checkpoint_loads_and_predicts(tmp_path):
    model = mri_dl_2d.load(REAL_CKPT)
    arr = (np.random.RandomState(0).rand(170, 170, 3) * 255).astype(np.uint8)
    img = tmp_path / "scan.png"
    Image.fromarray(arr).save(str(img))
    result = mri_dl_2d.predict_image(model, img)

    assert result["label_text"] in mri_dl_2d.CLASS_TO_IDX
    # Probabilities sum to 1.
    s = sum(p["probability"] for p in result["probabilities"])
    assert abs(s - 1.0) < 1e-5
```

- [ ] **Step 2:** `pytest tests/models/test_mri_dl_2d_real.py -v` → if no real checkpoint, **skipped** (expected). When you do drop the artifact in, `pytest -q` will run it.

- [ ] **Step 3:** commit: `test(models): real-artifact sanity for MRI DL 2D (skips when absent)`.

---

### Task 5: Streamlit + README

**Files:**
- Modify: `src/frontend/app.py` (the MRI Predict tab — add `MRI_MODEL_KIND` indicator + accept image upload when 2D is active)
- Modify: `README.md`

- [ ] **Step 1:** In `src/frontend/app.py`, find the MRI predict section (likely near line 1330 — search for `mri_predict_d`). Add a small caption above the existing UI:

```python
mri_kind = os.environ.get("MRI_MODEL_KIND", "volumetric_onnx")
st.caption(f"Active MRI model: `{mri_kind}` (set `MRI_MODEL_KIND` env to switch)")
```

If `mri_kind == "resnet18_2d"`, swap the file picker hint from `.nii/.nii.gz` to `.png/.jpg`. The existing `target_shape` widgets become irrelevant in 2D mode — wrap them in `if mri_kind == "volumetric_onnx":`.
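One way to keep that branching testable is to pull it into a tiny pure helper the tab calls (a sketch under assumptions: `mri_upload_config` is a hypothetical name, not an existing function in `app.py`):

```python
def mri_upload_config(mri_kind: str) -> dict:
    """Map the active MRI backend to upload hints and widget visibility."""
    if mri_kind == "resnet18_2d":
        # 2D backend: plain images in, volumetric resampling widgets hidden.
        return {"extensions": ["png", "jpg"], "show_volumetric_widgets": False}
    # Default volumetric ONNX backend: NIfTI in, target_shape widgets shown.
    return {"extensions": ["nii", "nii.gz"], "show_volumetric_widgets": True}


cfg = mri_upload_config("resnet18_2d")
print(cfg["extensions"])  # → ['png', 'jpg']
```

The Streamlit code then only reads the returned dict, so the branching logic stays unit-testable even though the UI itself is not.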

- [ ] **Step 2:** README update. Add a paragraph under the existing MRI section:

```markdown
### MRI Deep-Learning Backends

The MRI prediction route supports two backends, selected via env:

- `MRI_MODEL_KIND=volumetric_onnx` (default). Loads an ONNX volumetric model from `MRI_MODEL_PATH` (default `data/processed/mri_model.onnx`). Input: `.nii` / `.nii.gz`.
- `MRI_MODEL_KIND=resnet18_2d`. Loads a PyTorch state_dict from `MRI_MODEL_PATH_2D` (default `data/processed/mri_dl_2d/best_model.pt`). Input: 2D image (`.png` / `.jpg`). Classes: `MildDemented`, `ModerateDemented`, `NonDemented`, `VeryMildDemented`.

Switch backends without restarting workers — env is read on each request.
```

- [ ] **Step 3:** `pytest -q` → no regressions. (Streamlit code is not unit-tested in this repo — manual smoke at the end is fine.)

- [ ] **Step 4:** commit: `feat(frontend): expose MRI_MODEL_KIND in MRI predict tab; doc backends`.

---

## Self-review checklist

1. **Spec coverage.** Trainer's BEST_PARAMS that matter at inference: `image_size=160`, `model_name=resnet18`, class index map. All locked in via Task 1 constants. The other params (`lr`, `epochs`, `batch_size`) are training-only and intentionally not surfaced.
2. **Independence.** No coupling to BBB. The new module imports only stdlib + torch + torchvision + Pillow + numpy + the existing `src/core/logger`.
3. **Sanity test for class-order drift.** Task 4 runs only when the real checkpoint is dropped in. If the trainer used `ImageFolder`'s alphabetical order (`MildDemented=0, ModerateDemented=1, NonDemented=2, VeryMildDemented=3` — same as ours, by luck), it passes. If they used a different order, the user must update `CLASS_TO_IDX` in `mri_dl_2d.py`.
4. **No placeholders.** Every step contains the full code.
5. **Hackathon-grade.** No XAI / saliency-map / Grad-CAM scope creep. That's a separate sub-plan if the demo demands it.

---

## Execution handoff

Save and choose: subagent-driven (recommended) or inline executing-plans.
docs/superpowers/plans/2026-05-02-oasis-tabular-fusion-integration.md ADDED
@@ -0,0 +1,641 @@
# OASIS Tabular Classifier — Fusion Integration Plan

> **For agentic workers:** REQUIRED SUB-SKILL: `superpowers:subagent-driven-development`. TDD throughout.

## ⚠️ Important context — read before executing

The user said "I have the pretrained model for eeg, integrate it into the eeg pipeline. its the ipynb file named detecting-early-alzheimers...".

The notebook (`/Users/mertgungor/Downloads/rag/detecting-early-alzheimer-s (1).ipynb`) is **NOT an EEG model**. It is an sklearn ensemble (LogReg / SVM / DT / RF / AdaBoost) trained on the OASIS longitudinal **tabular** dataset — features are MMSE, eTIV, nWBV, ASF, EDUC, SES, M/F, Age. Zero EEG signal processing. Zero saved model artifact (the notebook trains in-memory only).

This plan therefore has **two branches**. Pick one with the user before executing.

### Branch 3a — Train + integrate the OASIS *tabular* classifier as a fusion feature

We re-train the best variant (Random Forest, AUC 84.4 % per the notebook) from the OASIS CSV, save a `joblib` artifact, and expose it as a fusion-engine modality named `tabular_oasis`. The fusion engine already handles arbitrary modality keys; this plugs in cleanly.

**Demo value:** When a doctor has only OASIS-style biomarkers (MMSE / eTIV / nWBV / ASF / Age / EDUC / SES / M/F) but no MRI image, the fusion engine still produces an Alzheimer's confidence with attribution.

### Branch 3b — User has a real EEG model elsewhere

If the user can point us to a checkpoint that consumes raw FIF / EDF EEG data (e.g., a `.pt`, `.pth`, `.h5`, `.onnx`, or `.joblib` file) and emits Alzheimer's class probabilities, this plan is rewritten around that artifact: its signature, expected input shape, and label order. We replace `src/models/eeg_model.py` (currently absent — `eeg_pipeline.py` only does signal processing) with a new module similar to `mri_dl_2d.py`.

**The user must pick a branch** before any task starts. The default below is **Branch 3a**, because the notebook is what's actually on disk.

---

## Branch 3a (default): OASIS tabular classifier as fusion modality

**Goal.** Save a Random Forest trained on OASIS biomarkers; wire it into the fusion engine as a new modality `tabular_oasis`. The doctor enters MMSE/eTIV/nWBV/ASF (fusion already takes MMSE; this extends to the other three) and gets an Alzheimer's signal that flows through the existing logit/sigmoid combiner.

**Architecture.** New module `src/models/tabular_oasis.py` trains-or-loads a `joblib`-pickled `Pipeline(scaler -> RandomForestClassifier)`. The fusion engine grows one entry in `_CLINICAL_FNS` (or, more cleanly, a sibling `_TABULAR_FNS`) so the model's class probability for `Demented=1` becomes a signed signal. New API route `POST /predict/tabular_oasis` lets the frontend call it directly. All optional — if the OASIS CSV is absent, the module degrades gracefully and fusion ignores the modality.

**Tech stack.** scikit-learn (already in deps), pandas, joblib (likely in deps via sklearn).

---

## Prerequisite (controller blocker)

The OASIS dataset is not in this repo. Two acquisition options:

1. **Download from Kaggle** (https://www.kaggle.com/datasets/jboysen/mri-and-alzheimers, file `oasis_longitudinal.csv`). Save to `data/external/oasis_longitudinal.csv`. Git-ignore it (broaden the existing `data/external_rag/` rule, or add a `data/external/` entry).

2. **Use a local copy** if the user already downloaded it for the notebook. Same destination.

If the dataset is unavailable, **stop and surface to the user**. The classifier cannot be trained without it; we will not fabricate synthetic OASIS-shaped data for a clinical demo.

---

## File structure

| Path | Responsibility |
|---|---|
| Modify `requirements.txt` | confirm `joblib` (sklearn pulls it in transitively, but pinning it explicitly is safer) |
| Modify `.gitignore` | ensure `data/external/` is ignored |
| Create `src/models/tabular_oasis.py` | train + persist + load + predict the OASIS RF classifier |
| Create `scripts/train_oasis.py` | one-shot CLI: trains and saves the model artifact |
| Modify `src/fusion/types.py` | extend `ClinicalScores` with `etiv`, `nwbv`, `asf`, `educ`, `ses`, `is_male` |
| Modify `src/fusion/weights.py` | add `tabular_oasis` weight key for `alzheimers` |
| Modify `src/fusion/engine.py` | add `tabular_oasis` to the modality dispatch |
| Modify `src/api/routes.py` | new route `POST /predict/tabular_oasis` |
| Modify `src/api/schemas.py` | request/response for the new route |
| Create `tests/models/test_tabular_oasis.py` | training + persistence + prediction tests |
| Create `tests/fixtures/build_synthetic_oasis.py` | synthetic OASIS-shaped CSV for tests (clearly labelled non-clinical) |
| Create `tests/fusion/test_tabular_oasis_modality.py` | fusion-side integration |
| Create `tests/api/test_tabular_oasis_route.py` | API integration |
| Modify `README.md` | document the modality + how to acquire the OASIS CSV |

---

## Tasks

### Task 0: Deps + ignore

**Files:** `requirements.txt`, `.gitignore`

- [ ] **Step 1:** verify `joblib` and `pandas` are in `requirements.txt`. `pandas` already is (used by every pipeline). Add `joblib>=1.3,<2.0` if not pinned.

- [ ] **Step 2:** `.gitignore` should cover `data/external/`. Add it if needed.

- [ ] **Step 3:** `pytest -q` baseline. Commit: `chore(oasis): pin joblib; gitignore external dataset dir`.

---

### Task 1: Training + persistence module

**Files:**
- Create: `src/models/tabular_oasis.py`
- Create: `scripts/train_oasis.py`
- Create: `tests/fixtures/build_synthetic_oasis.py`
- Create: `tests/models/test_tabular_oasis.py`

- [ ] **Step 1: Synthetic-fixture helper** (clearly synthetic — never confused with real clinical data):

`tests/fixtures/build_synthetic_oasis.py`:

```python
"""Build a synthetic OASIS-shaped CSV for tests. NON-CLINICAL data."""
from __future__ import annotations

from pathlib import Path

import numpy as np
import pandas as pd


def build(path: Path, n: int = 200, seed: int = 42) -> Path:
    """Save a synthetic CSV at `path` with the columns the trainer expects."""
    path = Path(path)
    if path.exists():
        return path
    rng = np.random.default_rng(seed)
    n_dem = n // 2

    # Demented half — lower MMSE, higher CDR, smaller nWBV.
    dem = pd.DataFrame({
        "Group": ["Demented"] * n_dem,
        "M/F": rng.choice(["M", "F"], n_dem),
        "Age": rng.integers(70, 95, n_dem),
        "EDUC": rng.integers(8, 18, n_dem),
        "SES": rng.integers(1, 5, n_dem),
        "MMSE": rng.integers(15, 26, n_dem),
        "CDR": rng.choice([0.5, 1.0], n_dem),
        "eTIV": rng.integers(1200, 1700, n_dem),
        "nWBV": rng.uniform(0.65, 0.74, n_dem),
        "ASF": rng.uniform(1.0, 1.4, n_dem),
        "Visit": 1,
        "Hand": "R",
    })
    nondem = pd.DataFrame({
        "Group": ["Nondemented"] * (n - n_dem),
        "M/F": rng.choice(["M", "F"], n - n_dem),
        "Age": rng.integers(60, 90, n - n_dem),
        "EDUC": rng.integers(10, 22, n - n_dem),
        "SES": rng.integers(1, 5, n - n_dem),
        "MMSE": rng.integers(26, 31, n - n_dem),
        "CDR": rng.choice([0.0], n - n_dem),
        "eTIV": rng.integers(1300, 1900, n - n_dem),
        "nWBV": rng.uniform(0.70, 0.83, n - n_dem),
        "ASF": rng.uniform(0.9, 1.5, n - n_dem),
        "Visit": 1,
        "Hand": "R",
    })

    pd.concat([dem, nondem], ignore_index=True).to_csv(path, index=False)
    return path
```

- [ ] **Step 2: Failing test.**

`tests/models/test_tabular_oasis.py`:

```python
"""Tests for src.models.tabular_oasis."""
from __future__ import annotations

from pathlib import Path

import pytest

from src.models import tabular_oasis
from tests.fixtures.build_synthetic_oasis import build as build_synth


class TestTrainAndPredict:
    def test_train_persists_loadable_artifact(self, tmp_path: Path) -> None:
        csv = build_synth(tmp_path / "oasis.csv")
        artifact = tabular_oasis.train_from_csv(csv, tmp_path / "rf.joblib")
        assert artifact.exists()
        loaded = tabular_oasis.load(artifact)
        assert hasattr(loaded, "predict_proba")

    def test_predict_returns_full_dict(self, tmp_path: Path) -> None:
        csv = build_synth(tmp_path / "oasis.csv")
        artifact = tabular_oasis.train_from_csv(csv, tmp_path / "rf.joblib")
        model = tabular_oasis.load(artifact)
        out = tabular_oasis.predict_one(model, {
            "is_male": 1, "age": 80, "educ": 10, "ses": 3.0,
            "mmse": 18.0, "etiv": 1500.0, "nwbv": 0.68, "asf": 1.2,
        })
        assert set(out) == {"label", "label_text", "confidence", "probabilities"}
        assert out["label"] in {0, 1}
        assert out["label_text"] in {"Nondemented", "Demented"}
        assert 0.0 <= out["confidence"] <= 1.0
        probs = out["probabilities"]
        assert len(probs) == 2
        assert abs(sum(p["probability"] for p in probs) - 1.0) < 1e-5

    def test_predict_with_synthetic_demented_profile_yields_demented_label(self, tmp_path: Path) -> None:
        # The synthetic data has clean separation, so a clearly-demented profile
        # (MMSE=15, low nWBV, age 88) should classify as Demented.
        csv = build_synth(tmp_path / "oasis.csv")
        artifact = tabular_oasis.train_from_csv(csv, tmp_path / "rf.joblib")
        model = tabular_oasis.load(artifact)
        out = tabular_oasis.predict_one(model, {
            "is_male": 1, "age": 88, "educ": 8, "ses": 3.0,
            "mmse": 15.0, "etiv": 1300.0, "nwbv": 0.66, "asf": 1.3,
        })
        assert out["label_text"] == "Demented"

    def test_load_missing_artifact_raises(self, tmp_path: Path) -> None:
        with pytest.raises(FileNotFoundError, match="OASIS classifier artifact not found"):
            tabular_oasis.load(tmp_path / "missing.joblib")
```

Run → ImportError.

- [ ] **Step 3: Minimal impl.**

`src/models/tabular_oasis.py`:

```python
"""OASIS tabular Alzheimer's classifier — Random Forest with full pipeline."""
from __future__ import annotations

from pathlib import Path
from typing import Any

import joblib
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

from src.core.logger import get_logger

logger = get_logger(__name__)

FEATURE_ORDER: tuple[str, ...] = (
    "is_male", "age", "educ", "ses", "mmse", "etiv", "nwbv", "asf",
)
LABEL_NAMES: tuple[str, ...] = ("Nondemented", "Demented")


def _df_from_oasis_csv(csv_path: Path) -> tuple[pd.DataFrame, pd.Series]:
    """Replicate the notebook's preprocessing: first visit only, M/F encoded,
    Converted-as-Demented, drop unused columns, median-impute SES within EDUC groups."""
    df = pd.read_csv(csv_path)
    df = df.loc[df["Visit"] == 1].reset_index(drop=True)
    df["M/F"] = df["M/F"].replace({"F": 0, "M": 1})
    df["Group"] = df["Group"].replace({"Converted": "Demented"}).replace(
        {"Demented": 1, "Nondemented": 0}
    )
    df = df.drop(columns=[c for c in ("MRI ID", "Visit", "Hand") if c in df.columns])
    df["SES"] = df["SES"].fillna(df.groupby("EDUC")["SES"].transform("median"))

    feature_df = pd.DataFrame({
        "is_male": df["M/F"].astype(float),
        "age": df["Age"].astype(float),
        "educ": df["EDUC"].astype(float),
        "ses": df["SES"].astype(float),
        "mmse": df["MMSE"].astype(float),
        "etiv": df["eTIV"].astype(float),
        "nwbv": df["nWBV"].astype(float),
        "asf": df["ASF"].astype(float),
    })[list(FEATURE_ORDER)]
    return feature_df, df["Group"].astype(int)


def train_from_csv(csv_path: Path, artifact_path: Path) -> Path:
    """Train and persist a MinMaxScaler→RandomForest pipeline. Returns artifact path."""
    csv_path = Path(csv_path)
    artifact_path = Path(artifact_path)
    if not csv_path.exists():
        raise FileNotFoundError(f"OASIS CSV not found: {csv_path}")

    X, y = _df_from_oasis_csv(csv_path)
    pipeline = Pipeline([
        ("scaler", MinMaxScaler()),
        ("rf", RandomForestClassifier(
            n_estimators=12, max_depth=8, max_features=8,
            n_jobs=4, random_state=0,
        )),
    ])
    pipeline.fit(X, y)
    artifact_path.parent.mkdir(parents=True, exist_ok=True)
    joblib.dump(pipeline, artifact_path)
    logger.info("trained OASIS RF: n=%d, artifact=%s", len(X), artifact_path)
    return artifact_path


def load(artifact_path: Path) -> Pipeline:
    p = Path(artifact_path)
    if not p.exists():
        raise FileNotFoundError(f"OASIS classifier artifact not found: {p}")
    return joblib.load(p)


def predict_one(model: Pipeline, features: dict[str, float]) -> dict[str, Any]:
    """Predict for a single subject. `features` must have all FEATURE_ORDER keys."""
    missing = [k for k in FEATURE_ORDER if k not in features]
    if missing:
        raise ValueError(f"OASIS prediction missing features: {missing}")
    row = pd.DataFrame([{k: float(features[k]) for k in FEATURE_ORDER}])
    probs = np.asarray(model.predict_proba(row))[0]
    label_idx = int(np.argmax(probs))
    return {
        "label": label_idx,
        "label_text": LABEL_NAMES[label_idx],
        "confidence": float(probs[label_idx]),
        "probabilities": [
            {"label": i, "label_text": LABEL_NAMES[i], "probability": float(p)}
            for i, p in enumerate(probs)
        ],
    }
```

`scripts/train_oasis.py`:

```python
"""CLI: train the OASIS RF classifier and save it.

Usage:
    python scripts/train_oasis.py data/external/oasis_longitudinal.csv data/processed/oasis_rf.joblib
"""
from __future__ import annotations

import sys
from pathlib import Path

from src.models.tabular_oasis import train_from_csv


def main() -> None:
    if len(sys.argv) != 3:
        print(__doc__)
        sys.exit(1)
    csv = Path(sys.argv[1])
    out = Path(sys.argv[2])
    train_from_csv(csv, out)
    print(f"saved: {out}")


if __name__ == "__main__":
    main()
```

Run tests → 4 passed.

- [ ] **Step 4:** commit: `feat(models): OASIS tabular Alzheimer's RF classifier (joblib + train CLI)`.

---

### Task 2: Extend fusion's clinical inputs

**Files:**
- Modify: `src/fusion/types.py` (extend `ClinicalScores`)
- Modify: `src/fusion/clinical.py` (add normalisers for the new fields)
- Modify: `tests/fusion/test_types.py` (loosen / extend bound tests)
- Modify: `tests/fusion/test_clinical.py` (add new normaliser tests)

- [ ] **Step 1: Failing test for new ClinicalScores fields.**

In `tests/fusion/test_types.py`, append:

```python
class TestExtendedClinicalScores:
    def test_etiv_in_range(self) -> None:
        s = ClinicalScores(etiv=1500.0)
        assert s.etiv == pytest.approx(1500.0)

    def test_etiv_out_of_range_rejected(self) -> None:
        with pytest.raises(ValidationError):
            ClinicalScores(etiv=5000.0)

    def test_nwbv_in_range(self) -> None:
        s = ClinicalScores(nwbv=0.72)
        assert s.nwbv == pytest.approx(0.72)
```

- [ ] **Step 2: Update `src/fusion/types.py` ClinicalScores.**

Add fields (preserve existing ones):

```python
class ClinicalScores(BaseModel):
    mmse: Annotated[float, Field(ge=0.0, le=30.0)] | None = None
    moca: Annotated[float, Field(ge=0.0, le=30.0)] | None = None
    updrs: Annotated[float, Field(ge=0.0, le=199.0)] | None = None
    gait_speed_m_s: Annotated[float, Field(ge=0.0, le=2.5)] | None = None
    age_years: Annotated[float, Field(ge=0.0, le=120.0)] | None = None
    # OASIS biomarkers — used by the tabular_oasis modality.
    etiv: Annotated[float, Field(ge=900.0, le=2200.0)] | None = None
    nwbv: Annotated[float, Field(ge=0.5, le=0.95)] | None = None
    asf: Annotated[float, Field(ge=0.5, le=2.0)] | None = None
    educ: Annotated[float, Field(ge=0.0, le=30.0)] | None = None
    ses: Annotated[float, Field(ge=1.0, le=5.0)] | None = None
    is_male: Annotated[int, Field(ge=0, le=1)] | None = None
```

- [ ] **Step 3:** the tests should pass after the type change. `pytest tests/fusion/test_types.py -v`.

- [ ] **Step 4:** commit: `feat(fusion): extend ClinicalScores with OASIS biomarker fields`.

---

### Task 3: Wire `tabular_oasis` modality into the fusion engine

**Files:**
- Modify: `src/fusion/weights.py`
- Modify: `src/fusion/engine.py`
- Create: `tests/fusion/test_tabular_oasis_modality.py`

- [ ] **Step 1: Update weights.**

`src/fusion/weights.py`, in the `alzheimers` table:

```python
"alzheimers": {
    "mri": 0.25,            # was 0.35
    "eeg": 0.15,            # was 0.20
    "tabular_oasis": 0.20,  # new
    "clinical_mmse": 0.20,
    "clinical_moca": 0.10,  # was 0.15
    "clinical_age": 0.10,
},
```

The values above already sum to 1.0; if you tune them further, keep the table summing to 1.0. Add a code comment noting that the re-balancing may shift existing fusion tests' expected values, and verify which tests need updating.
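The sum-to-1.0 invariant is easy to assert mechanically (a sketch assuming the table is a plain nested dict, as the excerpt above suggests; the real values live in `src/fusion/weights.py`):

```python
# Hypothetical excerpt of the re-balanced table, for illustration only.
WEIGHTS = {
    "alzheimers": {
        "mri": 0.25,
        "eeg": 0.15,
        "tabular_oasis": 0.20,
        "clinical_mmse": 0.20,
        "clinical_moca": 0.10,
        "clinical_age": 0.10,
    },
}

# Every disease's modality weights must sum to 1.0 (up to float noise).
for disease, table in WEIGHTS.items():
    total = sum(table.values())
    assert abs(total - 1.0) < 1e-9, f"{disease} weights sum to {total}, not 1.0"
```

Dropping that loop into a small test keeps future re-balancing honest.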

- [ ] **Step 2: Failing fusion-modality test.**

`tests/fusion/test_tabular_oasis_modality.py`:

```python
"""Tests: tabular_oasis modality contributes to alzheimers fusion score."""
from __future__ import annotations

from pathlib import Path

import pytest

from src.fusion import engine
from src.fusion.types import ClinicalScores, FusionInput
from src.models.tabular_oasis import train_from_csv
from tests.fixtures.build_synthetic_oasis import build as build_synth


@pytest.fixture()
def trained_artifact(tmp_path: Path, monkeypatch) -> Path:
    csv = build_synth(tmp_path / "oasis.csv")
    art = train_from_csv(csv, tmp_path / "rf.joblib")
    monkeypatch.setenv("OASIS_RF_ARTIFACT", str(art))
    return art


class TestTabularOasisModality:
    def test_demented_profile_raises_alzheimers(self, trained_artifact: Path) -> None:
        out = engine.fuse(FusionInput(clinical=ClinicalScores(
            is_male=1, age_years=88, educ=8, ses=3.0,
            mmse=15.0, etiv=1300.0, nwbv=0.66, asf=1.3,
        )))
        alz = next(d for d in out.diseases if d.disease == "alzheimers")
        assert alz.probability > 0.6
        assert any(c.modality == "tabular_oasis" for c in alz.contributions)

    def test_missing_oasis_inputs_skips_modality(self, trained_artifact: Path) -> None:
        # MMSE alone but no etiv/nwbv → tabular_oasis should be skipped, not error.
        out = engine.fuse(FusionInput(clinical=ClinicalScores(mmse=12.0)))
        alz = next(d for d in out.diseases if d.disease == "alzheimers")
        names = {c.modality for c in alz.contributions}
        assert "tabular_oasis" not in names
```

- [ ] **Step 3: Update the engine.**

In `src/fusion/engine.py`, add a tabular-modality dispatcher that lazy-loads the joblib artifact once and treats the OASIS classifier's `P(Demented)` as the alzheimers signal `2*P-1`:

```python
import os
from pathlib import Path
from typing import Any

_oasis_cache: dict[str, Any] = {}


def _signal_for_tabular_oasis(disease: str, clinical: ClinicalScores) -> float | None:
    if disease != "alzheimers":
        return None
    required = ("is_male", "age_years", "educ", "ses", "mmse", "etiv", "nwbv", "asf")
    if any(getattr(clinical, k, None) is None for k in required):
        return None
    artifact = os.environ.get("OASIS_RF_ARTIFACT", "data/processed/oasis_rf.joblib")
    artifact_path = Path(artifact)
    if not artifact_path.exists():
        logger.warning("tabular_oasis artifact missing at %s; skipping modality", artifact_path)
        return None
    if "model" not in _oasis_cache:
        from src.models.tabular_oasis import load
        _oasis_cache["model"] = load(artifact_path)
    from src.models.tabular_oasis import predict_one
    feats = {
        "is_male": int(clinical.is_male),
        "age": float(clinical.age_years),
        "educ": float(clinical.educ),
        "ses": float(clinical.ses),
        "mmse": float(clinical.mmse),
        "etiv": float(clinical.etiv),
        "nwbv": float(clinical.nwbv),
        "asf": float(clinical.asf),
    }
    pred = predict_one(_oasis_cache["model"], feats)
    p_dem = next(p["probability"] for p in pred["probabilities"] if p["label_text"] == "Demented")
    return 2.0 * p_dem - 1.0
```

In `_signal_for_modality`, add the dispatch:

```python
if modality_key == "tabular_oasis":
    return _signal_for_tabular_oasis(disease, clinical)
```

- [ ] **Step 4:** `pytest tests/fusion/ -v` — expect re-balancing to perturb a couple of existing thresholds. Adjust thresholds in the affected tests (e.g., the disagreement test) so they still hold with the new weights, OR adjust the new weights so existing tests still pass within tolerance. Prefer the latter — existing thresholds were chosen carefully.

- [ ] **Step 5:** commit: `feat(fusion): add tabular_oasis modality with lazy joblib load`.

---
### Task 4: API + Streamlit + README

**Files:**
- Modify: `src/api/routes.py` — add `POST /predict/tabular_oasis`
- Modify: `src/api/schemas.py` — request/response schemas
- Modify: `src/frontend/app.py` — extend the Doctor view's clinical-input form with eTIV / nWBV / ASF / EDUC / SES
- Modify: `README.md` — describe the new modality and the OASIS dataset path

- [ ] **Step 1: New schemas.**

`src/api/schemas.py`:

```python
class TabularOasisRequest(BaseModel):
    is_male: int = Field(..., ge=0, le=1)
    age: float = Field(..., ge=0.0, le=120.0)
    educ: float = Field(..., ge=0.0, le=30.0)
    ses: float = Field(..., ge=1.0, le=5.0)
    mmse: float = Field(..., ge=0.0, le=30.0)
    etiv: float = Field(..., ge=900.0, le=2200.0)
    nwbv: float = Field(..., ge=0.5, le=0.95)
    asf: float = Field(..., ge=0.5, le=2.0)


class TabularOasisProbability(BaseModel):
    label: int
    label_text: str
    probability: float


class TabularOasisResponse(BaseModel):
    label: int
    label_text: str
    confidence: float
    probabilities: list[TabularOasisProbability]
```

- [ ] **Step 2: Route.**

`src/api/routes.py`:

```python
@predict_router.post("/tabular_oasis", response_model=TabularOasisResponse)
def predict_tabular_oasis(req: TabularOasisRequest) -> TabularOasisResponse:
    from src.models.tabular_oasis import load, predict_one
    artifact = Path(os.environ.get("OASIS_RF_ARTIFACT", "data/processed/oasis_rf.joblib"))
    model = load(artifact)
    out = predict_one(model, req.model_dump())
    return TabularOasisResponse(**out)
```

- [ ] **Step 3: Test (`tests/api/test_tabular_oasis_route.py`).**

```python
"""Integration: POST /predict/tabular_oasis."""
from __future__ import annotations

import pytest
from fastapi.testclient import TestClient

from src.api.main import app
from src.models.tabular_oasis import train_from_csv
from tests.fixtures.build_synthetic_oasis import build as build_synth


@pytest.fixture()
def client(monkeypatch, tmp_path):
    csv = build_synth(tmp_path / "oasis.csv")
    artifact = train_from_csv(csv, tmp_path / "rf.joblib")
    monkeypatch.setenv("OASIS_RF_ARTIFACT", str(artifact))
    return TestClient(app)


def test_predict_tabular_oasis_demented_profile(client):
    body = {
        "is_male": 1, "age": 88, "educ": 8, "ses": 3.0,
        "mmse": 15.0, "etiv": 1300.0, "nwbv": 0.66, "asf": 1.3,
    }
    r = client.post("/predict/tabular_oasis", json=body)
    assert r.status_code == 200, r.text
    data = r.json()
    assert data["label_text"] == "Demented"
```

- [ ] **Step 4:** Streamlit form extension. In `src/frontend/app.py`, find the clinical-inputs section the doctor view exposes (likely under a "Clinical scores" expander; if absent, add it under the fusion tab). Add number_input widgets for the seven new fields (`is_male`, `age`, `educ`, `ses`, `etiv`, `nwbv`, `asf`) that flow into the existing `/fusion/predict` payload's `clinical` block.

- [ ] **Step 5:** README. Append:

````markdown
### OASIS Tabular Alzheimer's Classifier

A scikit-learn Random Forest trained on the OASIS longitudinal dataset (https://www.oasis-brains.org/) classifies Demented vs Nondemented from 8 biomarkers (sex, age, education, SES, MMSE, eTIV, nWBV, ASF). It contributes to the fusion engine as modality `tabular_oasis` (weight 0.20 for Alzheimer's).

To use: download `oasis_longitudinal.csv` from Kaggle, save to `data/external/oasis_longitudinal.csv`, then:

```bash
python scripts/train_oasis.py data/external/oasis_longitudinal.csv data/processed/oasis_rf.joblib
export OASIS_RF_ARTIFACT=data/processed/oasis_rf.joblib
```

The fusion engine and `POST /predict/tabular_oasis` will pick it up. If the artifact is missing, the modality is skipped — fusion still works.
````

- [ ] **Step 6:** commit: `feat(oasis): /predict/tabular_oasis route + Streamlit form + README`.

---

## Self-review checklist

1. **Independence.** OASIS classifier and fusion remain decoupled when the artifact is absent (`OASIS_RF_ARTIFACT` unset → modality skipped). ✓
2. **No real-data fabrication.** Tests use a clearly-labelled synthetic CSV. The real OASIS dataset is never committed. ✓
3. **Backward compatibility.** Existing `ClinicalScores` fields untouched. New fields are all `Optional`. ✓
4. **Branch 3a vs 3b.** This plan is Branch 3a. If the user picks Branch 3b, this plan is replaced wholesale.

---

## Execution handoff

Save and choose: subagent-driven (recommended) or inline executing-plans.

**Reminder to controller:** before starting any task, confirm with the user: "Do you have a real EEG checkpoint I'm missing, or shall I proceed with Branch 3a (OASIS tabular Alzheimer's classifier)?"
docs/superpowers/plans/2026-05-02-tfidf-rag-integration.md ADDED
@@ -0,0 +1,653 @@
# TF-IDF Clinical RAG Integration Plan

> **For agentic workers:** REQUIRED SUB-SKILL: `superpowers:subagent-driven-development`. TDD throughout.

**Goal.** Integrate the user's pre-built TF-IDF RAG corpus (14 medical PDFs covering Alzheimer's, Parkinson's, lifestyle, nutrition, exercise; Turkish + English query expansion; pre-built `rag_index.pkl`) into the platform alongside the existing FAISS+fastembed RAG. Both run side-by-side; the agent picks per query.

**Architecture.** A new `src/rag/clinical/` sub-package wraps the user's `rag.py` script as an importable module. The existing `retrieve_context` agent tool grows a `corpus` parameter: `"reference"` selects the current FAISS behaviour and stays the default; `"clinical"` selects the new TF-IDF index. A module-level retriever object is constructed once at startup and reused. This is a pure addition: existing tests and behaviour stay green.

**Tech stack.** scikit-learn (expected to be in deps already via the existing pipelines; verify, and add it if missing), `pypdf`, `numpy`. No FAISS or new embedding model. The user's pickle deserialises with the stdlib `pickle` module, provided `sklearn.feature_extraction.text.TfidfVectorizer` is importable.

---

## Prerequisite (controller blocker)

The corpus and index live at `/Users/mertgungor/Downloads/rag/`. Before any task starts:

1. **Source PDFs.** Copy `Downloads/rag/HACKATHON/*.pdf` to `data/external_rag/clinical_pdfs/` in this repo. **Do NOT commit the PDFs to git**: they are external research papers, possibly copyrighted, and large (~33 MB total). Add `data/external_rag/` to `.gitignore` unless an existing pattern already covers it (the repo currently ignores `data/processed/`, which does not cover `data/external_rag/`).

2. **Pre-built index.** Copy `Downloads/rag/index/rag_index.pkl` to `data/external_rag/index/rag_index.pkl`. Also gitignored.

3. **Verify the pickle loads.** `python -c "import pickle; print(list(pickle.load(open('data/external_rag/index/rag_index.pkl','rb')).keys()))"`. Expect: `['created_at', 'source_dir', 'chunk_words', 'overlap_words', 'chunks', 'vectorizer', 'matrix']`.

If the pickle fails to load (sklearn version mismatch, missing module), Task 1 has a regenerate fallback: rebuild the index from the PDFs using the same parameters as `Downloads/rag/rag.py`.
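A rebuild must re-chunk the PDF text the way the original index did. A minimal sketch of word-window chunking, assuming the `chunk_words`/`overlap_words` semantics recorded in the payload (220-word windows, 45-word overlap); the helper name is illustrative, not from the user's script:

```python
def window_chunks(text: str, chunk_words: int = 220, overlap_words: int = 45) -> list[str]:
    """Split text into overlapping word windows matching the index metadata."""
    words = text.split()
    step = chunk_words - overlap_words  # consecutive windows share `overlap_words` words
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_words]))
        if start + chunk_words >= len(words):
            break  # the last window already reached the end of the document
    return chunks


# A 500-word document yields windows starting at word 0, 175, 350.
pieces = window_chunks("w " * 500)
```

The last window is allowed to run short rather than padding, which matches how TF-IDF treats documents of unequal length.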

---

## File structure

| Path | Responsibility |
|---|---|
| Modify `requirements.txt` | confirm `scikit-learn` and `pypdf` are present (sklearn likely is; pypdf may not be) |
| Modify `.gitignore` | ensure `data/external_rag/` is ignored |
| Create `src/rag/clinical/__init__.py` | package marker |
| Create `src/rag/clinical/types.py` | `ClinicalChunk` dataclass + `ClinicalRetrievalResult` pydantic |
| Create `src/rag/clinical/loader.py` | unpickle the index; handle the user's payload schema; rebuild from PDFs as fallback |
| Create `src/rag/clinical/retrieve.py` | TF-IDF query + Turkish/English query expansion + sentence-level evidence picking |
| Modify `src/agents/tools.py` | `retrieve_context` accepts `corpus: Literal["reference", "clinical"]` |
| Modify `src/agents/prompts.py` | one-line update describing the corpus parameter |
| Create `tests/rag/test_clinical_loader.py` | unpickle + rebuild tests using a tiny fixture corpus |
| Create `tests/rag/test_clinical_retrieve.py` | retrieval correctness tests |
| Create `tests/agents/test_tools_clinical_corpus.py` | end-to-end agent-tool routing |
| Create `tests/fixtures/build_tiny_clinical_index.py` | builds a 2-page synthetic PDF index for tests |
| Modify `README.md` | document the dual-corpus surface |

---

## Tasks

### Task 0: Deps + gitignore + asset copy verification

**Files:** `requirements.txt`, `.gitignore`

- [ ] **Step 1:** check `requirements.txt`. If `pypdf` is absent, add `pypdf>=4.0,<6.0`. `scikit-learn` should already be there from the existing pipelines — verify with `pip show scikit-learn`.

- [ ] **Step 2:** open `.gitignore`. If `data/external_rag/` (or a parent that covers it) is not ignored, add a single line: `data/external_rag/`.

- [ ] **Step 3:** verify the asset transfer. `ls data/external_rag/clinical_pdfs/*.pdf | wc -l` should print `14` and `ls -lh data/external_rag/index/rag_index.pkl` should show ~12.9 MB.

- [ ] **Step 4:** if pypdf was added, run `pip install -r requirements.txt`.

- [ ] **Step 5:** `pytest -q` baseline. Commit: `chore(rag): pin pypdf; gitignore external rag corpus`.

---

### Task 1: Loader for the pre-built TF-IDF index

**Files:**
- Create: `src/rag/clinical/__init__.py`
- Create: `src/rag/clinical/types.py`
- Create: `src/rag/clinical/loader.py`
- Create: `tests/rag/test_clinical_loader.py`
- Create: `tests/fixtures/build_tiny_clinical_index.py`

- [ ] **Step 1: Build the tiny-fixture helper.**

`tests/fixtures/build_tiny_clinical_index.py`:

```python
"""Build a synthetic TF-IDF clinical-RAG index for tests.

Avoids needing real PDFs. Constructs the same payload schema the user's
rag.py produces so the loader can be tested independently of pypdf.
"""
from __future__ import annotations

import pickle
from dataclasses import dataclass
from datetime import datetime
from pathlib import Path

from sklearn.feature_extraction.text import TfidfVectorizer


# Same schema the user's rag.py produces. We define our own dataclass here
# so the test fixture is self-contained.
@dataclass(frozen=True)
class _Chunk:
    chunk_id: int
    source: str
    page_start: int
    page_end: int
    text: str


def build(path: Path) -> Path:
    """Save a tiny TF-IDF index at `path`."""
    path = Path(path)
    if path.exists():
        return path
    path.parent.mkdir(parents=True, exist_ok=True)

    chunks = [
        _Chunk(0, "alzheimers_lifestyle.pdf", 1, 1,
               "Aerobic exercise and Mediterranean diet are associated with reduced cognitive decline in older adults at risk for Alzheimer's disease."),
        _Chunk(1, "parkinsons_motor.pdf", 1, 1,
               "Levodopa remains the most effective symptomatic treatment for motor symptoms of Parkinson's disease."),
        _Chunk(2, "alzheimers_mci.pdf", 2, 2,
               "Mild cognitive impairment may progress to dementia; MMSE and MoCA are standard screening tools."),
        _Chunk(3, "parkinsons_nutrition.pdf", 1, 1,
               "Dietary patterns rich in antioxidants and omega-3 fatty acids are linked to lower Parkinson's risk."),
    ]

    vectorizer = TfidfVectorizer(lowercase=True, ngram_range=(1, 2), min_df=1, norm="l2")
    matrix = vectorizer.fit_transform([c.text for c in chunks])

    payload = {
        "created_at": datetime.now().isoformat(timespec="seconds"),
        "source_dir": str(path.parent),
        "chunk_words": 220,
        "overlap_words": 45,
        "chunks": chunks,
        "vectorizer": vectorizer,
        "matrix": matrix,
    }
    with path.open("wb") as f:
        pickle.dump(payload, f)
    return path
```

- [ ] **Step 2: Failing test.**

`tests/rag/test_clinical_loader.py`:

```python
"""Tests for src.rag.clinical.loader."""
from __future__ import annotations

from pathlib import Path

import pytest

from src.rag.clinical import loader
from tests.fixtures.build_tiny_clinical_index import build as build_tiny


class TestLoadIndex:
    def test_load_returns_payload_with_expected_keys(self, tmp_path: Path) -> None:
        idx_path = build_tiny(tmp_path / "tiny.pkl")
        payload = loader.load_index(idx_path)
        assert {"chunks", "vectorizer", "matrix"} <= set(payload)
        assert len(payload["chunks"]) == 4

    def test_missing_index_raises(self, tmp_path: Path) -> None:
        with pytest.raises(FileNotFoundError, match="clinical RAG index not found"):
            loader.load_index(tmp_path / "nope.pkl")

    def test_unique_sources(self, tmp_path: Path) -> None:
        idx_path = build_tiny(tmp_path / "tiny.pkl")
        payload = loader.load_index(idx_path)
        sources = {c.source for c in payload["chunks"]}
        assert sources == {
            "alzheimers_lifestyle.pdf", "parkinsons_motor.pdf",
            "alzheimers_mci.pdf", "parkinsons_nutrition.pdf",
        }
```

Run → ImportError on `src.rag.clinical.loader`.

- [ ] **Step 3: Minimal impl.**

`src/rag/clinical/__init__.py`: empty.

`src/rag/clinical/types.py`:

```python
"""Types shared across clinical-RAG modules."""
from __future__ import annotations

from dataclasses import dataclass

from pydantic import BaseModel, Field


# The user's rag.py uses a frozen dataclass with this exact name and field
# set. We define the same shape here for our own code and for rebuilt
# indexes; note that unpickling resolves classes by module path, so pickles
# produced by the user's script still need the original Chunk importable.
@dataclass(frozen=True)
class ClinicalChunk:
    chunk_id: int
    source: str
    page_start: int
    page_end: int
    text: str


class ClinicalEvidence(BaseModel):
    sentence: str
    source: str
    page_start: int
    page_end: int
    score: float = Field(..., ge=0.0)


class ClinicalRetrievalResult(BaseModel):
    query: str
    evidence: list[ClinicalEvidence]
    summary_text: str = Field(..., description="Pre-formatted RAG feedback string for the agent")
```

`src/rag/clinical/loader.py`:

```python
"""Load (or rebuild) the TF-IDF clinical RAG index."""
from __future__ import annotations

import pickle
from pathlib import Path
from typing import Any

from src.core.logger import get_logger

logger = get_logger(__name__)


def load_index(path: Path) -> dict[str, Any]:
    """Unpickle a TF-IDF index produced by the user's rag.py."""
    path = Path(path)
    if not path.exists():
        raise FileNotFoundError(f"clinical RAG index not found: {path}")
    with path.open("rb") as f:
        payload = pickle.load(f)
    if "chunks" not in payload or "vectorizer" not in payload or "matrix" not in payload:
        raise ValueError(f"clinical RAG index missing expected keys: {sorted(payload)}")
    logger.info("loaded clinical RAG index: %d chunks from %s", len(payload["chunks"]), path)
    return payload
```
Note: unpickling resolves the chunk dataclass from the module it was pickled in, so the source `Chunk` class must stay importable. scikit-learn pickles usually survive patch-level version drift; if a major-version jump breaks deserialisation, rebuilding the index from the copied PDFs with the user's `rag.py` (same parameters) is the recovery path.
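Whether a pickle's classes are importable can be checked cheaply before committing to a full load: `pickle.Unpickler.find_class` is the documented hook through which every class reference resolves, so overriding it lists what the index will demand. A hedged sketch, not part of the plan's loader:

```python
import io
import pickle
from dataclasses import dataclass


class ClassScanner(pickle.Unpickler):
    """Record every (module, name) the pickle asks for, substituting inert stubs."""

    def __init__(self, data: bytes) -> None:
        super().__init__(io.BytesIO(data))
        self.required: list[tuple[str, str]] = []

    def find_class(self, module: str, name: str):
        self.required.append((module, name))
        # Return a permissive stand-in so scanning works without the real class.
        return type(name, (), {"__setstate__": lambda self, state: None})


@dataclass(frozen=True)
class SampleChunk:
    chunk_id: int
    text: str


payload = pickle.dumps(SampleChunk(0, "hello"))
scanner = ClassScanner(payload)
try:
    scanner.load()
except Exception:
    pass  # stubs may not survive every reduce path; the class list is what matters
needed = {name for _, name in scanner.required}
```

Running this against the real `rag_index.pkl` would reveal upfront whether the user's `Chunk` module and the expected sklearn classes resolve in this environment.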

Run tests → 3 passed.

- [ ] **Step 4:** `pytest -q` no regressions.

- [ ] **Step 5:** commit: `feat(rag): clinical TF-IDF index loader`.

---

### Task 2: Retrieval (TF-IDF query + Turkish/English expansion + evidence picking)

**Files:**
- Create: `src/rag/clinical/retrieve.py`
- Create: `tests/rag/test_clinical_retrieve.py`

- [ ] **Step 1: Failing test.**

`tests/rag/test_clinical_retrieve.py`:

```python
"""Tests for src.rag.clinical.retrieve."""
from __future__ import annotations

from pathlib import Path

import pytest

from src.rag.clinical.retrieve import retrieve_clinical
from src.rag.clinical.loader import load_index
from tests.fixtures.build_tiny_clinical_index import build as build_tiny


class TestRetrieve:
    def test_alzheimer_query_picks_alzheimer_chunks(self, tmp_path: Path) -> None:
        payload = load_index(build_tiny(tmp_path / "tiny.pkl"))
        result = retrieve_clinical(payload, query="exercise and Alzheimer's", top_k=2)
        sources = {ev.source for ev in result.evidence}
        assert any("alzheimers" in s for s in sources)

    def test_parkinson_query_picks_parkinson_chunks(self, tmp_path: Path) -> None:
        payload = load_index(build_tiny(tmp_path / "tiny.pkl"))
        result = retrieve_clinical(payload, query="Parkinson levodopa", top_k=2)
        sources = {ev.source for ev in result.evidence}
        assert any("parkinsons" in s for s in sources)

    def test_turkish_keyword_routes_via_expansion(self, tmp_path: Path) -> None:
        # User's rag.py expands "egzersiz" -> "exercise physical activity ...".
        # Our retrieve must honour the same expansion table so Turkish queries
        # hit English chunks.
        payload = load_index(build_tiny(tmp_path / "tiny.pkl"))
        result = retrieve_clinical(payload, query="egzersiz Alzheimer", top_k=2)
        # Turkish "egzersiz" + "alzheimer" should pick the lifestyle PDF.
        assert any("alzheimers_lifestyle" in ev.source for ev in result.evidence)

    def test_summary_text_contains_citations(self, tmp_path: Path) -> None:
        payload = load_index(build_tiny(tmp_path / "tiny.pkl"))
        result = retrieve_clinical(payload, query="diet and Parkinson", top_k=2)
        # Summary should embed source filenames so the LLM has citations.
        assert any(ev.source in result.summary_text for ev in result.evidence)

    def test_empty_query_returns_empty_evidence(self, tmp_path: Path) -> None:
        payload = load_index(build_tiny(tmp_path / "tiny.pkl"))
        result = retrieve_clinical(payload, query="", top_k=2)
        assert result.evidence == []
```

Run → ImportError.

- [ ] **Step 2: Minimal impl.**

`src/rag/clinical/retrieve.py`:

```python
"""TF-IDF retrieval over the clinical-paper corpus, with Turkish→English query expansion."""
from __future__ import annotations

import re
from textwrap import shorten
from typing import Any

import numpy as np

from src.core.logger import get_logger
from src.rag.clinical.types import ClinicalEvidence, ClinicalRetrievalResult

logger = get_logger(__name__)

# Mirrors the table in /Users/mertgungor/Downloads/rag/rag.py so the same
# Turkish keyword set produces the same expansion in both pipelines.
_QUERY_EXPANSIONS: dict[str, str] = {
    "alzheimer": "alzheimer dementia cognitive impairment mild cognitive impairment mci memory",
    "demans": "dementia alzheimer cognitive impairment memory cognition",
    "unutkanlik": "memory impairment cognitive decline dementia alzheimer",
    "parkinson": "parkinson disease movement disorder tremor motor symptoms non motor symptoms",
    "titreme": "tremor parkinson motor symptoms movement disorder",
    "egzersiz": "exercise physical activity training aerobic resistance cognition",
    "beslenme": "nutrition diet lifestyle metabolic risk factors",
    "risk": "risk factors lifestyle metabolic nutrition prevention",
    "tani": "diagnosis diagnostic criteria assessment screening",
    "tedavi": "treatment management therapy intervention",
}


def _expand_query(query: str) -> str:
    normalized = query.casefold()
    extras = [exp for key, exp in _QUERY_EXPANSIONS.items() if key in normalized]
    return f"{query} {' '.join(extras)}" if extras else query


def _split_sentences(text: str) -> list[str]:
    sentences = re.split(r"(?<=[.!?])\s+", text)
    return [s.strip() for s in sentences if len(s.split()) >= 6]


def _query_terms(expanded: str) -> set[str]:
    return {t for t in re.findall(r"[A-Za-z0-9]+", expanded.lower()) if len(t) >= 4}


def retrieve_clinical(
    payload: dict[str, Any],
    query: str,
    top_k: int = 5,
    evidence_limit: int = 5,
) -> ClinicalRetrievalResult:
    """Run TF-IDF search over the clinical corpus, return evidence + a feedback summary."""
    if not query.strip():
        return ClinicalRetrievalResult(query=query, evidence=[], summary_text="")

    vectorizer = payload["vectorizer"]
    matrix = payload["matrix"]
    chunks = payload["chunks"]

    expanded = _expand_query(query)
    qv = vectorizer.transform([expanded])
    scores = (matrix @ qv.T).toarray().ravel()
    if not np.any(scores):
        return ClinicalRetrievalResult(query=query, evidence=[], summary_text="")

    top_indices = np.argsort(scores)[::-1][:top_k]
    top_chunks = [(chunks[int(i)], float(scores[int(i)])) for i in top_indices if scores[int(i)] > 0]

    # Sentence-level evidence picking: pick the highest-overlap sentences first.
    terms = _query_terms(expanded)
    candidates: list[tuple[float, str, Any, float]] = []
    for chunk, chunk_score in top_chunks:
        for sentence in _split_sentences(chunk.text):
            sent_terms = set(re.findall(r"[A-Za-z0-9]+", sentence.lower()))
            overlap = len(terms & sent_terms)
            if overlap == 0:
                continue
            candidates.append((overlap + chunk_score, sentence, chunk, chunk_score))

    candidates.sort(key=lambda item: item[0], reverse=True)
    seen: set[str] = set()
    evidence: list[ClinicalEvidence] = []
    for _, sent, chunk, sc in candidates:
        fp = sent[:120].lower()
        if fp in seen:
            continue
        seen.add(fp)
        evidence.append(ClinicalEvidence(
            sentence=shorten(sent, width=420, placeholder="..."),
            source=chunk.source,
            page_start=chunk.page_start,
            page_end=chunk.page_end,
            score=sc,
        ))
        if len(evidence) >= evidence_limit:
            break

    if not evidence:
        # Fall back to chunk-level evidence if no sentence overlapped.
        for chunk, sc in top_chunks[:evidence_limit]:
            evidence.append(ClinicalEvidence(
                sentence=shorten(chunk.text, width=420, placeholder="..."),
                source=chunk.source,
                page_start=chunk.page_start,
                page_end=chunk.page_end,
                score=sc,
            ))

    lines = ["Clinical RAG evidence (not a medical diagnosis):"]
    for ev in evidence:
        page = (
            f"p.{ev.page_start}" if ev.page_start == ev.page_end
            else f"pp.{ev.page_start}-{ev.page_end}"
        )
        lines.append(f"- {ev.sentence} [{ev.source}, {page} | score={ev.score:.3f}]")
    summary = "\n".join(lines)

    return ClinicalRetrievalResult(query=query, evidence=evidence, summary_text=summary)
```

Run tests → 5 passed.
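A note on the scoring line `scores = (matrix @ qv.T).toarray().ravel()`: because the vectorizer is built with `norm="l2"`, every document row and the query vector are unit-length, so the plain dot product already equals cosine similarity and no separate normalisation step is needed. A dependency-free sketch of that identity:

```python
import math


def l2_normalize(v: list[float]) -> list[float]:
    """Scale a vector to unit length, as TfidfVectorizer(norm="l2") does per row."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


doc = l2_normalize([3.0, 4.0])    # becomes [0.6, 0.8]
query = l2_normalize([4.0, 3.0])  # becomes [0.8, 0.6]
# cosine(a, b) = dot(a, b) / (|a| * |b|); with unit vectors the denominator is 1
cosine = dot(doc, query)
```

This is why the plan can rank chunks with a single sparse matrix-vector product rather than calling a separate cosine routine.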

- [ ] **Step 3:** commit: `feat(rag): TF-IDF clinical retrieval with Turkish/English query expansion`.

---

### Task 3: Wire into the agent's `retrieve_context` tool

**Files:**
- Modify: `src/agents/tools.py`
- Modify: `src/agents/prompts.py`
- Create: `tests/agents/test_tools_clinical_corpus.py`

- [ ] **Step 1: Failing test.**

`tests/agents/test_tools_clinical_corpus.py`:

```python
"""Tests: retrieve_context tool dispatches by `corpus`."""
from __future__ import annotations

from pathlib import Path

from src.agents.tools import build_default_tools
from tests.fixtures.build_tiny_clinical_index import build as build_tiny


class TestClinicalCorpus:
    def test_corpus_default_is_reference(self, tmp_path: Path) -> None:
        clinical_idx = build_tiny(tmp_path / "tiny.pkl")
        tools = {t.name: t for t in build_default_tools(
            rag_index_dir=None,
            clinical_rag_index_path=clinical_idx,
        )}
        tool = tools["retrieve_context"]
        # `corpus` not provided → defaults to "reference".
        out = tool.execute(tool.input_model.model_validate({"query": "test"}))
        # With rag_index_dir=None, reference returns empty.
        assert hasattr(out, "chunks")

    def test_clinical_corpus_returns_evidence(self, tmp_path: Path) -> None:
        clinical_idx = build_tiny(tmp_path / "tiny.pkl")
        tools = {t.name: t for t in build_default_tools(
            rag_index_dir=None,
            clinical_rag_index_path=clinical_idx,
        )}
        tool = tools["retrieve_context"]
        out = tool.execute(tool.input_model.model_validate({
            "query": "exercise and Alzheimer",
            "corpus": "clinical",
        }))
        assert len(out.chunks) > 0
        # Each returned chunk has source + page metadata.
        for c in out.chunks:
            assert "source" in c and "text" in c
```

Run → fails (signature mismatch).

- [ ] **Step 2: Wire the tool.**

In `src/agents/tools.py`, the existing `RetrieveContextInput` needs the `corpus` field. Find the schemas (likely in `src/agents/schemas.py`) and add:

```python
from typing import Literal

class RetrieveContextInput(BaseModel):
    query: str
    k: int = 4
    corpus: Literal["reference", "clinical"] = "reference"
```

`RetrieveContextOutput.chunks` already accepts dicts; no change needed there.

In `src/agents/tools.py`, add a `clinical_rag_index_path: Path | None = None` parameter to `build_default_tools`. Update `_make_retrieve_executor` to take both index sources:

```python
def _make_retrieve_executor(
    rag_index_dir: Path | None,
    clinical_rag_index_path: Path | None,
) -> Callable[[RetrieveContextInput], RetrieveContextOutput]:
    # Lazily load the clinical payload at first use, cache for subsequent calls.
    clinical_cache: dict[str, Any] = {}

    def execute(inp: RetrieveContextInput) -> RetrieveContextOutput:
        if inp.corpus == "clinical":
            if clinical_rag_index_path is None:
                logger.warning("retrieve_context corpus=clinical but no index path configured")
                return RetrieveContextOutput(chunks=[])
            if "payload" not in clinical_cache:
                from src.rag.clinical.loader import load_index
                clinical_cache["payload"] = load_index(clinical_rag_index_path)
            from src.rag.clinical.retrieve import retrieve_clinical
            result = retrieve_clinical(clinical_cache["payload"], inp.query, top_k=inp.k)
            return RetrieveContextOutput(chunks=[
                {
                    "source": ev.source,
                    "page_start": ev.page_start,
                    "page_end": ev.page_end,
                    "text": ev.sentence,
                    "score": ev.score,
                }
                for ev in result.evidence
            ])

        # corpus == "reference" — existing FAISS path. Keep current behaviour.
        ... (preserve existing executor body) ...

    return execute
```
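The `clinical_cache` dict above is a plain memoizing closure: the first call pays the unpickling cost, later calls reuse the payload. Stripped of the RAG specifics, the pattern is (a generic sketch, names illustrative):

```python
from typing import Any, Callable


def lazy_once(factory: Callable[[], Any]) -> Callable[[], Any]:
    """Return a zero-arg function that invokes `factory` at most once."""
    cache: dict[str, Any] = {}

    def get() -> Any:
        if "value" not in cache:
            cache["value"] = factory()  # expensive work happens only here
        return cache["value"]

    return get


calls: list[str] = []
load = lazy_once(lambda: calls.append("load") or {"chunks": 4})
first, second = load(), load()  # factory runs once; both names share the payload
```

Using a closure instead of a module-level global keeps each `build_default_tools` invocation independently cached, which matters if tests construct tools with different index paths.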

In `src/api/routes.py`, where `build_default_tools(...)` is called inside `_build_orchestrator()` (around line 577), pass the new path:

```python
clinical_idx = Path(os.environ.get(
    "CLINICAL_RAG_INDEX_PATH",
    "data/external_rag/index/rag_index.pkl",
))
tools = build_default_tools(
    rag_index_dir=rag_dir if rag_status["exists"] else None,
    clinical_rag_index_path=clinical_idx if clinical_idx.exists() else None,
)
```

In `src/agents/prompts.py`, update the `retrieve_context` tool description (it already mentions FAISS) to:

```
- retrieve_context: retrieve up to k passages from a knowledge base. Pass corpus="clinical" for medical-paper evidence (Alzheimer's / Parkinson's / lifestyle / nutrition; supports Turkish keywords); default corpus="reference" for the curated FAISS index.
```

- [ ] **Step 3:** `pytest -q` → the 2 new tests pass alongside the previous baseline; watch for `retrieve_context` regressions.

- [ ] **Step 4:** commit: `feat(agents): retrieve_context corpus dispatch (reference vs clinical)`.

---

### Task 4: README + CLI sanity

**Files:**
- Modify: `README.md`
- Create: `scripts/clinical_rag_smoke.py`

- [ ] **Step 1:** small CLI tool to demo the corpus from the terminal:

```python
"""Smoke: ask the clinical corpus a question from the terminal.

Usage:
    python scripts/clinical_rag_smoke.py "egzersiz Alzheimer feedback"
"""
from __future__ import annotations

import sys
from pathlib import Path

from src.rag.clinical.loader import load_index
from src.rag.clinical.retrieve import retrieve_clinical


def main() -> None:
    if len(sys.argv) < 2:
        print(__doc__)
        sys.exit(1)
    query = " ".join(sys.argv[1:])
    payload = load_index(Path("data/external_rag/index/rag_index.pkl"))
    result = retrieve_clinical(payload, query, top_k=5, evidence_limit=5)
    print(result.summary_text or "(no matches)")


if __name__ == "__main__":
    main()
```

- [ ] **Step 2:** README addition (find the existing RAG section, append; note the four-backtick outer fence so the nested code fences render correctly):

````markdown
### Clinical Corpus (TF-IDF, Turkish + English)

A second, lightweight RAG index covers 14 medical PDFs (Alzheimer's, Parkinson's, lifestyle, nutrition, exercise) using TF-IDF + sklearn. Source PDFs live at `data/external_rag/clinical_pdfs/` (gitignored — copy from your team's shared drive). Pre-built index at `data/external_rag/index/rag_index.pkl`.

Agent invocation:

```python
retrieve_context(query="egzersiz Alzheimer feedback", corpus="clinical", k=5)
```

Local CLI smoke:

```bash
python scripts/clinical_rag_smoke.py "egzersiz Alzheimer feedback"
```

The Turkish keywords `alzheimer`, `parkinson`, `egzersiz`, `beslenme`, `tani`, `tedavi`, `risk`, `unutkanlik`, `titreme`, `demans` auto-expand to English equivalents so Turkish queries hit English chunks.
````

- [ ] **Step 3:** commit: `docs(rag): document clinical TF-IDF corpus + add CLI smoke`.

---

## Self-review

1. **Spec coverage.** User said "fully integrate the new RAG folder". The wrapper imports the user's exact pickle schema, mirrors the Turkish expansion table verbatim, and surfaces the same evidence semantics (citation per sentence, source+page tags). ✓
2. **Backward compatibility.** Default corpus is `"reference"`, preserving existing FAISS behaviour. ✓
3. **No XAI / re-embedding.** The user's TF-IDF index is used as-is. We don't re-embed or fine-tune. ✓
4. **Independence.** No coupling to BBB, fusion, or MRI modules. ✓
5. **No placeholders.** Every step has the full code or full diff direction. ✓

---

## Execution handoff

Save and choose: subagent-driven (recommended) or inline executing-plans.