diff --git a/.env.example b/.env.example index d374e4cfef2901f7c4da1b81293864fca2e0aae3..4e73b78c5aa0f837369295aa9d2dc39f8ee18412 100644 --- a/.env.example +++ b/.env.example @@ -20,3 +20,11 @@ POLYGUARD_FRONTIER_MODEL=Qwen/Qwen2.5-7B-Instruct POLYGUARD_ALLOW_WEB_FETCH=false POLYGUARD_REWARD_MIN=0.001 POLYGUARD_REWARD_MAX=0.999 + +# --- Medication alternatives tool (FDA openFDA + optional external CDS) --- +# Optional: higher openFDA rate limits — request a key at https://open.fda.gov/apis/authentication/ +# POLYGUARD_OPENFDA_API_KEY= +# Optional: POST { "drug_names": ["..."] } to your service; Bearer token if required (Tally/Vellum/custom). +# Never commit real tokens; set in Space secrets or local .env only. +# POLYGUARD_MED_TOOL_URL= +# POLYGUARD_MED_TOOL_TOKEN= diff --git a/.gitattributes b/.gitattributes index 22bbdfd704055658b8a263c6a4f10e6a800b9f5b..acd25ce27fe162b7c5d318bc882d7c3cc9d72f47 100644 --- a/.gitattributes +++ b/.gitattributes @@ -41,3 +41,4 @@ docs/results/submission_evidence/qwen_0_5b_1_5b/reward_component_bars.png filter docs/results/submission_evidence/qwen_0_5b_1_5b_3b/reward_component_bars.png filter=lfs diff=lfs merge=lfs -text docs/results/submission_evidence_qwen_0_5b_1_5b/charts/generated/reward_component_bars.png filter=lfs diff=lfs merge=lfs -text docs/results/submission_evidence_qwen_0_5b_1_5b_3b/charts/generated/reward_component_bars.png filter=lfs diff=lfs merge=lfs -text +submission_bundle/qwen_completed_runs/charts/generated/reward_component_bars.png filter=lfs diff=lfs merge=lfs -text diff --git a/Dockerfile b/Dockerfile index 98d1116f0afd6bb4c961509d865c140dcae6e78d..5b954253e295d3129761344a7bab37e1e70ed1e0 100644 --- a/Dockerfile +++ b/Dockerfile @@ -1,6 +1,6 @@ -# Hugging Face Space: single-port edge (nginx) + OpenEnv (8100) + API (8200) + static UI. -# Build from repository root: docker build -f Dockerfile.space -t polyguard-space . -# Cheap tier: use Space "CPU basic"; first boot downloads ~1.1GB model bundle. 
+# Hugging Face Space: nginx on PORT (7860) + OpenEnv (8100) + API (8200) + Vite-built UI. +# Build: docker build -t polyguard-space . +# HF Spaces use this file by default when "Dockerfile path" is unset — keep this as the demo image. FROM node:20-bookworm-slim AS frontend WORKDIR /build diff --git a/Dockerfile.space b/Dockerfile.space index 98d1116f0afd6bb4c961509d865c140dcae6e78d..485e736bb5344914632242f09a77f4e61566c1f6 100644 --- a/Dockerfile.space +++ b/Dockerfile.space @@ -1,6 +1,5 @@ -# Hugging Face Space: single-port edge (nginx) + OpenEnv (8100) + API (8200) + static UI. -# Build from repository root: docker build -f Dockerfile.space -t polyguard-space . -# Cheap tier: use Space "CPU basic"; first boot downloads ~1.1GB model bundle. +# Same image as ./Dockerfile — use this path in HF Space settings if "Dockerfile path" +# must be explicit (e.g. Dockerfile.space). Keep in sync with Dockerfile. FROM node:20-bookworm-slim AS frontend WORKDIR /build diff --git a/README.md b/README.md index 40afe389f19f24b8469e0d01ceb34e93bcabe752..89f0c4b8357d9131143777a60672af480326b02b 100644 --- a/README.md +++ b/README.md @@ -1,12 +1,48 @@ --- title: PolyGuard OpenEnv -emoji: 🛡️ colorFrom: blue -colorTo: purple +colorTo: green sdk: docker app_port: 7860 pinned: false -license: mit --- -Full-stack **PolyGuard** workbench: OpenEnv (WebSocket), FastAPI, and React UI behind nginx on `PORT`. Uses **CPU basic**; first cold start downloads the public [usable model bundle](https://huggingface.co/TheJackBright/polyguard-openenv-training-full-artifacts/tree/main/usable_model_bundles/local-qwen-0-5b-active-smoke) (~1.1 GB). See `docker/space/README.md` for details. +# PolyGuard (OpenEnv implementation package) + +Run all CLI commands from this directory (`cd polyguard-rl`). The repository root [`README.md`](../README.md) carries the same submission narrative with paths adjusted for viewers landing on the GitHub repo home page. 
+ +## Submission Links + +- GitHub Repo URL: [https://github.com/Vishwa-docs/Meta_Pytorch_OpenEnv_Scaler_VK](https://github.com/Vishwa-docs/Meta_Pytorch_OpenEnv_Scaler_VK) +- HF Space URL: [https://huggingface.co/spaces/TheJackBright/polyguard-openenv](https://huggingface.co/spaces/TheJackBright/polyguard-openenv) +- Colab Notebook URL: [https://colab.research.google.com/github/Vishwa-docs/Meta_Pytorch_OpenEnv_Scaler_VK/blob/master/polyguard-rl/PolyGuard_SFT_GRPO_One_Run_Runner.ipynb](https://colab.research.google.com/github/Vishwa-docs/Meta_Pytorch_OpenEnv_Scaler_VK/blob/master/polyguard-rl/PolyGuard_SFT_GRPO_One_Run_Runner.ipynb) (see also `notebooks/09_training_loop.ipynb` for a modular training walkthrough) +- YouTube Video URL: not used for this submission; see Hugging Face Blog URL below. +- Hugging Face Blog URL: [https://huggingface.co/blog/TheJackBright/polyguard-openenv](https://huggingface.co/blog/TheJackBright/polyguard-openenv) *(publish `docs/hf_blog_draft.md` or replace with a live story URL)* + +## Problem Statement + +Polypharmacy decisions are long-horizon, partially observable, and safety-critical. PolyGuard is a research environment where an LLM agent selects constrained clinical actions, receives verifier-backed reward, and improves via SFT + GRPO—not generic open-ended chat fine-tuning. + +## Environment + +`PolyGuardEnv` exposes OpenEnv-style HTTP/WebSocket endpoints (`/reset`, `/step`, `/state`, `/metadata`, `/schema`, `/mcp`, `/health`, `/ws`). Sub-environments include DDI, bandit mining, regimen risk, precision dosing, longitudinal deprescribing, web-search missing data, alternative suggestion, and new-drug decomposition. See `openenv.yaml`, `app/env/env_core.py`, `app/env/fastapi_app.py`, and `docs/environment_design.md`. 
+ +## Agent Capabilities + +Medication reconciliation, evidence retrieval, graph safety, dosing guardrails, candidate generation, supervisor routing, planner/critic stack, explanations, and contextual bandit ranking for ablations (`app/agents/`, `docs/agents.md`). + +## Tasks + +DDI risk reduction, safe adds/substitutions, regimen optimization, taper/deprescribing sequences, precision dosing, missing-data recovery, and new-drug decomposition (`data/scenarios/`, `app/env/catalog.py`). + +## Reward Model / Evaluation Logic + +Thirteen verifier-backed reward components roll up into four primary channels (`safety_legality`, `clinical_improvement`, `dosing_quality`, `process_integrity`), clamped to `[0.001, 0.999]`, with anti-cheat and timeout logic (`app/env/reward_router.py`, `app/env/anti_cheat.py`, `docs/reward_design.md`). + +## Training And Post-Training Strategy + +Build corpora (`scripts/bootstrap_data.py`, `scripts/build_training_corpus.py`), SFT with TRL (`scripts/train_sft_trl.py`), GRPO with environment reward (`scripts/train_grpo_trl.py`), merge adapters (`scripts/merge_adapters_safe.py`), validate inference (`scripts/test_inference_postsave.py`), evaluate and plot (`scripts/evaluate_*.py`, `docs/results/`). Optional HF GPU training: `scripts/deploy_training_space.py`. Full commands: repository root [`README.md`](../README.md) or `docs/training.md`. 
+ +## Documentation index + +- [Architecture](docs/architecture.md) · [Environment](docs/environment_design.md) · [Rewards](docs/reward_design.md) · [Training](docs/training.md) · [Evaluation](docs/evaluation.md) · [Deployment](docs/deployment.md) · [Datasets](docs/datasets.md) · [Participant guide traceability](docs/participant_guide_traceability.md) · [Idea doc vs implementation](docs/idea_document_traceability.md) · [**Space UI demo script**](docs/DEMO_RECORDING_SCRIPT.md) diff --git a/app/api/routes.py b/app/api/routes.py index 8460a5663d2fa04d105e1a4caa476cd2756af355..4dad482ee610741f190a3b33bafc7f79bf02dfe9 100644 --- a/app/api/routes.py +++ b/app/api/routes.py @@ -5,9 +5,11 @@ from __future__ import annotations from fastapi import APIRouter, Depends, HTTPException from app.api.dependencies import get_service +from app.tools.medication_alternatives import build_alternatives_response from app.api.schemas import ( BatchInferRequest, EvidenceQueryRequest, + MedicationAlternativesRequest, OrchestrateRequest, ResetRequest, StepCandidateRequest, @@ -137,3 +139,13 @@ def cases_search(q: str, service: APIService = Depends(get_service)) -> list[dic @router.post("/evidence/query") def evidence_query(payload: EvidenceQueryRequest, service: APIService = Depends(get_service)) -> list[dict]: return service.evidence_query(query=payload.query, top_k=payload.top_k) + + +@router.post("/tools/medication_alternatives") +def medication_alternatives(payload: MedicationAlternativesRequest) -> dict: + """OpenFDA class neighbors + optional external POST (env: POLYGUARD_MED_TOOL_URL / TOKEN).""" + return build_alternatives_response( + query_drug=payload.query_drug, + regimen_drugs=payload.regimen_drugs, + max_suggestions=payload.max_suggestions, + ) diff --git a/app/api/schemas.py b/app/api/schemas.py index b1950a6962acac49d94bc8ef99e9b894c7621c4c..6207dd0575eac0463f27561e85b6dae6af23ff01 100644 --- a/app/api/schemas.py +++ b/app/api/schemas.py @@ -55,3 +55,11 @@ class 
BatchInferRequest(StrictSchema): class EvidenceQueryRequest(StrictSchema): query: str top_k: int = 5 + + +class MedicationAlternativesRequest(StrictSchema): + """FDA / external tool: suggest other labeled drugs in a similar pharmacologic class.""" + + query_drug: Optional[str] = None + regimen_drugs: list[str] = Field(default_factory=list) + max_suggestions: int = Field(default=10, ge=1, le=25) diff --git a/app/tools/__init__.py b/app/tools/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..f9ae85589aea9a102a16bf8bd7013a3b61410b84 --- /dev/null +++ b/app/tools/__init__.py @@ -0,0 +1 @@ +"""Optional product tools (FDA search, external CDS hooks).""" diff --git a/app/tools/medication_alternatives.py b/app/tools/medication_alternatives.py new file mode 100644 index 0000000000000000000000000000000000000000..6dc22ef793bdcf75e883841ea3794054eb67aa56 --- /dev/null +++ b/app/tools/medication_alternatives.py @@ -0,0 +1,463 @@ +"""OpenFDA-backed medication class search + optional external HTTP tool. + +Secrets (OpenFDA key, Tally/Vellum/custom bearer tokens) must come from env only. +""" + +from __future__ import annotations + +import logging +import os +import re +from typing import Any +from urllib.parse import quote + +import requests + +logger = logging.getLogger(__name__) + +OPENFDA_LABEL = "https://api.fda.gov/drug/label.json" +_DEFAULT_DISCLAIMER = ( + "Research aid only — not medical advice. FDA labels may be incomplete; verify in approved prescribing information." 
+) + + +def _openfda_key_suffix() -> str: + key = os.getenv("POLYGUARD_OPENFDA_API_KEY", "").strip() + if not key: + return "" + return f"&api_key={quote(key, safe='')}" + + +def _fda_get(search: str, limit: int) -> dict[str, Any] | None: + """GET openFDA label.json; returns parsed JSON or None on failure.""" + q = quote(search, safe="") + url = f"{OPENFDA_LABEL}?search={q}&limit={int(limit)}{_openfda_key_suffix()}" + try: + resp = requests.get(url, timeout=14) + if resp.status_code != 200: + logger.warning("openfda_http_%s: %s", resp.status_code, resp.text[:200]) + return None + return resp.json() + except Exception as exc: # noqa: BLE001 + logger.warning("openfda_request_failed: %s", exc) + return None + + +def _first_openfda(payload: dict[str, Any] | None) -> dict[str, Any]: + if not payload or "results" not in payload: + return {} + results = payload.get("results") + if not isinstance(results, list) or not results: + return {} + first = results[0] + return first if isinstance(first, dict) else {} + + +def _openfda_block(label: dict[str, Any]) -> dict[str, Any]: + block = label.get("openfda") + return block if isinstance(block, dict) else {} + + +def _listify(value: Any) -> list[str]: + if value is None: + return [] + if isinstance(value, str): + return [value] + if isinstance(value, list): + return [str(x).strip() for x in value if str(x).strip()] + return [str(value).strip()] + + +def _snippet(text: Any, max_len: int = 380) -> str | None: + if not text: + return None + if isinstance(text, list): + text = " ".join(str(x) for x in text[:6]) + s = re.sub(r"\s+", " ", str(text)).strip() + if len(s) <= max_len: + return s + return s[: max_len - 1] + "…" + + +def _label_link(set_id: str | None) -> str | None: + if not set_id: + return None + return f"https://dailymed.nlm.nih.gov/dailymed/drugInfo.cfm?setid={set_id}" + + +# Keywords from free text / simulator tokens → openFDA pharm_class_epc strings (exact or prefix). 
+_KEYWORD_EPCS: tuple[tuple[str, tuple[str, ...]], ...] = ( + ("benzodiazepine", ("Benzodiazepine", "Benzodiazepine Sedative")), + ("benzo", ("Benzodiazepine",)), + ("nsaid", ("Nonsteroidal Anti-inflammatory Drug",)), + ("opioid", ("Opioid Agonist", "Full Opioid Agonists")), + ("statin", ("HMG-CoA Reductase Inhibitor",)), + ("beta blocker", ("beta-Adrenergic Blocker",)), + ("betablocker", ("beta-Adrenergic Blocker",)), + ("ace inhibitor", ("Angiotensin-converting Enzyme Inhibitor",)), + ("arb", ("Angiotensin II Receptor Blocker",)), + ("ppi", ("Proton Pump Inhibitor",)), + ("ssri", ("Selective Serotonin Reuptake Inhibitor",)), + # Anticoagulant / antiplatelet (simulator warfarin_like → warfarin) + ("warfarin", ("Vitamin K Antagonist",)), + ("heparin", ("Thrombin Inhibitor", "Factor Xa Inhibitor")), +) + + +def _normalize_simulator_query(q: str) -> str: + """Strip simulator suffixes and underscores so benzodiazepine_like → benzodiazepine.""" + raw = q.strip().lower()[:120] + if not raw: + return "" + for suf in ("_like", "_analog", "_analogue", "_class", "_group", "_category"): + if raw.endswith(suf): + raw = raw[: -len(suf)].strip("_").strip() + return raw.replace("_", " ").strip() + + +def _class_search_variants(focus: str) -> list[str]: + """Ordered strings to try as openFDA pharm_class_epc (exact quoted) or wildcard body.""" + raw = _normalize_simulator_query(focus) + if not raw: + return [] + out: list[str] = [] + seen: set[str] = set() + + def add(s: str) -> None: + t = s.strip() + if len(t) < 3: + return + k = t.lower() + if k in seen: + return + seen.add(k) + out.append(t) + + compact = raw.replace(" ", "") + # Prefer canonical FDA class strings before raw lowercase (better labels + display). 
+ for kw, epcs in _KEYWORD_EPCS: + if kw in compact or kw in raw: + for e in epcs: + add(e) + add(raw) + first = raw.split()[0] + if first != raw: + add(first) + if raw and " " not in raw and raw.isalpha(): + add(raw[0].upper() + raw[1:]) + return out + + +def _resolve_focus_drug(query_drug: str | None, regimen_drugs: list[str]) -> str: + """Prefer explicit query_drug from client; do not silently use regimen[0] when multiple rows exist.""" + q = (query_drug or "").strip() + if q: + return q + if len(regimen_drugs) == 1: + t = str(regimen_drugs[0]).strip() + return t + # Multiple regimen drugs but no focus: caller should send query_drug (frontend bug otherwise). + return "" + + +def _escape_fda_term(term: str) -> str: + """Remove characters that break openFDA quoted search.""" + return re.sub(r'["\\]', " ", term).strip()[:100] + + +def _search_label_for_name(name: str) -> dict[str, Any]: + """Search brand, generic, or active substance on SPL labels.""" + n = _escape_fda_term(name.strip()[:80]) + if not n: + return {} + data_g = _fda_get(f'openfda.generic_name:"{n}"', limit=3) + if data_g and data_g.get("results"): + return _first_openfda(data_g) + data_b = _fda_get(f'openfda.brand_name:"{n}"', limit=3) + if data_b and data_b.get("results"): + return _first_openfda(data_b) + # Active ingredient / substance (helps real drug stems) + data_s = _fda_get(f'openfda.substance_name:"{n}"', limit=3) + if data_s and data_s.get("results"): + return _first_openfda(data_s) + data_a = _fda_get(f'openfda.active_ingredient:"{n}"', limit=3) + return _first_openfda(data_a) if data_a else {} + + +def _suggestions_by_class_probe( + field: str, + class_value: str, + exclude: set[str], + max_suggestions: int, +) -> list[dict[str, Any]]: + rows = _suggestions_for_class(field, class_value, exclude, max_suggestions) + if rows: + return rows + # Wildcard: openFDA supports *suffix / prefix* on some fields + body = _escape_fda_term(class_value).lower() + if len(body) >= 4: + wild = 
_fda_get(f"openfda.{field}:*{body}*", limit=min(40, max(10, max_suggestions * 4))) + if wild and wild.get("results"): + # Reuse list builder by synthesizing a narrowed class is awkward; parse manually + out: list[dict[str, Any]] = [] + seen: set[str] = set() + for row in wild.get("results", []): + if not isinstance(row, dict): + continue + of = _openfda_block(row) + brands = _listify(of.get("brand_name")) + generics = _listify(of.get("generic_name")) + display = (brands[0] if brands else None) or (generics[0] if generics else None) + if not display: + continue + key = display.lower() + if key in seen or key in exclude: + continue + seen.add(key) + ar = row.get("adverse_reactions") + ar_text = ar[0] if isinstance(ar, list) and ar else ar + set_id = None + if isinstance(of.get("spl_set_id"), list) and of["spl_set_id"]: + set_id = str(of["spl_set_id"][0]) + elif of.get("spl_set_id"): + set_id = str(of["spl_set_id"]) + out.append( + { + "display_name": display, + "generic_names": generics[:4], + "brand_names": brands[:4], + "routes": _listify(of.get("route"))[:4], + "adverse_reactions_snippet": _snippet(ar_text), + "label_link": _label_link(set_id), + "source_detail": f"openfda.{field}.wildcard", + }, + ) + if len(out) >= max_suggestions: + break + return out + return [] + + +def _pick_pharm_class(openfda_block: dict[str, Any]) -> tuple[str | None, str | None]: + for key in ("pharm_class_epc", "pharm_class_cs", "pharm_class_moa"): + for item in _listify(openfda_block.get(key)): + if len(item) > 3: + return key, item + return None, None + + +def _suggestions_for_class( + field: str, + pharm_class: str, + exclude: set[str], + max_suggestions: int, +) -> list[dict[str, Any]]: + """List other drugs sharing FDA pharmacologic class on label.""" + pc = pharm_class.strip()[:120] + if not pc or not field: + return [] + search = f'openfda.{field}:"{pc}"' + data = _fda_get(search, limit=min(50, max(10, max_suggestions * 4))) + if not data or not data.get("results"): + return [] + 
+ out: list[dict[str, Any]] = [] + seen: set[str] = set() + for row in data.get("results", []): + if not isinstance(row, dict): + continue + of = _openfda_block(row) + brands = _listify(of.get("brand_name")) + generics = _listify(of.get("generic_name")) + display = (brands[0] if brands else None) or (generics[0] if generics else None) + if not display: + continue + key = display.lower() + if key in seen: + continue + if key in exclude: + continue + seen.add(key) + ar = row.get("adverse_reactions") + if isinstance(ar, list) and ar: + ar_text = ar[0] + else: + ar_text = ar + set_id = None + if isinstance(of.get("spl_set_id"), list) and of["spl_set_id"]: + set_id = str(of["spl_set_id"][0]) + elif of.get("spl_set_id"): + set_id = str(of["spl_set_id"]) + out.append( + { + "display_name": display, + "generic_names": generics[:4], + "brand_names": brands[:4], + "routes": _listify(of.get("route"))[:4], + "adverse_reactions_snippet": _snippet(ar_text), + "label_link": _label_link(set_id), + "source_detail": f"openfda.{field}", + } + ) + if len(out) >= max_suggestions: + break + return out + + +def _external_suggestions(drug_names: list[str]) -> list[dict[str, Any]] | None: + url = os.getenv("POLYGUARD_MED_TOOL_URL", "").strip() + if not url: + return None + headers: dict[str, str] = {"Content-Type": "application/json"} + token = os.getenv("POLYGUARD_MED_TOOL_TOKEN", "").strip() + if token: + headers["Authorization"] = f"Bearer {token}" + try: + resp = requests.post( + url, + json={"drug_names": drug_names}, + headers=headers, + timeout=18, + ) + if resp.status_code >= 400: + logger.warning("med_tool_http_%s", resp.status_code) + return [] + payload = resp.json() + except Exception as exc: # noqa: BLE001 + logger.warning("med_tool_request_failed: %s", exc) + return [] + if not isinstance(payload, dict): + return [] + raw = payload.get("suggestions") + if not isinstance(raw, list): + return [] + cleaned: list[dict[str, Any]] = [] + for item in raw: + if isinstance(item, dict) 
and item.get("display_name"): + row = dict(item) + row["source_detail"] = str(row.get("source_detail") or "external_tool") + cleaned.append(row) + elif isinstance(item, str) and item.strip(): + cleaned.append( + { + "display_name": item.strip(), + "generic_names": [], + "brand_names": [], + "routes": [], + "adverse_reactions_snippet": None, + "label_link": None, + "source_detail": "external_tool", + } + ) + return cleaned + + +def build_alternatives_response( + query_drug: str | None, + regimen_drugs: list[str], + max_suggestions: int, +) -> dict[str, Any]: + errors: list[str] = [] + regimen_clean = [str(x).strip() for x in regimen_drugs if str(x).strip()][:40] + focus = _resolve_focus_drug(query_drug, regimen_clean) + exclude = {x.lower() for x in regimen_clean} + if focus: + exclude.add(focus.lower()) + + external_rows: list[dict[str, Any]] = [] + ext = _external_suggestions([focus] if focus else regimen_clean[:5]) + if ext is not None: + external_rows = ext + + if not focus and not regimen_clean: + return { + "focus_drug": "", + "therapeutic_class": None, + "suggestions": external_rows, + "source": "external" if external_rows else "none", + "disclaimer": _DEFAULT_DISCLAIMER, + "errors": ["Enter a drug name or load drugs from the current episode."], + } + + if not focus and regimen_clean: + return { + "focus_drug": "", + "therapeutic_class": None, + "therapeutic_class_field": None, + "suggestions": external_rows, + "source": "external" if external_rows else "none", + "disclaimer": _DEFAULT_DISCLAIMER, + "errors": [ + "Several medications are on this regimen; pick a focus row in the UI (or pass query_drug). " + "The server does not guess the first medication anymore.", + ], + } + + # SPL name/substance search: normalize simulator tokens first (benzodiazepine_like → benzodiazepine). 
+ lookup = _normalize_simulator_query(focus) or focus.strip() + label = _search_label_for_name(lookup) + ofb = _openfda_block(label) + pharm_field, pharm = _pick_pharm_class(ofb) + + openfda_rows: list[dict[str, Any]] = [] + if pharm and pharm_field: + openfda_rows = _suggestions_for_class(pharm_field, pharm, exclude, max_suggestions) + if not openfda_rows: + # Simulator tokens (e.g. benzodiazepine_like) or class keywords: try FDA class directly. + for cand in _class_search_variants(focus): + rows = _suggestions_by_class_probe("pharm_class_epc", cand, exclude, max_suggestions) + if rows: + pharm_field, pharm = "pharm_class_epc", cand + openfda_rows = rows + break + if not openfda_rows: + for cand in _class_search_variants(focus): + rows = _suggestions_by_class_probe("pharm_class_cs", cand, exclude, max_suggestions) + if rows: + pharm_field, pharm = "pharm_class_cs", cand + openfda_rows = rows + break + + if not openfda_rows: + if not (pharm and pharm_field): + errors.append( + "Could not match this text to an FDA SPL (generic/brand/substance) or pharmacologic class. " + "Try a generic name (e.g. diazepam), a class keyword (e.g. 
benzodiazepine), or load from episode.", + ) + elif not external_rows: + errors.append( + "No labeled products returned for this query (try another spelling or a broader class keyword).", + ) + + merged: list[dict[str, Any]] = [] + seen_keys: set[str] = set() + for row in external_rows + openfda_rows: + display = str(row.get("display_name", "")).strip() + if not display: + continue + generics = [str(g).lower() for g in (row.get("generic_names") or []) if g] + dedupe_key = generics[0] if generics else display.lower() + if dedupe_key in seen_keys: + continue + seen_keys.add(dedupe_key) + merged.append(row) + if len(merged) >= max_suggestions: + break + + source = "openfda" + if external_rows and openfda_rows: + source = "mixed" + elif external_rows and not openfda_rows: + source = "external" + elif not external_rows and not openfda_rows: + source = "none" + + return { + "focus_drug": focus, + "therapeutic_class": pharm, + "therapeutic_class_field": pharm_field, + "suggestions": merged, + "source": source, + "disclaimer": _DEFAULT_DISCLAIMER, + "errors": errors, + } diff --git a/app/ui/frontend/src/App.tsx b/app/ui/frontend/src/App.tsx index 433a0c63723740417d35bb239320a4987176c58c..6434cc7b3ab52e3e790269ab62d985e72a1313a0 100644 --- a/app/ui/frontend/src/App.tsx +++ b/app/ui/frontend/src/App.tsx @@ -20,6 +20,7 @@ import type { StepResponse, TaskPreset, } from "./lib/types"; +import AlternativeMedicineSearch from "./components/AlternativeMedicineSearch"; import MetaverseBackdrop from "./components/MetaverseBackdrop"; type WorkbenchMode = "agent" | "env"; @@ -887,6 +888,18 @@ export default function App() { const activeInfo = mode === "agent" ? agentInfo : envInfo; const activeTerminationReason = shortValue(activeInfo?.termination_reason); const terminationReason = activeTerminationReason !== "-" ? activeTerminationReason : null; + const regimenForAltTool = useMemo(() => { + const meds = activeObservation?.medication_table ?? 
[]; + const names: string[] = []; + for (const row of meds) { + const v = row.drug ?? row.drug_id ?? row.name; + if (typeof v === "string" && v.trim()) { + names.push(v.trim()); + } + } + return names; + }, [activeObservation]); + const heroStats: Array<[string, string]> = [ ["Runtime", mode === "agent" ? "Agent Workbench" : "Env Explorer"], ["Scenario", taskLabel(taskId, catalog.task_presets)], @@ -1164,6 +1177,7 @@ export default function App() { + (null); + const [result, setResult] = useState(null); + + useEffect(() => { + if (regimenDrugNames.length === 0) { + setRegimenFocusIndex(0); + return; + } + setRegimenFocusIndex((prev) => (prev >= regimenDrugNames.length ? 0 : prev)); + }, [regimenDrugNames]); + + const runSearch = useCallback( + async (queryDrug: string | undefined, regimen: string[]) => { + setLoading(true); + setError(null); + try { + const res = await fetch(`${API_BASE}/tools/medication_alternatives`, { + method: "POST", + headers: { "Content-Type": "application/json" }, + body: JSON.stringify({ + query_drug: queryDrug?.trim() || null, + regimen_drugs: regimen, + max_suggestions: 7, + }), + }); + if (!res.ok) { + const t = await res.text(); + throw new Error(t.slice(0, 200) || `HTTP ${res.status}`); + } + setResult((await res.json()) as AlternativesResponse); + } catch (e) { + setResult(null); + setError(e instanceof Error ? e.message : "Request failed"); + } finally { + setLoading(false); + } + }, + [], + ); + + const safeRegimenIndex = + regimenDrugNames.length > 0 + ? Math.min(Math.max(regimenFocusIndex, 0), regimenDrugNames.length - 1) + : 0; + + /** Never send null focus when a regimen exists — avoids API defaulting to regimen[0] (always benzo if first). */ + const resolvedFocusDrug = (): string | undefined => { + const typed = query.trim(); + const fromList = regimenDrugNames[safeRegimenIndex]?.trim() ?? 
""; + if (focusFromRegimenSelect && regimenDrugNames.length > 0) { + return fromList || typed || undefined; + } + return typed || fromList || undefined; + }; + + const onSubmit = () => { + void runSearch(resolvedFocusDrug(), regimenDrugNames); + }; + + const onLoadRegimen = () => { + const names = regimenDrugNames.length ? regimenDrugNames : []; + if (!names.length) { + setError("Reset an episode first so the regimen list is available."); + return; + } + const idx = Math.min(Math.max(regimenFocusIndex, 0), names.length - 1); + const focus = names[idx] ?? ""; + setRegimenFocusIndex(idx); + setQuery(focus); + setFocusFromRegimenSelect(true); + void runSearch(focus, names); + }; + + const onRegimenSelectChange = (index: number) => { + setRegimenFocusIndex(index); + const name = regimenDrugNames[index]?.trim() ?? ""; + setQuery(name); + setFocusFromRegimenSelect(true); + }; + + return ( +
+
+

FDA alternatives

+ Tool +
+ {regimenDrugNames.length > 0 ? ( + + ) : null} +
+ +
+ + +
+
+

+ Pick a regimen row, then search. Up to 7 results — scroll the list below. +

+ {error &&
{error}
} + {result && ( +
+ {result.errors?.length ? ( +
    + {result.errors.map((msg) => ( +
  • {msg}
  • + ))} +
+ ) : null} +

+ Focus: {result.focus_drug || "—"} · Class:{" "} + {result.therapeutic_class ?? "—"}{" "} + {result.therapeutic_class_field ? ({result.therapeutic_class_field}) : null} ·{" "} + Source: {result.source} +

+
+
    + {result.suggestions?.length ? ( + result.suggestions.map((s, idx) => ( +
  • +
    + {s.display_name} + · {s.source_detail ?? "openfda"} +
    + {s.routes?.length ? ( +
    Route: {s.routes.join(", ")}
    + ) : null} + {s.generic_names?.length ? ( +
    Generic: {s.generic_names.join(", ")}
    + ) : null} + {s.adverse_reactions_snippet ? ( +
    ADR label excerpt: {s.adverse_reactions_snippet}
    + ) : null} + {s.label_link ? ( + + DailyMed / label + + ) : null} +
  • + )) + ) : ( +
  • No suggestions yet — try another spelling or load from episode.
  • + )} +
+
+
+ )} +
+ ); +} diff --git a/app/ui/frontend/src/styles/theme.css b/app/ui/frontend/src/styles/theme.css index 8914c0f8da1110037029692b4a9206779d3a049b..b39f537ae8a67d27c95b56ac7497be98f2320961 100644 --- a/app/ui/frontend/src/styles/theme.css +++ b/app/ui/frontend/src/styles/theme.css @@ -1138,6 +1138,108 @@ td { } } +.small-print { + font-size: 0.78rem; + line-height: 1.35; +} + +.alt-med-tool { + margin-top: 10px; + border: 1px dashed rgba(155, 124, 255, 0.35); + background: rgba(8, 11, 27, 0.55); +} + +.alt-med-tool .panel-heading h2 { + font-size: 1.05rem; +} + +.alt-med-tool-regimen-select { + margin: 0 0 10px; + max-width: min(520px, 100%); +} + +.alt-med-tool-regimen-select select { + width: 100%; +} + +.alt-med-tool-hint { + margin: 8px 0 0; + max-width: 960px; +} + +.alt-med-tool-row { + display: flex; + flex-wrap: wrap; + gap: 12px; + align-items: flex-end; +} + +.alt-med-tool-field { + flex: 1 1 220px; + margin: 0; +} + +.alt-med-tool-actions { + display: flex; + flex-wrap: wrap; + gap: 8px; +} + +.alt-med-tool-results { + margin-top: 12px; +} + +.alt-med-tool-errors { + color: var(--warning); + font-size: 0.85rem; +} + +.alt-med-suggestions-scroll { + margin-top: 8px; + max-height: 17.5rem; + overflow-y: auto; + overflow-x: hidden; + padding-right: 4px; + border-radius: 12px; + border: 1px solid var(--line-soft); + background: rgba(5, 8, 20, 0.35); +} + +.alt-med-suggestion-list { + list-style: none; + margin: 0; + padding: 8px; + display: flex; + flex-direction: column; + gap: 6px; +} + +.alt-med-suggestion { + padding: 8px 10px; + border-radius: 10px; + border: 1px solid var(--line-soft); + background: rgba(13, 16, 35, 0.45); + flex-shrink: 0; +} + +.alt-med-ar { + margin-top: 4px; + font-size: 0.76rem; + color: var(--muted); + line-height: 1.35; + display: -webkit-box; + -webkit-box-orient: vertical; + -webkit-line-clamp: 2; + overflow: hidden; +} + +.alt-med-link { + display: inline-block; + margin-top: 6px; + font-size: 0.82rem; + color: var(--accent-2); 
+}
+
 ::-webkit-scrollbar {
   width: 7px;
   height: 7px;
diff --git a/docker/space/README.md b/docker/space/README.md
index 095bf0f40380d50b5745b8e2719e84b2227dbe58..423e2a969f36e5d6a4fa3eea12b27ee47f022227 100644
--- a/docker/space/README.md
+++ b/docker/space/README.md
@@ -12,28 +12,46 @@ Never commit or paste Hugging Face tokens into chat or the repo. If a token was
 
    ```bash
    cd polyguard-rl
-   docker build -f Dockerfile.space -t polyguard-space .
+   docker build -t polyguard-space .
    ```
 
-3. Push the Space repo (HF expects `Dockerfile` at root). Either:
+3. Push the Space repo. The root **`Dockerfile`** is the full demo (Vite UI + nginx + API + OpenEnv). Hugging Face uses it automatically when **Dockerfile path** is empty. If your Space was created earlier with a different Dockerfile, trigger **Factory reboot** after pushing so the new image builds.
 
-   - **Option A:** In the Space repo on Hub, set **Build → Dockerfile path** to `Dockerfile.space` if the UI allows, **or** copy/rename: `cp Dockerfile.space Dockerfile` in the branch you push.
+4. Commit and push to the Space repository. HF builds the image on their builders (you do not need to `docker push` to Docker Hub for standard Spaces).
 
-   - **Option B:** Make this `polyguard-rl` folder the Space git root and add a symlink or duplicate `Dockerfile` pointing to the same content as `Dockerfile.space`.
+## FDA panel / latest UI missing on the live Space
 
-4. Commit and push to the Space repository. HF builds the image on their builders (you do not need to `docker push` to Docker Hub for standard Spaces).
+Pushing code to GitHub alone does **not** refresh `huggingface.co/spaces/...` unless that Space is connected to the same repo **and** rebuilds from the branch that has your UI (for example `fda` vs `main`). This repo’s usual demo path is **upload via Hub API**:
+
+```bash
+cd polyguard-rl
+export HF_TOKEN="hf_..."  # write token; never commit it
+uv run python scripts/deploy_space_api.py --repo-id TheJackBright/polyguard-openenv
+```
+
+Wait for **Build** in the Space logs to finish, then use **Factory reboot** or a hard browser refresh if the page still looks old. **Dockerfile path** should be empty (default `Dockerfile`) or `Dockerfile` / `Dockerfile.space`. If the Space uses the **full monorepo** as its Git root, set Dockerfile path to the repo-root `Dockerfile` or to `polyguard-rl/Dockerfile`.
 
 ## Runtime
 
 - **Port:** Space sets `PORT` (default `7860`). Nginx listens on `PORT` and routes `/api/*` → API, `/ws` → OpenEnv WebSocket, `/` → built React app.
-- **First boot:** If `checkpoints/active/grpo_adapter` is missing, `entrypoint.sh` runs `scripts/install_hf_active_bundle.py` (downloads the public bundle; slow on first start).
+- **First boot:** If `checkpoints/active/grpo_adapter` is missing, `entrypoint.sh` runs `scripts/install_hf_active_bundle.py`. That pulls `TheJackBright/polyguard-openenv-training-full-artifacts` (slow, ~1.1 GB).
 - **CORS:** Set via `POLYGUARD_ALLOW_HF_SPACE_CORS=true` (default in the Space Dockerfile).
 
-## Optional secrets
+## If logs show `401` / `RepositoryNotFoundError` on startup
+
+The artifact **model repo is private, gated, or needs a license click** while anonymous downloads are blocked. The UI can still “work” using the **heuristic ranker** and public base models, but **your trained bundle is not installed**.
+
+**Fix (pick one):**
+
+1. **Space secret (recommended):** Space → **Settings** → **Secrets** → add **`HF_TOKEN`** = a [read token](https://huggingface.co/settings/tokens) that can access `polyguard-openenv-training-full-artifacts`. Restart the Space.
+2. **Hub settings:** Make that model repo **public**, or ensure **gated** access allows the token you use in (1).
+3. **Ignore:** Leave as-is if ranker-only behavior is enough for the demo.
+
+## Secrets
 
-| Name | Use |
-|-----------|-----|
-| `HF_TOKEN` | Private model or artifact repo; `huggingface_hub` picks it up automatically when set in the Space environment. |
+| Name | Use |
+|------------|-----|
+| `HF_TOKEN` | **Required** if the artifact repo is not anonymously readable; `huggingface_hub` reads it automatically. |
 
 ## Local smoke (same as Space)
 
diff --git a/docs/DEMO_RECORDING_SCRIPT.md b/docs/DEMO_RECORDING_SCRIPT.md
new file mode 100644
index 0000000000000000000000000000000000000000..96a764819c1271e02e364cffefa21378eec8f6b5
--- /dev/null
+++ b/docs/DEMO_RECORDING_SCRIPT.md
@@ -0,0 +1,493 @@
+# PolyGuard Space UI — demo recording script (shot-by-shot)
+
+Use this document while screen-recording the Hugging Face Space (or local Docker). Target length: **8–14 minutes** for a full pass, or **3–5 minutes** for a highlights reel.
+
+---
+
+## Before you hit record
+
+1. **Open the Space** in a clean browser profile or incognito (fewer extensions → fewer glitches).
+2. **Set resolution**: 1920×1080 or 1440×900; browser zoom **100%**.
+3. **Fullscreen** the Space iframe or use HF “Open in new tab” so the URL bar shows the Space domain.
+4. **Wait for cold start**: first load may download the model bundle (several minutes). The **Event Log** and **Model Truth** panel will tell you if the policy failed to load (heuristic fallback is still usable for env steps).
+5. **Optional**: hide mouse cursor in OBS if you prefer; otherwise move slowly and pause **2 seconds** on each panel after major clicks.
+
+**Primary Space (product):** `https://huggingface.co/spaces/TheJackBright/polyguard-openenv`
+Runtime: nginx fronts the **product API** (default `8200`) and **OpenEnv service** (`8100`); see `docker/space/entrypoint.sh`.
+
+---
+
+## Where the model lives (Qwen and artifacts)
+
+This matters for what you say on camera.
+
+| Location | What it is |
+| --- | --- |
+| **On the Space container** | Working directory `/app` (see `entrypoint.sh`: `cd /app`). |
+| **Downloaded bundle** | If `checkpoints/active/grpo_adapter/adapter_config.json` is missing at boot, `scripts/install_hf_active_bundle.py` pulls the **HF usable model bundle** into `checkpoints/active/`. |
+| **Typical layout after install** | `checkpoints/active/active_model_manifest.json` — which artifact is active (often **GRPO adapter** on top of base). |
+| **Weights** | `checkpoints/active/grpo_adapter/` (LoRA/PEFT), optionally `checkpoints/active/merged/` (full merged weights), `checkpoints/active/sft_adapter/`. |
+| **Base model name** | Usually **`Qwen/Qwen2.5-0.5B-Instruct`** as the Transformers base for adapters (set via env e.g. `POLYGUARD_HF_MODEL`). |
+
+**What the UI proves:** the **Model Truth** panel calls **`GET /policy/model_status`** (product API). It shows `model_id` / `base_model`, `run_id`, `preferred_artifact` / `loaded_source`, and availability flags. Say on camera: *“This is live from the API, not hard-coded in the frontend.”*
+
+---
+
+## UI map (what appears on screen)
+
+| Region | Purpose |
+| --- | --- |
+| **Hero** (“PolyGuard neural safety cockpit”) | Marketing copy + quick stats. |
+| **Top bar** | **Agent Workbench** vs **Env Explorer**, **Task** dropdown, **Reset Episode**, **Q Tips**. |
+| **Status chips** | “Live” / model line; in Env mode one chip reads **ws env** (WebSocket to OpenEnv). |
+| **Model Truth** | Qwen / artifact / run / availability. |
+| **Advanced strip** | Only if Task = **Advanced** — pick raw `difficulty` + `sub_environment`. |
+| **Episode Overview** | Mode, task, difficulty, environment, step budget, last reward, patient id, **Patient Summary**, **Risk Delta**. |
+| **Candidate Actions** | Legal moves: `candidate_id`, action type, target/replacement, estimated safety delta (or **Blocked**). |
+| **Action Console** | Confidence, rationale, **Submit** vs **Run Agent** (Agent mode only for Run Agent). |
+| **Reward Channels** | Bars for total + primary + component scores (see below). |
+| **Current Medications** | Cards from observation. |
+| **Action History / Warnings** | Step trace and env warnings. |
+| **Decision / Explanation / Evidence** | **Agent mode only** (filled after API steps that return those fields). |
+| **Event Log** | Human-readable trace of resets, steps, rewards, errors. |
+
+---
+
+## Feature encyclopedia — every panel, branch, and agent
+
+Use this section as a **script appendix** or **judge handout**. It mirrors the React workbench in `app/ui/frontend/src/App.tsx`, the API in `app/api/`, and the orchestrator in `app/agents/orchestrator.py`.
+
+### A. How the Space is wired (under the hood)
+
+| Piece | Role |
+| --- | --- |
+| **Browser → nginx** | HF Space exposes one origin; nginx routes paths. |
+| **Product API** | Vite uses `API_BASE` (default **`/api`**). FastAPI serves catalog, reset, step_candidate, orchestrate, model_status, reward_breakdown, etc. |
+| **OpenEnv HTTP/WS** | `ENV_BASE` defaults to **same origin** on Spaces (not localhost). Web UI opens **`ws(s):///ws`** for Env Explorer. |
+| **Two Python processes** | `entrypoint.sh` starts **uvicorn** for `app.env.fastapi_app` (env, port **8100**) and **uvicorn** for `app.api` (product API, port **8200**). Agent mode reset/step still use the **API’s** in-process `PolyGuardEnv`; Env mode uses the **separate** env service over WebSocket. |
+| **Important** | Agent and Env UIs maintain **separate React state** (`agentObservation` vs `envObservation`). Toggling mode **clears the Event Log** and clears the inactive branch’s episode state so you always know which backend path you are exercising. |
+
+### B. Hero (“PolyGuard neural safety cockpit”)
+
+| Stat | Source | What to say on camera |
+| --- | --- | --- |
+| **Runtime** | `mode === "agent"` → “Agent Workbench”; else “Env Explorer”. | “This is which transport I am using right now.” |
+| **Scenario** | Human label for current `taskId` from catalog presets or Advanced. | “Which curriculum preset is bound to difficulty + sub-environment.” |
+| **Candidates** | `candidate_action_set.length` from the **active** observation. | “How many legal moves the env is offering after the last reset/step.” |
+| **Reward** | Last scalar reward for the active branch (`null` → shown as `-`). | “Verifier scalar after the last step in this mode only.” |
+
+### C. Top bar — every control
+
+| Control | Behavior |
+| --- | --- |
+| **Agent Workbench** | Sets `mode` to `agent`. Clears env state, event log, error; clears agent panels if switching from env (see `handleModeChange`). |
+| **Env Explorer** | Sets `mode` to `env`. Clears agent-specific observation/reward/decision/evidence. |
+| **Task** `