# PolyGuard Space UI — demo recording script (shot-by-shot)

Use this document while screen-recording the Hugging Face Space (or local Docker). Target length: **8–14 minutes** for a full pass, or **3–5 minutes** for a highlights reel.

---

## Before you hit record

1. **Open the Space** in a clean browser profile or incognito (fewer extensions → fewer glitches).
2. **Set resolution**: 1920×1080 or 1440×900; browser zoom **100%**.
3. **Fullscreen** the Space iframe or use HF “Open in new tab” so the URL bar shows the Space domain.
4. **Wait for cold start**: first load may download the model bundle (several minutes). The **Event Log** and **Model Truth** panel will tell you if the policy failed to load (heuristic fallback is still usable for env steps).
5. **Optional**: hide mouse cursor in OBS if you prefer; otherwise move slowly and pause **2 seconds** on each panel after major clicks.

**Primary Space (product):** `https://huggingface.co/spaces/TheJackBright/polyguard-openenv-workbench`

Runtime: nginx fronts the **product API** (default `8200`) and **OpenEnv service** (`8100`); see `docker/space/entrypoint.sh`.

---

## Where the model lives (Qwen and artifacts)

This matters for what you say on camera.

| Location | What it is |
| --- | --- |
| **On the Space container** | Working directory `/app` (see `entrypoint.sh`: `cd /app`). |
| **Downloaded bundle** | If `checkpoints/active/grpo_adapter/adapter_config.json` is missing at boot, `scripts/install_hf_active_bundle.py` pulls the **HF usable model bundle** into `checkpoints/active/`. |
| **Typical layout after install** | `checkpoints/active/active_model_manifest.json` — which artifact is active (often **GRPO adapter** on top of base). |
| **Weights** | `checkpoints/active/grpo_adapter/` (LoRA/PEFT), optionally `checkpoints/active/merged/` (full merged weights), `checkpoints/active/sft_adapter/`. |
| **Base model name** | Usually **`Qwen/Qwen2.5-0.5B-Instruct`** as the Transformers base for adapters (set via env e.g. `POLYGUARD_HF_MODEL`). |

**What the UI proves:** the **Model Truth** panel calls **`GET /policy/model_status`** (product API). It shows `model_id` / `base_model`, `run_id`, `preferred_artifact` / `loaded_source`, and availability flags. Say on camera: *“This is live from the API, not hard-coded in the frontend.”*

---

## UI map (what appears on screen)

| Region | Purpose |
| --- | --- |
| **Hero** (“PolyGuard neural safety cockpit”) | Marketing copy + quick stats. |
| **Top bar** | **Agent Workbench** vs **Env Explorer**, **Task** dropdown, **Reset Episode**, **Q Tips**. |
| **Status chips** | “Live” / model line; in Env mode one chip reads **ws env** (WebSocket to OpenEnv). |
| **Model Truth** | Qwen / artifact / run / availability. |
| **Advanced strip** | Only if Task = **Advanced** — pick raw `difficulty` + `sub_environment`. |
| **Episode Overview** | Mode, task, difficulty, environment, step budget, last reward, patient id, **Patient Summary**, **Risk Delta**. |
| **Candidate Actions** | Legal moves: `candidate_id`, action type, target/replacement, estimated safety delta (or **Blocked**). |
| **Action Console** | Confidence, rationale, **Submit** vs **Run Agent** (Agent mode only for Run Agent). |
| **Reward Channels** | Bars for total + primary + component scores (see below). |
| **Current Medications** | Cards from observation. |
| **Action History / Warnings** | Step trace and env warnings. |
| **Decision / Explanation / Evidence** | **Agent mode only** (filled after API steps that return those fields). |
| **Event Log** | Human-readable trace of resets, steps, rewards, errors. |

---

## Feature encyclopedia — every panel, branch, and agent

Use this section as a **script appendix** or **judge handout**. It mirrors the React workbench in `app/ui/frontend/src/App.tsx`, the API in `app/api/`, and the orchestrator in `app/agents/orchestrator.py`.

### A. How the Space is wired (under the hood)

| Piece | Role |
| --- | --- |
| **Browser → nginx** | HF Space exposes one origin; nginx routes paths. |
| **Product API** | Vite uses `API_BASE` (default **`/api`**). FastAPI serves catalog, reset, step_candidate, orchestrate, model_status, reward_breakdown, etc. |
| **OpenEnv HTTP/WS** | `ENV_BASE` defaults to **same origin** on Spaces (not localhost). Web UI opens **`ws(s):///ws`** for Env Explorer. |
| **Two Python processes** | `entrypoint.sh` starts **uvicorn** for `app.env.fastapi_app` (env, port **8100**) and **uvicorn** for `app.api` (product API, port **8200**). Agent mode reset/step still use the **API’s** in-process `PolyGuardEnv`; Env mode uses the **separate** env service over WebSocket. |
| **Important** | Agent and Env UIs maintain **separate React state** (`agentObservation` vs `envObservation`). Toggling mode **clears the Event Log** and clears the inactive branch’s episode state so you always know which backend path you are exercising. |

### B. Hero (“PolyGuard neural safety cockpit”)

| Stat | Source | What to say on camera |
| --- | --- | --- |
| **Runtime** | `mode === "agent"` → “Agent Workbench”; else “Env Explorer”. | “This is which transport I am using right now.” |
| **Scenario** | Human label for current `taskId` from catalog presets or Advanced. | “Which curriculum preset is bound to difficulty + sub-environment.” |
| **Candidates** | `candidate_action_set.length` from the **active** observation. | “How many legal moves the env is offering after the last reset/step.” |
| **Reward** | Last scalar reward for the active branch (`null` → shown as `-`). | “Verifier scalar after the last step in this mode only.” |

### C. Top bar — every control

| Control | Behavior |
| --- | --- |
| **Agent Workbench** | Sets `mode` to `agent`. Clears env state, event log, error; clears agent panels if switching from env (see `handleModeChange`). |
| **Env Explorer** | Sets `mode` to `env`. Clears agent-specific observation/reward/decision/evidence. |
| **Task** `