# Environment Design

`PolyGuardEnv` is a deterministic, seeded OpenEnv-style simulation for medication action selection under partial observability.
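
To illustrate what "deterministic, seeded" means in practice, here is a minimal sketch. `ToySeededEnv` and its fields are hypothetical stand-ins, not the real `PolyGuardEnv`; the only point is that a private RNG built from the seed makes two resets with the same seed identical.

```python
import random


class ToySeededEnv:
    """Hypothetical stand-in for illustration; not the real PolyGuardEnv."""

    def reset(self, seed: int) -> dict:
        # A private RNG keeps the episode independent of global random state.
        self._rng = random.Random(seed)
        return {
            "patient_age": self._rng.randint(60, 95),
            "n_medications": self._rng.randint(5, 12),
        }


def initial_observation(seed: int) -> dict:
    """Reset a fresh toy environment and return its first observation."""
    return ToySeededEnv().reset(seed)
```

Calling `initial_observation(7)` twice returns equal dictionaries, which is the property a seeded environment needs for reproducible evaluation.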

## State

The state tracks patient demographics, medications, labs, vitals, comorbidities, specialist conflicts, action history, cumulative reward, difficulty, sub-environment, burden score, risky-pair summary, and unresolved safety conflicts.
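
One way to picture this state is as a single record. The sketch below is an assumption about shape only: the class name `EpisodeState`, the field names, the types, and the `record` helper are all hypothetical and chosen to mirror the list above.

```python
from dataclasses import dataclass, field


@dataclass
class EpisodeState:
    """Hypothetical container; names and types are assumptions."""

    demographics: dict = field(default_factory=dict)
    medications: list = field(default_factory=list)
    labs: dict = field(default_factory=dict)
    vitals: dict = field(default_factory=dict)
    comorbidities: list = field(default_factory=list)
    specialist_conflicts: list = field(default_factory=list)
    action_history: list = field(default_factory=list)
    cumulative_reward: float = 0.0
    difficulty: str = "easy"
    sub_environment: str = ""
    burden_score: float = 0.0
    risky_pair_summary: list = field(default_factory=list)
    unresolved_safety_conflicts: list = field(default_factory=list)

    def record(self, action_id: str, reward: float) -> None:
        # Append to the history and accumulate reward in one place.
        self.action_history.append(action_id)
        self.cumulative_reward += reward
```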

## Observation

The observation exposes only the agent-facing view:

- patient summary
- medication table
- comorbidities
- organ function and labs/vitals
- graph safety summary
- burden summary
- precision dosing flags
- unresolved conflicts
- candidate action set
- step budget
- action history
- warning summary
- abstention indicators
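
A simple way to enforce an agent-facing view is an allowlist projection over the full state. The sketch below is illustrative: the key names are hypothetical snake_case versions of the bullets above, and treating `cumulative_reward` and `difficulty` as hidden is an assumption based only on their absence from the list.

```python
# Hypothetical key names derived from the observation bullets.
AGENT_VISIBLE_KEYS = frozenset({
    "patient_summary",
    "medication_table",
    "comorbidities",
    "organ_function",
    "labs_vitals",
    "graph_safety_summary",
    "burden_summary",
    "precision_dosing_flags",
    "unresolved_conflicts",
    "candidate_actions",
    "step_budget",
    "action_history",
    "warning_summary",
    "abstention_indicators",
})


def to_observation(state: dict) -> dict:
    """Keep only allowlisted keys so hidden state cannot leak by default."""
    return {k: v for k, v in state.items() if k in AGENT_VISIBLE_KEYS}
```

An allowlist fails closed: a field added to the state later stays hidden until it is explicitly exposed.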

## Actions

Actions must conform to `PolyGuardAction` and reference a generated candidate ID. The agent can keep the regimen, stop a drug, substitute within class, recommend alternatives, adjust the dose bucket, initiate or continue a taper, hold a dose, order monitoring, fetch evidence, decompose a new drug, or request specialist or pharmacist review.
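
The constraint can be sketched as a closed set of action kinds plus a membership check against the candidate IDs. The real `PolyGuardAction` schema is not shown here, so `ActionKind`, `Action`, and `is_allowed` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class ActionKind(Enum):
    """Hypothetical labels for the action types listed above."""

    KEEP_REGIMEN = "keep_regimen"
    STOP_DRUG = "stop_drug"
    SUBSTITUTE_WITHIN_CLASS = "substitute_within_class"
    RECOMMEND_ALTERNATIVES = "recommend_alternatives"
    ADJUST_DOSE_BUCKET = "adjust_dose_bucket"
    TAPER = "taper"
    HOLD_DOSE = "hold_dose"
    ORDER_MONITORING = "order_monitoring"
    FETCH_EVIDENCE = "fetch_evidence"
    DECOMPOSE_NEW_DRUG = "decompose_new_drug"
    REQUEST_REVIEW = "request_review"


@dataclass(frozen=True)
class Action:
    kind: ActionKind
    candidate_id: str


def is_allowed(action: Action, candidate_ids: set) -> bool:
    """Accept an action only if its kind is known and its ID was offered."""
    return isinstance(action.kind, ActionKind) and action.candidate_id in candidate_ids
```

Under these assumptions, `is_allowed(Action(ActionKind.STOP_DRUG, "c9"), {"c1", "c2"})` is false because `c9` was never offered as a candidate.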

## Episode End Conditions

Episodes terminate on exploit detection, exhausted step budget, repeated invalid actions, justified review escalation, safety-veto threshold, patient destabilization, safe resolution, per-step timeout, or episode wall-clock timeout.
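
The end conditions can be read as an ordered list of checks where the first match wins. In the sketch below, the `StepStatus` fields, the reason strings, the check order, and every threshold (`max_invalid`, `max_vetoes`, and both timeouts) are assumptions for illustration, not values taken from the environment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StepStatus:
    """Hypothetical per-step bookkeeping."""

    exploit_detected: bool = False
    steps_used: int = 0
    step_budget: int = 10
    invalid_actions: int = 0
    review_escalated: bool = False
    safety_vetoes: int = 0
    destabilized: bool = False
    safely_resolved: bool = False
    step_seconds: float = 0.0
    episode_seconds: float = 0.0


def termination_reason(
    s: StepStatus,
    max_invalid: int = 3,          # assumed threshold
    max_vetoes: int = 2,           # assumed threshold
    step_timeout: float = 30.0,    # assumed, in seconds
    episode_timeout: float = 600.0,  # assumed, in seconds
) -> Optional[str]:
    """Return the first matching end condition, or None to continue."""
    checks = [
        ("exploit_detected", s.exploit_detected),
        ("step_budget_exhausted", s.steps_used >= s.step_budget),
        ("repeated_invalid_actions", s.invalid_actions >= max_invalid),
        ("review_escalation", s.review_escalated),
        ("safety_veto_threshold", s.safety_vetoes >= max_vetoes),
        ("patient_destabilization", s.destabilized),
        ("safe_resolution", s.safely_resolved),
        ("step_timeout", s.step_seconds > step_timeout),
        ("episode_timeout", s.episode_seconds > episode_timeout),
    ]
    for reason, hit in checks:
        if hit:
            return reason
    return None
```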

## OpenEnv Surface

The runtime exposes `/reset`, `/step`, `/state`, `/metadata`, `/schema`, `/mcp`, `/health`, `/ws`, and `/env/*` compatibility endpoints. `openenv validate .` validates packaging; `openenv validate --url ...` validates a running server.
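
A client can talk to `/reset` and `/step` with plain JSON over HTTP. The sketch below only builds the requests and does not send them; the base URL, the use of POST, and the payload fields (`seed`, `action`, `candidate_id`) are assumptions, so check `/schema` on a running server for the actual request shapes.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # assumption: local development server


def build_post(path: str, payload: dict, base_url: str = BASE_URL) -> Request:
    """Build a JSON POST request for an endpoint without sending it."""
    body = json.dumps(payload).encode("utf-8")
    return Request(
        base_url + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical payloads; the real schema may differ.
reset_req = build_post("/reset", {"seed": 42})
step_req = build_post("/step", {"action": {"candidate_id": "c1"}})
```

To actually send a request, pass it to `urllib.request.urlopen` and decode the JSON response body.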