# Safety

PolyGuard is safety-first: the model is never allowed to apply an arbitrary free-text medication action directly to state.

## Guardrails

- Strict `PolyGuardAction` schema.
- Candidate IDs generated by the environment.
- Legality verifier before state transition.
- Critic veto before execution.
- Anti-cheat checks for reward hacking.
- Timeout and step-budget termination.
- Uncertainty-based abstention and review escalation.
- Failure reasons surfaced in traces and API responses.
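
The first few guardrails can be read as a chain of gates that an action must pass before it touches state. The following is a minimal sketch of that idea, not PolyGuard's actual implementation: `SketchAction`, `GateResult`, `gate_action`, `ALLOWED_KINDS`, and the reason strings are all hypothetical names chosen for illustration, and the real `PolyGuardAction` schema may look quite different.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Set

# Hypothetical action vocabulary; the real PolyGuardAction schema may differ.
ALLOWED_KINDS = {"keep", "stop", "substitute"}


@dataclass(frozen=True)
class SketchAction:
    kind: str          # must be one of ALLOWED_KINDS
    candidate_id: str  # must be an ID the environment issued


@dataclass(frozen=True)
class GateResult:
    accepted: bool
    reason: Optional[str] = None  # failure reason, surfaced to traces


def gate_action(
    action: SketchAction,
    issued_ids: Set[str],
    is_legal: Callable[[SketchAction], bool],
    critic_vetoes: Callable[[SketchAction], bool],
) -> GateResult:
    """Run the gates in order and report the first failure."""
    if action.kind not in ALLOWED_KINDS:
        return GateResult(False, "schema: unknown action kind")
    if action.candidate_id not in issued_ids:
        return GateResult(False, "candidate: ID not issued by environment")
    if not is_legal(action):
        return GateResult(False, "legality: verifier rejected action")
    if critic_vetoes(action):
        return GateResult(False, "critic: veto")
    return GateResult(True)


issued = {"cand_1", "cand_2"}
ok = gate_action(SketchAction("stop", "cand_1"), issued, lambda a: True, lambda a: False)
bad = gate_action(SketchAction("stop", "cand_9"), issued, lambda a: True, lambda a: False)
print(ok.accepted, bad.reason)  # True candidate: ID not issued by environment
```

Returning a reason string with every rejection, rather than a bare boolean, is what would let failure reasons appear in traces and API responses.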

## Clinical Trust Signals

The environment reports:

- legal/illegal action status
- safety violations
- drug-drug interaction (DDI) risk deltas
- medication burden changes
- uncertainty and abstention indicators
- explanation grounding score
- invalid action count
- anti-cheat reasons

Reporting these signals alongside the reward makes reward improvements auditable, rather than leaving them hidden behind a single opaque scalar.
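
To illustrate what auditing a reward change against these signals could look like, here is a small sketch. It is an assumption-laden example, not the environment's real reporting format: the `TrustSignals` field names, the sign convention for `ddi_risk_delta`, the 0 to 1 range for `grounding_score`, and the `audit_flags` helper are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TrustSignals:
    # Hypothetical field names mirroring the reported signals.
    action_legal: bool
    safety_violations: int
    ddi_risk_delta: float         # assumed: positive means interaction risk went up
    medication_burden_delta: int
    uncertainty: float
    abstained: bool
    grounding_score: float        # assumed range: 0.0 to 1.0
    invalid_action_count: int
    anti_cheat_reasons: List[str] = field(default_factory=list)


def audit_flags(reward_delta: float, signals: TrustSignals) -> List[str]:
    """List reasons a reward gain looks suspicious given the other signals."""
    flags: List[str] = []
    if reward_delta <= 0:
        return flags  # nothing to audit if reward did not improve
    if not signals.action_legal:
        flags.append("reward rose on an illegal action")
    if signals.safety_violations > 0:
        flags.append("reward rose despite safety violations")
    if signals.ddi_risk_delta > 0:
        flags.append("reward rose while DDI risk increased")
    if signals.anti_cheat_reasons:
        flags.append("anti-cheat triggered: " + ", ".join(signals.anti_cheat_reasons))
    return flags


step = TrustSignals(
    action_legal=True, safety_violations=0, ddi_risk_delta=0.2,
    medication_burden_delta=-1, uncertainty=0.1, abstained=False,
    grounding_score=0.9, invalid_action_count=0,
)
print(audit_flags(0.5, step))  # ['reward rose while DDI risk increased']
```

A scalar reward of `0.5` alone would look like progress here; the accompanying signals show that the gain coincided with higher interaction risk.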

## Explicit Non-Goals

PolyGuard does not produce clinical orders, patient-specific prescriptions, or medical advice. It is an RL environment and demonstration system for training and evaluating medication-safety agents.