# Safety

PolyGuard is safety-first: the model is never allowed to apply an arbitrary free-text medication action directly to state.

## Guardrails

- Strict `PolyGuardAction` schema.
- Candidate IDs generated by the environment.
- Legality verifier before state transition.
- Critic veto before execution.
- Anti-cheat checks for reward hacking.
- Timeout and step-budget termination.
- Uncertainty-based abstention and review escalation.
- Failure reasons surfaced in traces and API responses.
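
As a rough illustration of how several of these guardrails might compose, the sketch below runs the checks in a fixed order and returns the first failure reason. Only the `PolyGuardAction` name comes from this document. Its fields (`action_type`, `candidate_id`), the `GuardResult` type, the `guard_action` function, and the failure-reason strings are hypothetical and may not match the actual implementation.

```python
from dataclasses import dataclass
from typing import AbstractSet, Callable, Optional


@dataclass(frozen=True)
class PolyGuardAction:
    # Assumed fields for illustration; the real schema may differ.
    action_type: str
    candidate_id: str


@dataclass(frozen=True)
class GuardResult:
    # Hypothetical result type: whether the action may run, and why not.
    allowed: bool
    failure_reason: Optional[str] = None


def guard_action(
    action: PolyGuardAction,
    candidate_ids: AbstractSet[str],
    is_legal: Callable[[PolyGuardAction], bool],
    critic_vetoes: Callable[[PolyGuardAction], bool],
    steps_taken: int,
    step_budget: int,
) -> GuardResult:
    """Run the checks in order and stop at the first failure."""
    # Step-budget termination: refuse further actions once the budget is spent.
    if steps_taken >= step_budget:
        return GuardResult(False, "step_budget_exhausted")
    # Only IDs issued by the environment are accepted, never free text.
    if action.candidate_id not in candidate_ids:
        return GuardResult(False, "unknown_candidate_id")
    # Legality verifier runs before any state transition.
    if not is_legal(action):
        return GuardResult(False, "illegal_action")
    # Critic veto runs last, just before execution.
    if critic_vetoes(action):
        return GuardResult(False, "critic_veto")
    return GuardResult(True)
```

Returning a reason string alongside the boolean is one way to keep failure reasons available for traces and API responses; the ordering of checks here is a design assumption, not a documented requirement.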

## Clinical Trust Signals

The environment reports:

- legal/illegal action status
- safety violations
- drug-drug interaction (DDI) risk deltas
- medication burden changes
- uncertainty and abstention indicators
- explanation grounding score
- invalid action count
- anti-cheat reasons

Reporting these signals separately makes reward improvements auditable, rather than leaving them hidden behind a single opaque scalar.
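
To make the auditing idea concrete, the sketch below bundles the reported signals into one record and flags positive rewards that coincide with a problem signal. The `StepReport` type, its field names and types, and the `audit_flags` helper are all hypothetical; the actual report format is not specified here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StepReport:
    # Hypothetical container mirroring the signals listed above.
    legal: bool
    safety_violations: List[str] = field(default_factory=list)
    ddi_risk_delta: float = 0.0
    medication_burden_delta: int = 0
    uncertainty: float = 0.0
    abstained: bool = False
    grounding_score: float = 0.0
    invalid_action_count: int = 0
    anti_cheat_reasons: List[str] = field(default_factory=list)


def audit_flags(reward: float, report: StepReport) -> List[str]:
    """Return reasons why a positive reward deserves a closer look."""
    flags: List[str] = []
    if reward > 0:
        if not report.legal:
            flags.append("reward_on_illegal_action")
        if report.safety_violations:
            flags.append("reward_with_safety_violation")
        if report.anti_cheat_reasons:
            flags.append("reward_with_anti_cheat_reason")
    return flags
```

For example, `audit_flags(1.0, StepReport(legal=True))` returns an empty list, while a positive reward paired with an illegal action is flagged for review.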

## Explicit Non-Goals

PolyGuard does not produce clinical orders, patient-specific prescriptions, or medical advice. It is a reinforcement learning (RL) environment and demonstration system for training and evaluating medication-safety agents.