	# Safety

	PolyGuard is safety-first: the model is never allowed to apply an arbitrary free-text medication action directly to the environment state.

	## Guardrails

	- Strict `PolyGuardAction` schema.
	- Candidate IDs generated by the environment.
	- Legality verifier before state transition.
	- Critic veto before execution.
	- Anti-cheat checks for reward hacking.
	- Timeout and step-budget termination.
	- Uncertainty-based abstention and review escalation.
	- Failure reasons surfaced in traces and API responses.
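How the first four guardrails might compose can be sketched as follows. This is a minimal illustration, not the project's implementation: only the name `PolyGuardAction` comes from this document, while its fields, the action kinds, `GateResult`, `gate_action`, and the reason strings are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Set

# Hypothetical action kinds; the real schema may define different ones.
ALLOWED_KINDS = {"select_candidate", "abstain", "escalate_review"}
ALLOWED_KEYS = {"kind", "candidate_id"}


@dataclass(frozen=True)
class PolyGuardAction:
    """Assumed shape of a validated action (fields are illustrative)."""
    kind: str
    candidate_id: Optional[str] = None


@dataclass(frozen=True)
class GateResult:
    accepted: bool
    reason: str  # failure reason to surface in traces and API responses


def gate_action(
    raw: Any,
    candidate_ids: Set[str],
    is_legal: Callable[[PolyGuardAction], bool],
    critic_vetoes: Callable[[PolyGuardAction], bool],
) -> GateResult:
    """Run a raw model output through schema, ID, legality, and critic checks."""
    # 1. Strict schema: only known keys and known kinds, so free text cannot pass.
    if not isinstance(raw, dict) or not set(raw) <= ALLOWED_KEYS:
        return GateResult(False, "schema_violation")
    kind = raw.get("kind")
    candidate_id = raw.get("candidate_id")
    if not isinstance(kind, str) or kind not in ALLOWED_KINDS:
        return GateResult(False, "schema_violation")
    if kind != "select_candidate":
        # Abstain/escalate carry no candidate and change no medication state.
        if candidate_id is not None:
            return GateResult(False, "schema_violation")
        return GateResult(True, "ok")
    # 2. The candidate ID must be one the environment generated this step.
    if not isinstance(candidate_id, str) or candidate_id not in candidate_ids:
        return GateResult(False, "unknown_candidate_id")
    action = PolyGuardAction(kind=kind, candidate_id=candidate_id)
    # 3. Legality verifier runs before any state transition.
    if not is_legal(action):
        return GateResult(False, "illegal_action")
    # 4. Critic veto runs last, just before execution.
    if critic_vetoes(action):
        return GateResult(False, "critic_veto")
    return GateResult(True, "ok")
```

In this sketch a free-text payload such as `{"kind": "free_text", "text": "..."}` is rejected at step 1 with `schema_violation`, so it never reaches the legality verifier or the state.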

	## Clinical Trust Signals

	The environment reports:

	- legal/illegal action status
	- safety violations
	- drug-drug interaction (DDI) risk deltas
	- medication burden changes
	- uncertainty and abstention indicators
	- explanation grounding score
	- invalid action count
	- anti-cheat reasons

	This makes reward improvements auditable instead of relying on a single opaque scalar.
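As a rough sketch of what auditing a reward change against these signals could look like, the snippet below groups the reported signals into one record and lists reasons a reward gain deserves a closer look. The class name, field names, types, and the sign convention (a positive DDI delta meaning risk increased) are assumptions made for illustration, not the environment's actual response format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TrustSignals:
    """Hypothetical per-step record mirroring the signals listed above."""
    action_legal: bool
    safety_violations: List[str] = field(default_factory=list)
    ddi_risk_delta: float = 0.0        # assumed: positive means DDI risk went up
    medication_burden_delta: int = 0   # assumed: change in medication count
    uncertainty: float = 0.0
    abstained: bool = False
    explanation_grounding_score: float = 1.0
    invalid_action_count: int = 0
    anti_cheat_reasons: List[str] = field(default_factory=list)


def audit_flags(reward_delta: float, signals: TrustSignals) -> List[str]:
    """Return reasons a reward gain should not be taken at face value."""
    if reward_delta <= 0:
        return []  # only reward improvements are audited in this sketch
    flags: List[str] = []
    if not signals.action_legal:
        flags.append("illegal_action")
    if signals.safety_violations:
        flags.append("safety_violations")
    if signals.ddi_risk_delta > 0:
        flags.append("ddi_risk_increased")
    if signals.anti_cheat_reasons:
        flags.append("anti_cheat_triggered")
    return flags
```

An empty list means the reward gain is consistent with the reported signals; any flag names the specific signal that contradicts it.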

	## Explicit Non-Goals

	PolyGuard does not produce clinical orders, patient-specific prescriptions, or medical advice. It is an RL environment and demonstration system for training and evaluating medication-safety agents.