# Architecture
POLYGUARD-RL uses an OpenEnv-first monorepo architecture with six layers:
1. Data ingestion and retrieval index.
2. Predictive safety, graph, tabular risk, and dosing models.
3. Multi-agent orchestration graph.
4. Hierarchical RL training stack.
5. Safety governance and anti-cheat controls.
6. FastAPI, OpenEnv, and React deployment surfaces.
## Data Flow
```text
raw/local knowledge -> processed tables -> scenarios -> SFT/GRPO corpora
|
v
PolyGuardEnv reset/step/state
|
v
agent stack -> verifier reward -> training/evaluation reports
|
v
docs/results + README + HF Space
```
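The `PolyGuardEnv reset/step/state` stage in the diagram can be pictured as an ordinary episode loop. The sketch below is a toy stand-in rather than the project's code: `ToyEnv`, `run_episode`, the action names, and the reward values are all hypothetical, and nothing here calls the OpenEnv API.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToyEnv:
    """Hypothetical environment exposing reset/step/state."""

    max_steps: int = 3
    steps_taken: int = 0
    trace: list = field(default_factory=list)

    def reset(self) -> dict:
        # Start a fresh episode and return the initial state.
        self.steps_taken = 0
        self.trace = []
        return self.state()

    def state(self) -> dict:
        # Return a read-only snapshot of the current episode.
        return {"steps_taken": self.steps_taken, "trace": list(self.trace)}

    def step(self, action: str) -> tuple[dict, float, bool]:
        # The environment decides legality and reward (assumed action set).
        legal = action in {"keep", "revise", "stop"}
        reward = 1.0 if legal else -1.0
        self.steps_taken += 1
        self.trace.append((action, reward))
        done = self.steps_taken >= self.max_steps
        return self.state(), reward, done


def run_episode(env: ToyEnv, actions: list[str]) -> float:
    """Apply actions until the episode ends and return the summed reward."""
    env.reset()
    total = 0.0
    for action in actions:
        _, reward, done = env.step(action)
        total += reward
        if done:
            break
    return total
```

In this sketch the trace kept by the environment is what a verifier or report generator would read afterwards; the agent only supplies actions.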
## Runtime Boundaries
- Environment code owns state transition, legality, rewards, anti-cheat, and traces.
- Agent code owns candidate interpretation, routing, planning, critique, and explanation.
- Training code owns SFT, GRPO, reward logging, adapters, and registry metadata.
- Evaluation code owns baselines, perturbations, reports, and plots.
- Deployment code owns OpenEnv validation and HF Space push.
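
One way to picture the boundaries listed above is as separate interfaces: the agent proposes an action, the environment alone scores it, and training code only wires the two together and logs the result. This is a minimal sketch under that assumption; `Environment`, `Agent`, `RuleEnv`, `EchoAgent`, and `training_step` are hypothetical names, not identifiers from the repository.

```python
from __future__ import annotations

from typing import Protocol


class Environment(Protocol):
    """Owns legality and rewards (hypothetical interface)."""

    def score(self, action: str) -> float: ...


class Agent(Protocol):
    """Owns planning and never computes its own reward (hypothetical interface)."""

    def plan(self, observation: str) -> str: ...


class RuleEnv:
    def score(self, action: str) -> float:
        # The legality check lives in environment code only.
        return 1.0 if action.startswith("safe:") else 0.0


class EchoAgent:
    def plan(self, observation: str) -> str:
        # Toy planner: wraps the observation in an action string.
        return f"safe:{observation}"


def training_step(
    env: Environment, agent: Agent, observation: str, reward_log: list[float]
) -> float:
    """Training code connects agent and environment and logs the reward."""
    action = agent.plan(observation)
    reward = env.score(action)
    reward_log.append(reward)
    return reward
```

Keeping reward computation out of the agent and training code means a change to either cannot silently alter how actions are scored.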