Architecture

POLYGUARD-RL uses an OpenEnv-first monorepo architecture with six layers:

  1. Data ingestion and retrieval index.
  2. Predictive safety, graph, tabular risk, and dosing models.
  3. Multi-agent orchestration graph.
  4. Hierarchical RL training stack.
  5. Safety governance and anti-cheat controls.
  6. FastAPI, OpenEnv, and React deployment surfaces.

Data Flow

raw/local knowledge -> processed tables -> scenarios -> SFT/GRPO corpora
                                |
                                v
                       PolyGuardEnv reset/step/state
                                |
                                v
           agent stack -> verifier reward -> training/evaluation reports
                                |
                                v
                    docs/results + README + HF Space
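
The flow above can be illustrated with a small, self-contained sketch. Everything in it is an assumption made for illustration: `Scenario`, `ToyEnv`, `verifier_reward`, `first_legal_agent`, and `run_pipeline` are hypothetical names, the scenario schema is invented, and episodes are single-step. Only the `reset`/`step`/`state` method names come from the diagram; this is not the PolyGuardEnv implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Scenario:
    # Hypothetical scenario record; the real schema is not described here.
    scenario_id: str
    legal_actions: tuple[str, ...]
    preferred_action: str


def build_scenarios(rows: list[dict]) -> list[Scenario]:
    """Processed tables -> scenarios (toy stand-in for the data layer)."""
    return [Scenario(r["id"], tuple(r["legal"]), r["preferred"]) for r in rows]


def verifier_reward(scenario: Scenario, action: str) -> float:
    """Toy verifier: 1.0 for the preferred action, 0.0 otherwise."""
    return 1.0 if action == scenario.preferred_action else 0.0


@dataclass
class ToyEnv:
    """Hypothetical environment exposing reset/step/state as in the diagram."""
    scenario: Scenario
    done: bool = False
    trace: list[str] = field(default_factory=list)

    def reset(self) -> dict:
        self.done = False
        self.trace = []
        return self.state()

    def step(self, action: str) -> tuple[dict, float, bool]:
        self.trace.append(action)
        reward = verifier_reward(self.scenario, action)
        self.done = True  # single-step episodes keep the sketch short
        return self.state(), reward, self.done

    def state(self) -> dict:
        return {
            "scenario_id": self.scenario.scenario_id,
            "legal_actions": list(self.scenario.legal_actions),
            "done": self.done,
        }


def first_legal_agent(observation: dict) -> str:
    """Placeholder agent stack: always picks the first legal action."""
    return observation["legal_actions"][0]


def run_pipeline(rows: list[dict]) -> dict[str, float]:
    """Scenarios -> env episodes -> verifier reward -> per-scenario report."""
    report = {}
    for scenario in build_scenarios(rows):
        env = ToyEnv(scenario)
        observation = env.reset()
        _, reward, _ = env.step(first_legal_agent(observation))
        report[scenario.scenario_id] = reward
    return report
```

Calling `run_pipeline` on a list of row dictionaries with `id`, `legal`, and `preferred` keys returns a mapping from scenario id to reward, which stands in for the training/evaluation reports at the end of the flow.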

Runtime Boundaries

  • Environment code owns state transition, legality, rewards, anti-cheat, and traces.
  • Agent code owns candidate interpretation, routing, planning, critique, and explanation.
  • Training code owns SFT, GRPO, reward logging, adapters, and registry metadata.
  • Evaluation code owns baselines, perturbations, reports, and plots.
  • Deployment code owns OpenEnv validation and HF Space push.
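
As a rough sketch of the environment/agent boundary described above, the agent below only proposes an action, while the environment alone decides legality, assigns the reward, and records the trace. The names `Agent`, `BoundaryEnv`, `FixedAgent`, and `run_step`, the action labels, and the reward values of 1.0 and -1.0 are all hypothetical and chosen for illustration; they are not taken from the repository.

```python
from dataclasses import dataclass, field
from typing import Protocol


class Agent(Protocol):
    """Agent side of the boundary: it interprets an observation and proposes."""

    def propose(self, observation: dict) -> str: ...


@dataclass
class BoundaryEnv:
    """Environment side: owns legality checks, rewards, and the trace."""
    legal_actions: frozenset[str]
    trace: list[dict] = field(default_factory=list)

    def observe(self) -> dict:
        return {"legal_actions": sorted(self.legal_actions)}

    def step(self, action: str) -> float:
        legal = action in self.legal_actions
        reward = 1.0 if legal else -1.0  # illustrative values only
        # The agent never writes to the trace; only the environment does.
        self.trace.append({"action": action, "legal": legal, "reward": reward})
        return reward


@dataclass
class FixedAgent:
    """Toy agent that always proposes the same action."""
    choice: str

    def propose(self, observation: dict) -> str:
        return self.choice


def run_step(env: BoundaryEnv, agent: Agent) -> float:
    """The agent proposes; the environment judges and logs."""
    return env.step(agent.propose(env.observe()))
```

Keeping reward and legality inside the environment means an agent cannot raise its own score by misreporting an outcome, which is one plausible reading of why anti-cheat sits on the environment side of the boundary.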