---
title: Gov Workflow OpenEnv
sdk: docker
app_port: 7860
pinned: false
---
# Gov Workflow OpenEnv
## Quick Links
- Hugging Face Space URL (placeholder, to be updated): [https://huggingface.co/spaces/your-username/your-space-name](https://huggingface.co/spaces/your-username/your-space-name)
  This placeholder will be replaced with the final deployed demo link.
- Blog path in codebase: `OPENENV_RL/Blog.md`
  Project write-up covering design choices and outcomes.
- Notebook path: `OPENENV_RL/GovWorkflow_RL_ENV.ipynb`
  Main notebook for the OpenEnv RL government workflow, intended as the reference for judges. It contains the judging context, environment setup, and the full end-to-end flow in one place.
- Notebook Colab URL: [https://colab.research.google.com/drive/1ssTnxKoU1nOfSNA3nOeiNM8S4fKFpkby?usp=sharing](https://colab.research.google.com/drive/1ssTnxKoU1nOfSNA3nOeiNM8S4fKFpkby?usp=sharing)
  Cloud version of the same notebook, so judges can run and review the complete workflow without local setup.
- GRPO Phase 1 training link: [https://colab.research.google.com/drive/1ND_DZ6xcT2JuH7uGB2AYbiZ1dcHKFfIw?usp=sharing](https://colab.research.google.com/drive/1ND_DZ6xcT2JuH7uGB2AYbiZ1dcHKFfIw?usp=sharing)
  First-stage GRPO training run, where the LLM agent starts learning policy behavior inside the RL environment.
- GRPO Phase 2 training link: [https://colab.research.google.com/drive/1ofxEADct_gTX5DGhcnk8lW6p31gFCIFV?usp=sharing](https://colab.research.google.com/drive/1ofxEADct_gTX5DGhcnk8lW6p31gFCIFV?usp=sharing)
  Second-stage GRPO continuation, where the same LLM agent is trained further on the RL environment.
- PPO Phase 1 training (local): `rl/train_ppo.py`
  Phase 1 PPO training was run on a local machine to establish the PPO baseline before Phase 2.
- PPO Phase 2 training link: [https://colab.research.google.com/drive/1RVXQs-QAuXLBw0YXJtN4cbEootCTfHO7?usp=sharing](https://colab.research.google.com/drive/1RVXQs-QAuXLBw0YXJtN4cbEootCTfHO7?usp=sharing)
  PPO Phase 2 training notebook, where the PPO policy is trained further on the same environment to improve performance.

Gov Workflow OpenEnv is a FastAPI-first simulation environment for public service workflow operations.
It models queue prioritization, officer allocation, missing-document recovery, escalation usage, and fairness-aware SLA management across government services.
This repository is set up to run in three settings:
- local development (FastAPI + Vite)
- Docker runtime
- Hugging Face Spaces (Docker SDK)
## Current Main-Branch Status
This README is aligned to the current `main` branch code paths, including:
- `app.main:app` as primary server runtime
- React UI served at `/ui` from built Vite assets when available
- OpenEnv contract endpoints (`/reset`, `/step`, `/state`, `/grade`)
- frontend API aliases (`/api/*`) and versioned aliases (`/api/v1/*`)
- training story endpoints (`/training/*`)
- simulation, RL, persistence, compliance, and history endpoints
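The four OpenEnv contract endpoints can be exercised from a small client. The following is a minimal sketch, not the project's own client: the class name `OpenEnvClient`, the HTTP method chosen for each endpoint, and the `{"task_id": ...}` reset payload are all assumptions made for illustration. The authoritative request and response schemas are whatever the running server publishes at `/docs`.

```python
import json
from typing import Any, Callable, Optional
from urllib import request

# A transport takes (method, url, json_body) and returns the decoded JSON reply.
Transport = Callable[[str, str, Optional[dict]], Any]


def urllib_transport(method: str, url: str, body: Optional[dict]) -> Any:
    """Send one JSON request with the standard library and decode the reply."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = request.Request(
        url,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


class OpenEnvClient:
    """Hypothetical thin wrapper over /reset, /step, /state, and /grade."""

    def __init__(
        self,
        base_url: str = "http://127.0.0.1:7860",
        transport: Transport = urllib_transport,
    ) -> None:
        self.base_url = base_url.rstrip("/")
        self.transport = transport

    def reset(self, task_id: str) -> Any:
        # ASSUMPTION: POST with a {"task_id": ...} body; check /docs.
        return self.transport("POST", f"{self.base_url}/reset", {"task_id": task_id})

    def step(self, action: dict) -> Any:
        # ASSUMPTION: POST with the action object as the JSON body.
        return self.transport("POST", f"{self.base_url}/step", action)

    def state(self) -> Any:
        # ASSUMPTION: GET without a body.
        return self.transport("GET", f"{self.base_url}/state", None)

    def grade(self) -> Any:
        # ASSUMPTION: GET without a body.
        return self.transport("GET", f"{self.base_url}/grade", None)
```

With the backend running locally, `OpenEnvClient().reset("district_backlog_easy")` would start an episode if the assumed payload matches the server. The transport is injectable so the same wrapper can be pointed at a stub in tests without opening a socket.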
## End-to-End Architecture
```mermaid
flowchart LR
UI["React UI"] --> API["FastAPI app.main"]
API --> ENV["GovWorkflowEnv app/env.py"]
API --> SIM["Simulation runtime app/simulator.py"]
API --> RL["RL train/eval rl/*"]
API --> STORE["PersistenceStore SQLite + filesystem"]
API --> STORY["Training Story router /training/*"]
API --> OPENENV["Optional OpenEnv adapter /openenv/*"]
```
## Core Runtime Components
- API server: `app/main.py`
- Environment kernel: `app/env.py`
- Typed models: `app/models.py`
- Task registry: `app/tasks.py`
- Reward shaping: `app/reward.py`
- Deterministic graders: `app/graders.py`
- Simulation runtime: `app/simulator.py`
- Training jobs manager: `app/training_jobs.py`
- Persistence layer: `app/persistence.py`
- Transport gateway: `app/api_gateway.py`
- React frontend: `frontend/react`
## Task Set (Current Runtime)
Configured in `app/tasks.py`:
- `district_backlog_easy`
- `mixed_urgency_medium`
- `cross_department_hard`
- `district_backlog_easy_extreme`

Benchmark list used by APIs:
- `district_backlog_easy`
- `mixed_urgency_medium`
- `cross_department_hard`
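The two lists differ by one entry, which is easy to miss. As a small illustration (the constant and function names here are hypothetical and are not taken from `app/tasks.py`), the difference can be computed directly from the task IDs listed above:

```python
from typing import Iterable, List

# Task IDs copied from the two lists in this section.
CONFIGURED_TASKS = [
    "district_backlog_easy",
    "mixed_urgency_medium",
    "cross_department_hard",
    "district_backlog_easy_extreme",
]
BENCHMARK_TASKS = [
    "district_backlog_easy",
    "mixed_urgency_medium",
    "cross_department_hard",
]


def non_benchmark_tasks(configured: Iterable[str], benchmark: Iterable[str]) -> List[str]:
    """Return configured task IDs that are not in the benchmark list, in order."""
    bench = set(benchmark)
    return [task for task in configured if task not in bench]


def validate_task_id(task_id: str) -> str:
    """Raise ValueError for IDs that are not in the configured list."""
    if task_id not in CONFIGURED_TASKS:
        raise ValueError(f"unknown task id: {task_id!r}")
    return task_id


print(non_benchmark_tasks(CONFIGURED_TASKS, BENCHMARK_TASKS))
# → ['district_backlog_easy_extreme']
```

So `district_backlog_easy_extreme` is configured but is not part of the benchmark list.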
## Service Coverage
`ServiceType` includes:
- `passport`
- `driving_license`
- `aadhaar_card`
- `gst_registration`
- `income_certificate`
- `caste_certificate`
- `birth_certificate`
- `land_registration`

Medium and hard tasks currently run with:
- `income_certificate`
- `land_registration`
- `passport`
- `driving_license`
- `aadhaar_card`
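For readers who want to work with these values in scripts, here is one way the service list could be mirrored as a string-valued enum. This is a sketch built only from the values named above; the real `ServiceType` lives in `app/models.py` and may use different member names or include more entries. `MEDIUM_HARD_SERVICES` and `not_in_medium_hard` are hypothetical helper names.

```python
from enum import Enum
from typing import List


class ServiceType(str, Enum):
    # Values copied from the list above; member names are an assumption.
    PASSPORT = "passport"
    DRIVING_LICENSE = "driving_license"
    AADHAAR_CARD = "aadhaar_card"
    GST_REGISTRATION = "gst_registration"
    INCOME_CERTIFICATE = "income_certificate"
    CASTE_CERTIFICATE = "caste_certificate"
    BIRTH_CERTIFICATE = "birth_certificate"
    LAND_REGISTRATION = "land_registration"


# Services the README lists for the medium and hard tasks.
MEDIUM_HARD_SERVICES = frozenset(
    {
        ServiceType.INCOME_CERTIFICATE,
        ServiceType.LAND_REGISTRATION,
        ServiceType.PASSPORT,
        ServiceType.DRIVING_LICENSE,
        ServiceType.AADHAAR_CARD,
    }
)


def not_in_medium_hard() -> List[str]:
    """Return the service values that the medium and hard tasks do not use."""
    return sorted(s.value for s in ServiceType if s not in MEDIUM_HARD_SERVICES)


print(not_in_medium_hard())
# → ['birth_certificate', 'caste_certificate', 'gst_registration']
```

Subclassing `str` keeps the members JSON-friendly and lets `ServiceType("passport")` parse a raw string from an API payload.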
## Local Development
### Prerequisites
- Python 3.11+
- Node 20+
- Docker
### Install dependencies
```bash
pip install -r requirements.txt
pip install -r requirements_rl.txt
pip install pytest pytest-asyncio
npm --prefix frontend/react install
```
### Configure environment
```bash
# macOS/Linux shell; on Windows cmd use: copy .env.example .env
cp .env.example .env
```
Populate as needed:
- `API_BASE_URL`
- `MODEL_NAME`
- `HF_TOKEN` or `OPENAI_API_KEY`/`API_KEY`
- optional NVIDIA keys (`NVIDIA_API_KEY`, `NVIDIA_API_KEY_2`)
- storage settings (`STORAGE_ENABLED`, `OPENENV_DATA_DIR`)
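To make the "A or B" wording concrete, the sketch below shows one way a script could resolve these variables. It is illustrative only: the function names, the precedence order `HF_TOKEN` → `OPENAI_API_KEY` → `API_KEY`, and the truthy strings accepted for `STORAGE_ENABLED` are assumptions, and the application's own loading logic may differ.

```python
import os
from typing import Any, Dict, Mapping, Optional


def first_set(env: Mapping[str, str], *names: str) -> Optional[str]:
    """Return the first non-blank value among the given variable names."""
    for name in names:
        value = env.get(name, "").strip()
        if value:
            return value
    return None


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Collect the variables named in this README into one dictionary."""
    nvidia_candidates = (
        first_set(env, name) for name in ("NVIDIA_API_KEY", "NVIDIA_API_KEY_2")
    )
    return {
        "api_base_url": first_set(env, "API_BASE_URL"),
        "model_name": first_set(env, "MODEL_NAME"),
        # ASSUMPTION: HF_TOKEN wins over OPENAI_API_KEY, which wins over API_KEY.
        "api_key": first_set(env, "HF_TOKEN", "OPENAI_API_KEY", "API_KEY"),
        # Optional keys: keep only the ones that are actually set.
        "nvidia_keys": [key for key in nvidia_candidates if key],
        # ASSUMPTION: these strings count as "enabled"; anything else is False.
        "storage_enabled": (first_set(env, "STORAGE_ENABLED") or "").lower()
        in {"1", "true", "yes", "on"},
        "data_dir": first_set(env, "OPENENV_DATA_DIR"),
    }
```

Passing a plain dictionary instead of `os.environ` makes the resolution easy to check, for example `load_settings({"API_KEY": "k"})["api_key"]` returns `"k"`.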
### Run backend
```bash
python scripts/run_local.py --host 127.0.0.1 --port 7860 --reload
```
### Run frontend
```bash
npm --prefix frontend/react run dev
```
Open:
- UI: `http://127.0.0.1:5173/ui`
- API docs: `http://127.0.0.1:7860/docs`
## Repository Layout
```text
app/
main.py FastAPI app + API routing + compatibility aliases
env.py GovWorkflowEnv kernel
models.py Typed Pydantic contracts
tasks.py Runtime task registry
reward.py Reward shaping
graders.py Deterministic graders
simulator.py Simulation runtime and live sessions
training_jobs.py Background RL training manager
persistence.py SQLite/filesystem persistence
api_gateway.py direct/http/auto environment transport layer
story_router.py training story endpoints
rl/
gov_workflow_env.py Gym adapter
train_ppo.py PPO phase training entrypoint
evaluate.py Checkpoint evaluator
feature_builder.py RL feature engineering
action_mask.py Action mask logic
frontend/react/
src/ React modules/components/api hooks
scripts/
run_local.py Local FastAPI launcher
convert_grpo_csv.py Training CSV to JSON converter for story endpoints
openenv.yaml OpenEnv manifest metadata
baseline_openai.py Baseline and LLM runner
inference.py Submission-style inference runner
Dockerfile Docker image definition
```
## License
BSD-3-Clause