# Architecture
## System Overview
```
┌──────────────────────────────────────────────────────────────┐
│                 HF Space / Docker Container                  │
│                                                              │
│  ┌──────────────┐    ┌──────────────────────────────────┐    │
│  │ Gradio UI    │    │ FastAPI Server                   │    │
│  │ (port 7860)  │    │ POST /reset    GET /state        │    │
│  │              │    │ POST /step     GET /health       │    │
│  └──────┬───────┘    └──────────────┬───────────────────┘    │
│         │                           │                        │
│         └──────────┬────────────────┘                        │
│                    │                                         │
│         ┌──────────▼──────────────┐                          │
│         │ InvoiceExceptionEnv     │                          │
│         │ reset() step() state()  │                          │
│         │ grade()                 │                          │
│         └──────────┬──────────────┘                          │
│                    │                                         │
│         ┌──────────▼──────────────┐                          │
│         │ Task Registry           │                          │
│         │ task1_price_variance    │                          │
│         │ task2_duplicate_tax     │                          │
│         │ task3_compound_fraud    │                          │
│         └─────────────────────────┘                          │
└──────────────────────────────────────────────────────────────┘
```
## Key Design Decisions
### FastAPI + Gradio in same process
A Hugging Face Space exposes a single port (7860 here). Gradio is mounted on
the FastAPI app with `gr.mount_gradio_app()`, so the validator API and the
interactive UI share one process and one port.
### Pydantic v2 for all models
Required by the OpenEnv spec. Every field is typed, and `Any` is allowed only
where the reason is documented explicitly.
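
As an illustration of what a fully typed Pydantic v2 model can look like, here is a small sketch. The class name `StepAction` and all of its fields are hypothetical; the real models are defined by the project and the OpenEnv spec.

```python
from typing import Literal, Optional

from pydantic import BaseModel, ConfigDict, Field, ValidationError


class StepAction(BaseModel):
    """Hypothetical action model; field names are illustrative only."""

    # Reject unknown keys instead of silently ignoring them.
    model_config = ConfigDict(extra="forbid")

    action_type: Literal["inspect", "approve", "reject"]
    document_id: Optional[str] = Field(default=None, description="Target document, if any")
    reason: str = Field(default="", max_length=500)


# Valid payloads are parsed into typed attributes.
action = StepAction.model_validate({"action_type": "inspect", "document_id": "PO-1"})

# An action_type outside the Literal set fails validation.
try:
    StepAction.model_validate({"action_type": "delete"})
    rejected = False
except ValidationError:
    rejected = True
```

Typing every field this way lets malformed requests fail at the API boundary rather than inside the environment.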
### EpisodeData vs EnvironmentState
- **EpisodeData** is mutable internal state tracking what the agent has done.
- **EnvironmentState** is the immutable snapshot returned to the agent.
- Documents (PO, Invoice, GRN) are rebuilt from task factories each time,
  ensuring they are never accidentally mutated.
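
The split can be sketched as follows. To keep the example dependency-free, standard-library dataclasses stand in for the Pydantic models, and every field name plus the `make_invoice` and `snapshot` helpers are assumptions made for illustration.

```python
from dataclasses import FrozenInstanceError, dataclass, field


@dataclass
class EpisodeData:
    """Mutable internal record of the episode (hypothetical fields)."""
    step_count: int = 0
    actions_taken: list[str] = field(default_factory=list)


@dataclass(frozen=True)
class EnvironmentState:
    """Read-only snapshot handed to the agent (hypothetical fields)."""
    step_count: int
    actions_taken: tuple[str, ...]
    invoice: dict


def make_invoice() -> dict:
    # Factory: builds a fresh document on every call, so a caller that
    # mutates one copy cannot affect later snapshots.
    return {"invoice_id": "INV-001", "total": 100.0}


def snapshot(episode: EpisodeData) -> EnvironmentState:
    return EnvironmentState(
        step_count=episode.step_count,
        actions_taken=tuple(episode.actions_taken),
        invoice=make_invoice(),
    )


episode = EpisodeData()
first = snapshot(episode)
first.invoice["total"] = 0.0          # tampering with the returned copy...
episode.actions_taken.append("inspect")
episode.step_count += 1
second = snapshot(episode)            # ...does not leak into the next snapshot

# Reassigning a field on the snapshot itself is refused.
try:
    second.step_count = 99
    frozen = False
except FrozenInstanceError:
    frozen = True
```

With Pydantic v2 the same effect is usually obtained through a frozen model configuration; the dataclass version only demonstrates the idea.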
### Separate task classes
Each task is a self-contained class with its own documents, simulators, and
grader. This makes it trivial to add new tasks: just implement `BaseTask` and
register it in `TASK_REGISTRY`.
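
A rough sketch of what adding a task could look like. The `BaseTask` methods shown here, the `register` decorator, and the example task `task4_missing_grn` are all assumptions; the real interface may differ.

```python
from abc import ABC, abstractmethod


class BaseTask(ABC):
    """Assumed interface; the real BaseTask may define more methods."""

    task_id: str

    @abstractmethod
    def build_documents(self) -> dict:
        """Return fresh PO / invoice / GRN documents."""

    @abstractmethod
    def grade(self, actions: list[str]) -> float:
        """Score an episode in [0.0, 1.0]."""


TASK_REGISTRY: dict[str, type[BaseTask]] = {}


def register(cls: type[BaseTask]) -> type[BaseTask]:
    # Hypothetical decorator helper; the project may fill the dict by hand.
    TASK_REGISTRY[cls.task_id] = cls
    return cls


@register
class Task4MissingGRN(BaseTask):
    """Made-up task used only to illustrate registration."""

    task_id = "task4_missing_grn"

    def build_documents(self) -> dict:
        return {"po": {"qty": 10}, "invoice": {"qty": 10}, "grn": None}

    def grade(self, actions: list[str]) -> float:
        return 1.0 if "flag_missing_grn" in actions else 0.0


# The environment can then look tasks up by id.
task = TASK_REGISTRY["task4_missing_grn"]()
score = task.grade(["inspect_grn", "flag_missing_grn"])
```

Keeping documents and grader inside the task class means the environment itself needs no changes when a task is added.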
### Deterministic simulation
No randomness in simulators or graders. Same seed + same actions = same scores.
The only randomness is in `action_space_sample()` for baseline agents.
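
The reproducibility property can be illustrated with a small sketch. The action names, the toy `grade` function, and the signature of `action_space_sample` (taking a seeded `random.Random`) are assumptions, not the project's actual definitions.

```python
import random

ACTIONS = ["inspect", "approve", "reject"]  # hypothetical action names


def action_space_sample(rng: random.Random) -> str:
    # The only random call; it draws from a caller-supplied seeded RNG.
    return rng.choice(ACTIONS)


def grade(actions: list[str]) -> float:
    # Pure function of the action list: no clock, no RNG, no global state.
    if not actions:
        return 0.0
    return round(actions.count("inspect") / len(actions), 4)


def run_baseline(seed: int, steps: int = 5) -> tuple[list[str], float]:
    rng = random.Random(seed)
    actions = [action_space_sample(rng) for _ in range(steps)]
    return actions, grade(actions)


# Two runs with the same seed produce identical actions and scores.
run_a = run_baseline(seed=42)
run_b = run_baseline(seed=42)
```

Passing the RNG in explicitly keeps the randomness confined to the baseline agent, so the grader stays a pure function of the actions.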