Add Sandbox Policy section per user mandate
AGENTS.md
CHANGED
@@ -139,6 +139,21 @@ env = ActionMasker(Monitor(base_env), ...) # DON'T DO THIS
 
 ---
 
+## Sandbox Policy (User Mandate)
+
+> **From this point forward, the user has mandated:**
+
+1. **Start `cpu-basic` sandbox** at the beginning of every session
+2. **Use `cpu-basic` for**: context, writing code, writing docs, editing files, planning
+3. **Only switch to GPU sandbox** (`t4-small` or `a10g-small`) when performing **smoke tests** for training scripts
+4. **Stop GPU sandbox IMMEDIATELY** after the smoke test completes
+5. **Training tasks ONLY as HF Jobs**; never leave a training process running in a sandbox
+6. **Never leave a GPU sandbox running idle**; this wastes money
+
+**Why this matters**: A GPU sandbox at $1/hr running empty for 3 hours = $3 wasted for nothing. An HF Job at the same $1/hr actually trains for every billed minute.
+
+---
+
 ## Repo File Guide
 
 | File | What It Is |
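
Review note: the policy added in this hunk is prose only. One way rule 4 (stop the GPU sandbox immediately after the smoke test) and the idle-cost arithmetic might be made mechanical is sketched below. This is a minimal sketch, not part of the commit: `gpu_smoke_test`, `idle_cost`, and the `start`/`stop` callables are hypothetical names, and the callables stand in for whatever sandbox tooling is actually available, since the diff does not name any API.

```python
from contextlib import contextmanager
from typing import Callable, Iterator

# GPU flavors named in the policy text; anything else is rejected.
GPU_FLAVORS = {"t4-small", "a10g-small"}


@contextmanager
def gpu_smoke_test(
    flavor: str,
    start: Callable[[str], str],
    stop: Callable[[str], None],
) -> Iterator[str]:
    """Start a GPU sandbox for a smoke test and always stop it afterwards.

    `start` and `stop` are hypothetical callables supplied by the caller;
    they are assumptions, not a real sandbox API.
    """
    if flavor not in GPU_FLAVORS:
        raise ValueError(f"not a GPU flavor allowed by the policy: {flavor}")
    sandbox_id = start(flavor)
    try:
        yield sandbox_id
    finally:
        # Rule 4: stop the sandbox as soon as the smoke test ends,
        # including when the test itself raises.
        stop(sandbox_id)


def idle_cost(rate_per_hour: float, idle_hours: float) -> float:
    """Money spent on a sandbox that is billed but doing no work."""
    return rate_per_hour * idle_hours
```

Used as `with gpu_smoke_test("t4-small", start, stop) as sandbox_id: ...`, the `finally` clause runs on both success and failure, so the sandbox cannot outlive the test. With the figures from the policy, `idle_cost(1.0, 3.0)` gives the $3 quoted there.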