E-Rong committed on
Commit 227d906 · verified · 1 Parent(s): 9bdda09

Add development workflow section to AGENTS.md

Files changed (1)
  1. AGENTS.md +14 -0
AGENTS.md CHANGED
@@ -35,6 +35,20 @@
 
  ---
 
+ ## Development Workflow (follow exactly)
+
+ 1. **Write on `cpu-basic`**: code, docs, scripts, planning. Never use a GPU sandbox for editing.
+
+ 2. **Smoke-test on a GPU sandbox** (`t4-small` or `a10g-small`): run the script for 5 to 10 minutes to verify that it loads the environment, runs training steps, and can push a checkpoint. **Stop the GPU sandbox immediately** afterwards, whether the test passes or fails. Never leave it idle.
+
+ 3. **If the smoke test fails**: consult the Hugging Face documentation (`explore_hf_docs`, `fetch_hf_docs`) or other relevant docs to diagnose the issue. Iterate on what you learn, then go back to step 1.
+
+ 4. **If the smoke test passes**: update `docs/ae.md` with the current project status and `AGENTS.md` with anything new you learned. Push both to the Hub before proceeding.
+
+ 5. **Submit the real Job** (`a10g-small`, `a10g-large`, etc.). Immediately check `hf_jobs logs` to confirm that it started successfully. **Poll the job every 5 minutes** until the user interrupts you. Between polls, work on docs or scripts for upcoming phases, but keep checking the job.
+
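The polling rule in step 5 can be sketched as a small loop. This is only an illustration, not part of the commit: `poll_job`, `get_stage`, and the stage names in `TERMINAL_STAGES` are hypothetical, and the status lookup is passed in as a callable rather than tied to any particular Jobs API. Unlike the rule as written ("until the user interrupts you"), the sketch also stops on its own once the job reaches a terminal stage.

```python
import time
from typing import Callable, List, Optional

# Assumed stage names; adjust to whatever the job-status tool actually reports.
TERMINAL_STAGES = {"COMPLETED", "ERROR", "CANCELED", "DELETED"}


def poll_job(
    get_stage: Callable[[], str],
    interval_s: float = 300.0,  # 5 minutes, as in step 5
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Call get_stage() every interval_s seconds and return the stages seen.

    Stops when a terminal stage is observed or after max_polls checks
    (None means no limit; a KeyboardInterrupt still ends the loop).
    """
    seen: List[str] = []
    while max_polls is None or len(seen) < max_polls:
        stage = get_stage()
        seen.append(stage)
        if stage in TERMINAL_STAGES:
            break
        # Polling downtime: this is where docs or scripts work would happen.
        sleep(interval_s)
    return seen
```

In practice `get_stage` would wrap whichever status check is available for the submitted job, and the `sleep` parameter exists mainly so the loop can be exercised without waiting.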
+ ---
+
  ## How to Submit an HF Job (the only way that works)
 
  ```python