E-Rong
/

til-26-ae-agent

E-Rong committed about 15 hours ago

Commit

227d906

verified ·

1 Parent(s): 9bdda09

Add development workflow section to AGENTS.md

Files changed (1)

AGENTS.md CHANGED

@@ -35,6 +35,20 @@
 ---
 ## How to Submit an HF Job (the only way that works)
 ```python

 ---
+## Development Workflow (follow exactly)
+1. **Write on `cpu-basic`**: code, docs, scripts, and planning. Never use a GPU sandbox for editing.
+2. **Smoke-test on a GPU sandbox** (`t4-small` or `a10g-small`): run the script for 5-10 minutes to verify that it loads the env, runs training steps, and can push a checkpoint. **Stop the GPU sandbox immediately**, whether the test passes or fails. Never leave it idle.
+3. **If the smoke test fails**: consult the Hugging Face documentation (`explore_hf_docs`, `fetch_hf_docs`) or other relevant docs to diagnose the issue, iterate on what you learn, and go back to step 1.
+4. **If the smoke test passes**: update `docs/ae.md` with the current project status and update `AGENTS.md` with anything new you learned. Push both to the Hub before proceeding.
+5. **Submit the real Job** (`a10g-small`, `a10g-large`, etc.). Immediately check `hf_jobs logs` to confirm that it starts successfully. **Poll the job every 5 minutes** until the user interrupts you. Between polls, work on docs or scripts for upcoming phases, but keep checking the job.
+---
 ## How to Submit an HF Job (the only way that works)
 ```python