Don Rishabh Claude Opus 4.7 (1M context) committed on
Commit 96d773b · 1 Parent(s): e51b5ef
training/TRAINING.md: add "Quick start — just run the .sh" subsection
For users who want to reproduce the run without reading 8 sections of
detail. Two setup commands (hf auth login + PUSH_TO_HUB override),
four one-liner dispatches, and the common gotchas (Llama gating, HF
Jobs quota, whoami rate-limit, REPO_URL/TARGET_MODEL overrides).
The deeper §1-§8 sections remain for users who want to customize
hyperparameters or build artifacts locally.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- training/TRAINING.md +33 -0
training/TRAINING.md
@@ -35,6 +35,39 @@ Each `.sh` is a thin wrapper around the [`hf jobs`](https://huggingface.co/docs/
- Don't run on your laptop. No local GPU required.
- The CSV-builder (`build_before_after_csv.py`), plot-renderer (`make_plots.py`), and Trackio-replayer (`replay_to_trackio.py`) **don't** have `.sh` wrappers — they're cheap CPU-only scripts you run locally after the GPU jobs finish.
### Quick start — just run the .sh

For most users, two setup commands are all it takes before dispatching real training jobs:

```bash
# 1. HF write token (free account; takes 30 seconds)
hf auth login        # paste a write token

# 2. Override the default PUSH_TO_HUB so artifacts go to your namespace
export PUSH_TO_HUB=your-username/your-adapter-repo
```
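
Before dispatching, it can help to confirm that `PUSH_TO_HUB` is set and looks like `namespace/repo`. The helper below is only a sketch and is not part of the repository: the `check_push_target` name and the exact pattern it accepts are assumptions made for illustration.

```bash
# Hypothetical helper (not in the repo): succeeds only when the value
# looks like "<namespace>/<repo>" with exactly one slash.
check_push_target() {
  case "$1" in
    ""|/*|*/|*/*/*) return 1 ;;  # empty, leading/trailing slash, or too many slashes
    */*) return 0 ;;             # exactly one slash with text on both sides
    *) return 1 ;;               # no slash at all
  esac
}

# Warn (without aborting) if the current value does not fit the pattern.
check_push_target "${PUSH_TO_HUB:-}" \
  || echo "PUSH_TO_HUB should look like your-username/your-adapter-repo" >&2
```

The check is purely syntactic; it does not verify that the repository exists or that your token can write to it.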

Then run any of the following:

```bash
bash training/hf_job_profile.sh              # ≈30m on L4
bash training/hf_job_train.sh                # ≈3h on L40S
bash training/hf_job_eval.sh both            # 2 × ≈15m on L40S
SFT_ADAPTER=$PUSH_TO_HUB \
  bash training/hf_job_train_multistep.sh    # ≈3.5h on L40S
```

Each call returns a job ID immediately (`hf jobs run --detach`). Monitor with `hf jobs ps -a` and `hf jobs logs <job-id> --follow`.
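
Because each wrapper returns right away, it is tempting to fire several in a row from one script. The sketch below spaces the dispatches out instead, which may help with the `whoami` rate-limit described in this section. It is hypothetical and not part of the repository: the `dispatch_paced` name, the `PAUSE_SECONDS` variable, and its 300-second default are all assumptions, not documented limits.

```bash
# Hypothetical helper (not in the repo): run each command string in turn,
# sleeping PAUSE_SECONDS between dispatches (never before the first one).
dispatch_paced() {
  _pause="${PAUSE_SECONDS:-300}"   # assumed spacing; tune for your account
  _first=1
  for _cmd in "$@"; do
    if [ "$_first" -eq 0 ]; then
      sleep "$_pause"
    fi
    _first=0
    sh -c "$_cmd" || return 1      # stop at the first failed dispatch
  done
}

# Usage (commented out so that pasting this block dispatches nothing):
# dispatch_paced "bash training/hf_job_profile.sh" "bash training/hf_job_train.sh"
```

Stopping at the first failure is a deliberate choice here: a quota or gating error on one job usually means the next dispatch would fail the same way.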

**Common gotchas before you dispatch:**
- **Llama-3.2 is gated.** Accept the license at <https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct> first; otherwise the job dies at model download. (Approval is usually instant.)
- **HF Jobs quota.** Free accounts get a small monthly GPU budget, and the main training run (≈3h on L40S) burns several hours of it. If you're out of quota, the script returns a quota error within 30 seconds and nothing has actually started.
- **`whoami` rate-limit.** Dispatching 4–5 jobs in rapid succession will lock you out for 5–25 min. Pace dispatches; don't poll-loop.
- **Default `REPO_URL`** clones `https://huggingface.co/spaces/rishabh16196/prompt_golf_env`. If you've forked the env, override: `REPO_URL=https://huggingface.co/spaces/your-name/your-fork bash ...`.
- **Default `TARGET_MODEL`** is `meta-llama/Llama-3.2-3B-Instruct`. Override per call: `TARGET_MODEL=microsoft/Phi-3-mini-4k-instruct bash training/hf_job_train.sh`.
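
The per-call overrides work because a variable assignment placed in front of a command is passed into that command's environment. How the wrappers consume those variables is not shown here; the sketch below assumes the common `${VAR:-default}` pattern, using the two defaults listed above. The `resolve_overrides` name is hypothetical and the real scripts may be written differently.

```bash
# Sketch of the "${VAR:-default}" pattern; the real wrappers may differ.
# resolve_overrides is a hypothetical name used only for illustration.
resolve_overrides() {
  # Keep the caller's value when set and non-empty, else fall back to the default.
  REPO_URL="${REPO_URL:-https://huggingface.co/spaces/rishabh16196/prompt_golf_env}"
  TARGET_MODEL="${TARGET_MODEL:-meta-llama/Llama-3.2-3B-Instruct}"
  printf 'REPO_URL=%s\nTARGET_MODEL=%s\n' "$REPO_URL" "$TARGET_MODEL"
}
```

With nothing set, the function prints both defaults; with `TARGET_MODEL` set beforehand, only that line changes.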

That is the entire reproduction path. The §1–§8 sections below go deeper if you want to understand individual steps, customize hyperparameters, or build the demo CSV, plots, and Trackio replay locally after the jobs finish.

---

## 0. Prerequisites