Don Rishabh Claude Opus 4.7 (1M context) committed on
Commit 96d773b · 1 Parent(s): e51b5ef

training/TRAINING.md: add "Quick start — just run the .sh" subsection

For users who want to reproduce the run without reading 8 sections of
detail. Two setup commands (hf auth login + PUSH_TO_HUB override),
four one-liner dispatches, and the common gotchas (Llama gating, HF
Jobs quota, whoami rate-limit, REPO_URL/TARGET_MODEL overrides).
The deeper §1-§8 sections remain for users who want to customize
hyperparameters or build artifacts locally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Files changed (1)
  1. training/TRAINING.md +33 -0
training/TRAINING.md CHANGED
@@ -35,6 +35,39 @@ Each `.sh` is a thin wrapper around the [`hf jobs`](https://huggingface.co/docs/
  - Don't run on your laptop. No local GPU required.
  - The CSV-builder (`build_before_after_csv.py`), plot-renderer (`make_plots.py`), and Trackio-replayer (`replay_to_trackio.py`) **don't** have `.sh` wrappers — they're cheap CPU-only scripts you run locally after the GPU jobs finish.
 
+ ### Quick start — just run the .sh
+
+ For most users, two setup commands are enough to start dispatching real training jobs:
+
+ ```bash
+ # 1. HF write token (free account; takes 30 seconds)
+ hf auth login  # paste a write token
+
+ # 2. Override the default PUSH_TO_HUB so artifacts go to your namespace
+ export PUSH_TO_HUB=your-username/your-adapter-repo
+ ```
+
+ Then run any of:
+
+ ```bash
+ bash training/hf_job_profile.sh  # ≈30m on L4
+ bash training/hf_job_train.sh  # ≈3h on L40S
+ bash training/hf_job_eval.sh both  # 2 × ≈15m on L40S
+ SFT_ADAPTER=$PUSH_TO_HUB \
+ bash training/hf_job_train_multistep.sh  # ≈3.5h on L40S
+ ```
+
+ Each call returns a job ID immediately (`hf jobs run --detach`). Monitor with `hf jobs ps -a` and `hf jobs logs <job-id> --follow`.
+
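Because each dispatch returns immediately, it is easy to fire several in a row and run into the `whoami` rate-limit described in the gotchas. The following is a minimal sketch of one way to space dispatches out. It is not part of the commit: `dispatch_paced` and `PACE_SECONDS` are hypothetical names, and the 60-second default is an arbitrary assumption, not a documented limit.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the repo): run dispatch commands one at a
# time, pausing between them so they are not fired in rapid succession.
# PACE_SECONDS is an assumed knob; 60 is an arbitrary default.
PACE_SECONDS="${PACE_SECONDS:-60}"

dispatch_paced() {
  local first=1 cmd
  for cmd in "$@"; do
    # Sleep before every dispatch except the first.
    if [ "$first" -eq 0 ]; then
      sleep "$PACE_SECONDS"
    fi
    first=0
    echo "dispatching: $cmd"
    # Each argument is one full command line, so run it through bash -c.
    # Stop at the first failure instead of dispatching the rest.
    bash -c "$cmd" || { echo "dispatch failed: $cmd" >&2; return 1; }
  done
}
```

With the function sourced, a call such as `dispatch_paced "bash training/hf_job_profile.sh" "bash training/hf_job_train.sh"` would run the two wrappers in order with a pause in between. Stopping on the first failure is a design choice here, so that a quota or auth error is not followed by further dispatches.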
+ **Common gotchas before you run anything:**
+ - **Llama-3.2 is gated.** Accept the license at <https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct> first; otherwise the job fails at model download. (Approval is usually instant.)
+ - **HF Jobs quota.** Free accounts get a small monthly GPU budget; a hero training run burns several hours of it. The script returns a quota error within 30 seconds if you're out, and nothing has actually started.
+ - **`whoami` rate-limit.** Dispatching 4–5 jobs in rapid succession will lock you out for 5–25 min. Pace dispatches; don't poll-loop.
+ - **Default `REPO_URL`** clones `https://huggingface.co/spaces/rishabh16196/prompt_golf_env`. If you've forked the env, override: `REPO_URL=https://huggingface.co/spaces/your-name/your-fork bash ...`.
+ - **Default `TARGET_MODEL`** is `meta-llama/Llama-3.2-3B-Instruct`. Override per call: `TARGET_MODEL=microsoft/Phi-3-mini-4k-instruct bash training/hf_job_train.sh`.
+
+ That's the whole "user wants to reproduce this" path. The §1–§8 sections below go deeper if you want to understand individual steps, customize hyperparameters, or build the demo CSV / plots / Trackio replay locally after the jobs finish.
+
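The `REPO_URL=... bash ...` and `TARGET_MODEL=... bash ...` overrides rely on a standard shell feature: a `VAR=value` prefix sets that variable only for the one command that follows. The diff does not show how the wrappers read these variables, so the sketch below is an assumption about the usual pattern (`${VAR:-default}` expansion), with `resolve_settings` as a hypothetical function name. Only the two default values are taken from the text above.

```shell
#!/usr/bin/env bash
# Assumed pattern, not copied from the repo's scripts: take each setting from
# the environment if the caller set it, otherwise fall back to the default.
resolve_settings() {
  REPO_URL="${REPO_URL:-https://huggingface.co/spaces/rishabh16196/prompt_golf_env}"
  TARGET_MODEL="${TARGET_MODEL:-meta-llama/Llama-3.2-3B-Instruct}"
  # Print the resolved values so the effect of an override is visible.
  echo "REPO_URL=$REPO_URL"
  echo "TARGET_MODEL=$TARGET_MODEL"
}
```

Calling `resolve_settings` with nothing set prints the two defaults, while `TARGET_MODEL=microsoft/Phi-3-mini-4k-instruct resolve_settings` swaps only the model. A per-call prefix like this is a lighter touch than `export`, which would affect every later dispatch in the same shell session.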
  ---
 
  ## 0. Prerequisites