Spaces: ronitraj/QuantumScribe

ronitraj committed 12 days ago

Commit 2d520b3 (verified) · 1 parent: 0139454

Upload notebooks/colab_train.ipynb with huggingface_hub

Files changed (1)

notebooks/colab_train.ipynb +284 -0

notebooks/colab_train.ipynb ADDED

	@@ -0,0 +1,284 @@

+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Qubit-Medic - end-to-end Colab notebook\n",
+    "\n",
+    "Runs SFT warm-up + GRPO RL on a single Colab T4. Total wall-clock: ~24 hours\n",
+    "(SFT ~30 min, GRPO ~22 hours, eval ~30 min). The notebook is structured so\n",
+    "every cell is idempotent and re-runnable.\n",
+    "\n",
+    "**W&B integration is on by default.** Every stage (format-test, SFT, GRPO,\n",
+    "eval) logs to the same W&B project (`qubit-medic`) and shares a `--wandb-group`\n",
+    "so the runs appear together in the dashboard. Set `WANDB_DISABLED=1` if you\n",
+    "want to skip W&B entirely."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 1. Clone the repo and install"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%cd /content\n",
+    "!git clone https://github.com/qubit-medic/qubit-medic.git || (cd qubit-medic && git pull)\n",
+    "%cd qubit-medic\n",
+    "!pip install -q -r requirements.txt\n",
+    "!pip install -q -r requirements-train.txt\n",
+    "!pip install -q --no-deps unsloth"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 2. Configure W&B\n",
+    "\n",
+    "Paste your API key from <https://wandb.ai/authorize>. The `EXPERIMENT_GROUP`\n",
+    "below is what bundles the format-test, SFT, GRPO, and eval runs together\n",
+    "on the dashboard. It is derived from the current UTC time, so re-running\n",
+    "this cell starts a new group."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import os, datetime\n",
+    "EXPERIMENT_GROUP = f\"colab-{datetime.datetime.now(datetime.timezone.utc).strftime('%Y%m%d-%H%M')}\"\n",
+    "os.environ['WANDB_PROJECT'] = 'qubit-medic'\n",
+    "# os.environ['WANDB_ENTITY'] = 'your-team'        # uncomment if you use a team\n",
+    "# os.environ['WANDB_DISABLED'] = '1'              # uncomment to skip W&B\n",
+    "print('experiment group:', EXPERIMENT_GROUP)\n",
+    "!wandb login"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 3. Validate the environment\n",
+    "\n",
+    "All five gates must pass before going further. (No W&B logging here - this\n",
+    "is a static check.)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!python -m scripts.validate_env"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 4. Section 1.3 - format-test (existential go/no-go)\n",
+    "\n",
+    "If the parseable rate is below 30%, SFT is mandatory. The result is logged to\n",
+    "W&B under `format_test/*`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!python -m scripts.format_test \\\n",
+    "    --backend unsloth \\\n",
+    "    --model Qwen/Qwen2.5-3B-Instruct \\\n",
+    "    --syndromes 10 --samples-per 3 \\\n",
+    "    --out data/format_test.json \\\n",
+    "    --report-to wandb \\\n",
+    "    --wandb-group {EXPERIMENT_GROUP}"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 5. Generate SFT data (5,000 syndromes, ~5 min)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!python -m scripts.generate_sft_data --n 5000 --out data/sft_dataset.jsonl"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 6. SFT warm-up (~30 min on T4)\n",
+    "\n",
+    "Logs `sft/loss`, `sft/parse_success_rate`, and an `sft/generations` table\n",
+    "every 100 steps. Uploads the LoRA adapter directory as a W&B artifact at the end."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!python -m scripts.train_sft \\\n",
+    "    --dataset data/sft_dataset.jsonl \\\n",
+    "    --output checkpoints/sft_warmup \\\n",
+    "    --report-to wandb \\\n",
+    "    --wandb-group {EXPERIMENT_GROUP} \\\n",
+    "    --wandb-run-name sft-warmup-{EXPERIMENT_GROUP} \\\n",
+    "    --wandb-notes 'SFT warm-up on PyMatching-derived syndromes' \\\n",
+    "    --sample-every 100 --sample-count 4"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 7. SFT validation gate (Section 6.2)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!python -m scripts.eval \\\n",
+    "    --adapter checkpoints/sft_warmup \\\n",
+    "    --episodes 100 \\\n",
+    "    --out data/sft_eval.json \\\n",
+    "    --report-to wandb \\\n",
+    "    --wandb-group {EXPERIMENT_GROUP} \\\n",
+    "    --wandb-run-name eval-sft-{EXPERIMENT_GROUP}"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 8. GRPO RL training (~22 hours on T4)\n",
+    "\n",
+    "Logs `rl/reward/<component>_mean|std|min|max` for each of the five reward\n",
+    "components, `rl/parse/*`, `rl/curriculum/*`, plus a generation table every\n",
+    "50 steps and an in-loop greedy eval every 200 steps. Uploads the trained\n",
+    "adapter as a W&B artifact at the end.\n",
+    "\n",
+    "Adjust `--steps` if your time budget is tighter (~250 steps/hour on a T4)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!python -m scripts.train_grpo \\\n",
+    "    --sft-checkpoint checkpoints/sft_warmup \\\n",
+    "    --output checkpoints/grpo \\\n",
+    "    --steps 2000 \\\n",
+    "    --report-to wandb \\\n",
+    "    --wandb-group {EXPERIMENT_GROUP} \\\n",
+    "    --wandb-run-name grpo-{EXPERIMENT_GROUP} \\\n",
+    "    --wandb-notes 'GRPO with 5 verifiable rewards' \\\n",
+    "    --sample-every 50 --sample-n 8 \\\n",
+    "    --inloop-eval-every 200 --inloop-eval-episodes 50"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 9. Final evaluation + headline plots"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!python -m scripts.eval \\\n",
+    "    --adapter checkpoints/grpo --episodes 500 \\\n",
+    "    --out data/grpo_eval.json \\\n",
+    "    --report-to wandb \\\n",
+    "    --wandb-group {EXPERIMENT_GROUP} \\\n",
+    "    --wandb-run-name eval-grpo-{EXPERIMENT_GROUP}\n",
+    "\n",
+    "!python -m scripts.baseline_policies --episodes 500 --out data/baseline_results.json\n",
+    "!python -m scripts.plot_results --baselines data/baseline_results.json --out-dir figures\n",
+    "!python -m scripts.animate_grid --frames 50"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 10. Optional: Willow real-chip cross-validation (Section 8)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Manually download from https://zenodo.org/record/13359217 and place at data/willow_d3.dem\n",
+    "!python -m scripts.willow_validation --dem data/willow_d3.dem --episodes 1000"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 11. Push to Hugging Face Spaces\n",
+    "\n",
+    "After successful training, push the env + adapters to a Space."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from huggingface_hub import HfApi, login\n",
+    "login()  # paste your HF token\n",
+    "api = HfApi()\n",
+    "# Replace with your Space repo id.\n",
+    "api.upload_folder(folder_path='.', repo_id='your-team/qubit-medic', repo_type='space')"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "name": "python",
+   "version": "3.11"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
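
Note on Section 2 of the notebook: the cell builds `EXPERIMENT_GROUP` from the clock each time it runs, so re-running it part-way through the pipeline would send later stages to a different W&B group. Below is a minimal sketch of one way to keep the group stable within a kernel session. The helper name `experiment_group` and the environment variable `QUBIT_MEDIC_GROUP` are hypothetical and not part of the repo; the value only survives until the runtime restarts.

```python
import datetime
import os


def experiment_group(env=os.environ, key="QUBIT_MEDIC_GROUP"):
    """Return the group stored under `key`, creating a timestamped one on first use.

    `key` is a hypothetical variable name chosen for this sketch.
    """
    group = env.get(key)
    if not group:
        # Same naming scheme as the notebook cell: colab-YYYYMMDD-HHMM in UTC.
        now = datetime.datetime.now(datetime.timezone.utc)
        group = f"colab-{now.strftime('%Y%m%d-%H%M')}"
        env[key] = group  # remember it so later re-runs reuse the same group
    return group


EXPERIMENT_GROUP = experiment_group()
print("experiment group:", EXPERIMENT_GROUP)
```

To start a fresh experiment, clear the variable first with `os.environ.pop("QUBIT_MEDIC_GROUP", None)` and run the cell again.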