Spaces: Imsachin010/salespath-env

Imsachin010 committed 12 days ago

Commit b8ede5e (1 parent: dd9667a)

Update blog with 0.5B results and project metrics

Files changed (1)

training/traingrpo.ipynb +0 -102

@@ -19,26 +19,6 @@
         "7. Run **Cell 5** (reward graph)"
       ]
     },
-    {
-      "cell_type": "markdown",
-      "id": "f9d908a8",
-      "metadata": {},
-      "source": [
-        "# SalesPath — Colab Training Notebook (7B Scale-Up)\n",
-        "\n",
-        "**Stack:** OpenEnv + GRPO (TRL) + Unsloth + Qwen 2.5 7B\n",
-        "\n",
-        "**Instructions:**\n",
-        "1. Runtime → Change runtime type → **T4 GPU**\n",
-        "2. Add `HF_TOKEN` in Colab Secrets (left sidebar 🔑)\n",
-        "3. Run **Cell 1** once (installs + clones)\n",
-        "4. Run **Cell 2** (starts server + validates)\n",
-        "5. Skip Cell 3 & 4 (already validated with 0.5B)\n",
-        "6. Run **Cell 5** (GRPO training - 150 steps health check)\n",
-        "7. Run **Cell 6** (reward graph)\n",
-        "8. Run **Cell 7** (Push to HF)"
-      ]
-    },
     {
       "cell_type": "code",
       "execution_count": 1,
@@ -420,88 +400,6 @@
         "    --output-dir /content/salespath_out \\\n",
         "    --logging-steps 10"
       ]
-    },
-    {
-      "cell_type": "code",
-      "execution_count": null,
-      "id": "13db57ec",
-      "metadata": {},
-      "outputs": [],
-      "source": [
-        "# ============================================================\n",
-        "# CELL 5 — GRPO Training (gradient updates via TRL)\n",
-        "# ============================================================\n",
-        "# import os\n",
-        "# os.chdir(\"/content/salespath_env\")\n",
-        "\n",
-        "# grpo_cmd = (\n",
-        "#     \"python -m training.grpo_train \"\n",
-        "#     \"--mode grpo \"\n",
-        "#     \"--model-name Qwen/Qwen2.5-0.5B-Instruct \"\n",
-        "#     \"--grpo-steps 100 \"\n",
-        "#     \"--grpo-dataset-size 256 \"\n",
-        "#     \"--num-generations 4 \"\n",
-        "#     \"--max-completion-length 64 \"\n",
-        "#     \"--output-dir /content/salespath_out \"\n",
-        "#     \"--logging-steps 5\"\n",
-        "# )\n",
-        "# !{grpo_cmd}\n",
-        "\n"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "id": "e33864a1",
-      "metadata": {},
-      "source": [
-        "## Final Push to HuggingFace\n",
-        "Run this after you have confirmed the 150 (or 300+) steps look good."
-      ]
-    },
-    {
-      "cell_type": "code",
-      "execution_count": null,
-      "id": "b2a78334",
-      "metadata": {},
-      "outputs": [
-        {
-          "ename": "SyntaxError",
-          "evalue": "incomplete input (1350301498.py, line 27)",
-          "output_type": "error",
-          "traceback": [
-            "\u001b[0;36m  File \u001b[0;32m\"/tmp/ipykernel_17054/1350301498.py\"\u001b[0;36m, line \u001b[0;32m27\u001b[0m\n\u001b[0;31m    \"\"\"\u001b[0m\n\u001b[0m    ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m incomplete input\n"
-          ]
-        }
-      ],
-      "source": [
-        "# ============================================================\n",
-        "# CELL 7 — Push Merged Model to HuggingFace\n",
-        "# ============================================================\n",
-        "import os\n",
-        "os.chdir(\"/content/salespath_env\")\n",
-        "\n",
-        "# We load the final checkpoint and push it.\n",
-        "hf_token = os.environ.get(\"HF_TOKEN\")\n",
-        "if not hf_token:\n",
-        "    print(\"⚠️ HF_TOKEN not found in secrets. Cannot push.\")\n",
-        "else:\n",
-        "    !python -c \"\"\"\n",
-        "import os\n",
-        "from unsloth import FastLanguageModel\n",
-        "model, tokenizer = FastLanguageModel.from_pretrained(\n",
-        "    '/content/salespath_out/grpo_final',\n",
-        "    max_seq_length=2048,\n",
-        "    load_in_4bit=True,\n",
-        ")\n",
-        "model.push_to_hub_merged(\n",
-        "    'Imsachin010/salespath-qwen25-7b',\n",
-        "    tokenizer,\n",
-        "    save_method='merged_16bit',\n",
-        "    token=os.environ.get('HF_TOKEN')\n",
-        ")\n",
-        "print('✅ Successfully pushed to HF!')\n",
-        "\"\"\""
-      ]
     }
   ],
   "metadata": {
