Spaces: Pratap-K/SmartPayEnv

Pratap-K committed 13 days ago

Commit 2a81a82 (parent: 5b9b298)

Update training

Files changed (1)

notebooks/train_smartpayenev.ipynb CHANGED

@@ -5,13 +5,13 @@
       "id": "1035bc6e",
       "metadata": {},
       "source": [
-        "# SmartPayEnv Theme-4 Judge Repro — Co-Evolving Defender vs Fraud (GRPO + Unsloth + TRL)\n",
         "\n",
-        "Self-contained Colab notebook. **No imports from this repo.** Uses only the deployed\n",
         "Hugging Face Space's HTTP endpoints: `/health`, `/reset`, `/step`,\n",
         "`/reset_seeded`, `/configure_adversary`.\n",
         "\n",
-        "### What's new (vs. a vanilla GRPO loop)\n",
         "\n",
         "This notebook implements **true co-evolution** between two learning agents:\n",
         "\n",

       "id": "1035bc6e",
       "metadata": {},
       "source": [
+        "# SmartPayEnv\n",
+        "\n",
         "\n",
         "Hugging Face Space's HTTP endpoints: `/health`, `/reset`, `/step`,\n",
         "`/reset_seeded`, `/configure_adversary`.\n",
         "\n",
+        "### What's implemented\n",
         "\n",
         "This notebook implements **true co-evolution** between two learning agents:\n",
         "\n",