Pratap-K committed
Commit 2a81a82 · 1 Parent(s): 5b9b298

Update training

Files changed (1)
  1. notebooks/train_smartpayenev.ipynb +3 -3
notebooks/train_smartpayenev.ipynb CHANGED
@@ -5,13 +5,13 @@
   "id": "1035bc6e",
   "metadata": {},
   "source": [
-  "# SmartPayEnv Theme-4 Judge Repro — Co-Evolving Defender vs Fraud (GRPO + Unsloth + TRL)\n",
+  "# SmartPayEnv\n",
+  "\n",
   "\n",
-  "Self-contained Colab notebook. **No imports from this repo.** Uses only the deployed\n",
   "Hugging Face Space's HTTP endpoints: `/health`, `/reset`, `/step`,\n",
   "`/reset_seeded`, `/configure_adversary`.\n",
   "\n",
-  "### What's new (vs. a vanilla GRPO loop)\n",
+  "### What's implemented\n",
   "\n",
   "This notebook implements **true co-evolution** between two learning agents:\n",
   "\n",