Update Colab badge to point at this Space
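The new badge target in this commit looks like the Space-hosted notebook URL percent-encoded into Colab's `#fileId=` fragment. Below is a minimal sketch of how such a link could be assembled, assuming plain URL quoting is all that is involved. The helper name `colab_badge_target` is hypothetical, and the diff does not confirm how the author generated the link or whether Colab resolves it for a Space-hosted file.

```python
from urllib.parse import quote

COLAB_BASE = "https://colab.research.google.com/"


def colab_badge_target(notebook_url: str) -> str:
    """Hypothetical helper: place a notebook URL in Colab's #fileId= fragment."""
    # quote() leaves "/" untouched by default and percent-encodes ":" as %3A,
    # which reproduces the "https%3A//..." form seen in the updated badge line.
    return f"{COLAB_BASE}#fileId={quote(notebook_url)}"


space_notebook = (
    "https://huggingface.co/spaces/anugrah55/opensleuth-colab"
    "/blob/main/train_opensleuth_grpo.ipynb"
)
print(colab_badge_target(space_notebook))
```

The printed string should match the link on the added badge line of the diff, which is a quick way to check a badge after moving a notebook between repositories.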
train_opensleuth_grpo.ipynb +14 -14
train_opensleuth_grpo.ipynb
CHANGED
@@ -2,12 +2,12 @@
 "cells": [
 {
 "cell_type": "markdown",
-"id": "
+"id": "52f6a469",
 "metadata": {},
 "source": [
 "# OpenSleuth — GRPO training on a free-tier Colab T4\n",
 "\n",
-"[](https://colab.research.google.com/
+"[](https://colab.research.google.com/#fileId=https%3A//huggingface.co/spaces/anugrah55/opensleuth-colab/blob/main/train_opensleuth_grpo.ipynb)\n",
 "\n",
 "**OpenSleuth** is an *Algorithmic Detective* RL environment. An LLM agent reverse-engineers an unknown black-box Python function by **probing** it with inputs and then **submitting** a Python replica. The environment scores submissions by domain-aware fuzz-testing against the hidden reference, with a complexity penalty so the agent can't just memorise its probes inside a giant `if/else`.\n",
 "\n",
@@ -36,7 +36,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "
+"id": "765d1f38",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -57,7 +57,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "
+"id": "2bfa6d1e",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -70,7 +70,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "
+"id": "fb82de78",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -148,7 +148,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "
+"id": "73e199af",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -260,7 +260,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "
+"id": "953947fc",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -537,7 +537,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "
+"id": "3236b1da",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -574,7 +574,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "
+"id": "ccdba521",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -617,7 +617,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "
+"id": "fffee452",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -687,7 +687,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "
+"id": "871b7fc9",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -702,7 +702,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "
+"id": "11008e19",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -724,7 +724,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "
+"id": "3c3b99bb",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -775,7 +775,7 @@
 },
 {
 "cell_type": "markdown",
-"id": "
+"id": "cdbdfdd7",
 "metadata": {},
 "source": [
 "## Next steps\n",
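The notebook's intro cell describes scoring a submitted replica by fuzz-testing it against the hidden reference and subtracting a complexity penalty. The sketch below illustrates that idea only and is not OpenSleuth's scoring code: every name (`fuzz_agreement`, `complexity_penalty`, `score`, `replica`), the uniform integer sampler standing in for "domain-aware" input generation, and the AST-node-count penalty with its constants are assumptions made for illustration.

```python
import ast
import random


def fuzz_agreement(reference, candidate, trials=200, seed=0):
    """Fraction of random integer inputs on which candidate matches reference."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x = rng.randint(-100, 100)  # assumption: integer domain, uniform sampling
        try:
            if candidate(x) == reference(x):
                hits += 1
        except Exception:
            pass  # a crashing candidate simply earns no credit for this input
    return hits / trials


def complexity_penalty(source, free_nodes=40, weight=0.005):
    """Penalty that grows with the number of AST nodes beyond a free budget."""
    n_nodes = sum(1 for _ in ast.walk(ast.parse(source)))
    return weight * max(0, n_nodes - free_nodes)


def score(reference, candidate_source, func_name="replica"):
    """Agreement minus complexity penalty, clamped at zero."""
    namespace = {}
    exec(candidate_source, namespace)  # illustration only: no sandboxing here
    agreement = fuzz_agreement(reference, namespace[func_name])
    return max(0.0, agreement - complexity_penalty(candidate_source))


def hidden(x):
    # Stand-in for the black-box function the agent is probing.
    return x * x + 1


compact = "def replica(x):\n    return x * x + 1\n"

# A replica that memorises every probe in one long if-chain.
memorised = (
    "def replica(x):\n"
    + "".join(f"    if x == {i}: return {i * i + 1}\n" for i in range(-100, 101))
    + "    return 0\n"
)

print(score(hidden, compact), score(hidden, memorised))
```

Both replicas agree with `hidden` on every sampled input, yet the memorised if-chain is far larger than the free node budget, so its penalty wipes out the agreement score while the compact replica keeps full marks.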