| """Tiny Gradio landing page for the OpenSleuth Colab notebook Space. | |
| The actual training happens in the notebook (`train_opensleuth_grpo.ipynb` in | |
| this same repo, downloadable from the Files tab). This app just renders a | |
| clickable Open-In-Colab card so visitors can launch it in one click. | |
| """ | |
| from __future__ import annotations | |
| import gradio as gr | |
| NOTEBOOK_PATH = "train_opensleuth_grpo.ipynb" | |
| SPACE_ID = "anugrah55/opensleuth-colab" | |
| COLAB_URL = ( | |
|     "https://colab.research.google.com/#fileId=" | |
|     f"https%3A//huggingface.co/spaces/{SPACE_ID}/blob/main/{NOTEBOOK_PATH}" | |
| ) | |
| LANDING_MD = f""" | |
| # OpenSleuth: Colab quickstart | |
| [Open in Colab]({COLAB_URL}) | |
| OpenSleuth is an *Algorithmic Detective* RL environment. An LLM agent reverse-engineers an unknown black-box Python function by probing it and then submitting a Python replica. The env fuzz-tests the submission against the hidden reference (with a complexity penalty) and returns a scalar reward. | |
| This Space hosts the **minimum reproducible Colab notebook** for training an | |
| agent against the live env Space using **HF TRL's `GRPOTrainer`** + **bnb-4bit** | |
| + **LoRA** on a free-tier Colab T4. End-to-end runtime: ~15-25 minutes. | |
| ### One-click training | |
| 1. Click the **Open in Colab** badge above (or grab `{NOTEBOOK_PATH}` from the **Files** tab and upload it to Colab manually). | |
| 2. In Colab: `Runtime → Change runtime type → GPU → T4`. | |
| 3. `Runtime → Run all`. | |
| ### Defaults | |
| | Knob | Value | | |
| |------|-------| | |
| | Model | `Qwen/Qwen2.5-0.5B-Instruct` | | |
| | Quant | bnb-4bit (nf4 + double-quant) | | |
| | LoRA | r=16, alpha=32, q/k/v/o | | |
| | Tasks | all 15 from `anugrah55/opensleuth-tasks` | | |
| | GRPO `num_generations` | 4 | | |
| | Epochs | 1 | | |
| ### Links | |
| - **Env Space (REST API the notebook calls):** https://huggingface.co/spaces/anugrah55/opensleuth-env-gemini-cli | |
| - **Training Space (full 3B retrain):** https://huggingface.co/spaces/anugrah55/opensleuth-training-gemini-cli | |
| - **Open-ended task catalog:** https://huggingface.co/datasets/anugrah55/opensleuth-tasks | |
| """ | |
| def _open_colab() -> str: | |
|     return f"Opening Colab: {COLAB_URL}" | |
| with gr.Blocks(title="OpenSleuth: Colab quickstart") as demo: | |
|     gr.Markdown(LANDING_MD) | |
|     with gr.Row(): | |
|         gr.Button( | |
|             value="Open in Google Colab", | |
|             link=COLAB_URL, | |
|             variant="primary", | |
|         ) | |
|         gr.Button( | |
|             value="View notebook in Files tab", | |
|             link=f"https://huggingface.co/spaces/{SPACE_ID}/blob/main/{NOTEBOOK_PATH}", | |
|         ) | |
| if __name__ == "__main__": | |
|     demo.launch() | |
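
The `COLAB_URL` constant above is assembled by hand, with only the colon after `https` percent-encoded. If the Space ID or notebook path ever changes, a small helper can rebuild the link in the same shape. The sketch below is not part of the app: `build_colab_url` and `COLAB_BASE` are hypothetical names introduced for illustration, and it assumes Colab keeps accepting the `#fileId=` form used in this file.

```python
from urllib.parse import quote

# Hypothetical constant: the Colab prefix used by COLAB_URL in the app above.
COLAB_BASE = "https://colab.research.google.com/#fileId="


def build_colab_url(space_id: str, notebook_path: str) -> str:
    """Return an Open-in-Colab link for a notebook stored in a Space repo.

    Mirrors the hand-written COLAB_URL: the Hub URL is appended after
    "#fileId=" with the scheme colon encoded as %3A and slashes left as-is.
    """
    # Percent-encode each path segment separately so "/" stays a separator
    # while spaces and other unsafe characters in file names are escaped.
    encoded_path = "/".join(quote(part, safe="") for part in notebook_path.split("/"))
    hub_url = f"https://huggingface.co/spaces/{space_id}/blob/main/{encoded_path}"
    # Encode only the first colon (the one after "https"), as the app does.
    return COLAB_BASE + hub_url.replace("https:", "https%3A", 1)


if __name__ == "__main__":
    print(build_colab_url("anugrah55/opensleuth-colab", "train_opensleuth_grpo.ipynb"))
```

With the values from this file, the helper yields the same string as `COLAB_URL`, so it could replace the literal without changing the rendered link.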