feat: standardizing inference output format and adding Setup Guide
- Setup_Guide.md +124 -0
- frontend/script.js +12 -6
- inference.py +3 -4
Setup_Guide.md
ADDED
@@ -0,0 +1,124 @@
# UI LAYOUT OPTIMIZER OPENENV

**Complete Beginner's Setup & Run Guide**
*Meta x PyTorch OpenEnv Hackathon 2026*

Step-by-step from zero to a running RL agent.

---

## What You Will Build

This guide walks you through running the **UI Layout Optimizer Simulator** — a Reinforcement Learning environment built on Meta's OpenEnv framework. By the end, you'll have a real RL agent that manipulates digital checkout components (button sizes, form lengths, wizard steps) to maximize user conversion and minimize cart abandonment, using a HuggingFace LLM-driven agent as a baseline.

## Prerequisites — What You Need Before Starting

Check all of these before going to the next step:

- Python 3.10 or newer (3.11 recommended)
- `pip` (comes with Python)
- `git`
- A terminal / command prompt
- A HuggingFace account (free) for LLM API access
- A code editor (VS Code recommended)

---

## Step 1 — Check Your Python Version

Python 3.10+ is required. Verify first:

```bash
python --version
```

*You should see Python 3.10.x or newer.*

## Step 2 — Download the Project

Clone the repository to your computer:

```bash
git clone https://github.com/Prasannakolapkar/UI-Layout-optimizer.git
cd UI-Layout-optimizer
```

## Step 3 — Create a Virtual Environment

A virtual environment keeps this project's packages separate from everything else on your computer.

**Windows:**
```bash
python -m venv venv
.\venv\Scripts\activate
```

**Mac/Linux:**
```bash
python3 -m venv venv
source venv/bin/activate
```

*You will see `(venv)` at the start of your terminal prompt when it's active.* **Always activate it before running project commands.**

## Step 4 — Install Dependencies

Install all the Python packages the project needs:

```bash
pip install -r requirements.txt
```

*(This installs FastAPI, Pydantic, Uvicorn, httpx, and other core libraries.)*

## Step 5 — Get Your HuggingFace API Token

The baseline agent uses a HuggingFace model to route UI decisions, so you need a free API token to call it.

1. Go to huggingface.co and sign up.
2. Go to **Settings > Access Tokens**.
3. Create a **New Token** with `Read` access.
4. Copy the token (it starts with `hf_`).

## Step 6 — Set Up Your Environment Variables

We use a `.env` file (or exported variables) to store your HuggingFace token securely.

Create a file named `.env` in the project root:

```ini
HF_TOKEN=hf_your_token_here
```

*If you don't use the LLM fallback agent, the purely mathematical `HeuristicAgent` works automatically without a token!*
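For reference, this is roughly what reading that file looks like in Python. The sketch below is illustrative and uses only the standard library; the project itself may instead rely on a helper library such as `python-dotenv`, which is an assumption here, not something the repo documents.

```python
import os
from pathlib import Path


def load_dotenv_minimal(path: str = ".env") -> dict:
    """Parse simple KEY=VALUE lines from a .env file into a dict.

    Minimal stand-in for a library like python-dotenv; the project
    may load the token differently.
    """
    values = {}
    p = Path(path)
    if not p.exists():
        return values
    for line in p.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


# Fall back to a real environment variable if no .env file is present
env = load_dotenv_minimal()
token = env.get("HF_TOKEN") or os.environ.get("HF_TOKEN")
```

Either way, the token ends up available to the process without being committed to git.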
## Step 7 — Understand the Project Structure

Before running anything, it helps to know what each file does:

- `env.py` - Core RL logic: `reset()`, `step()`, simulates user drops, computes rewards.
- `benchmark.py` - Evaluates agents over easy, medium, and hard tasks.
- `server/app.py` - The FastAPI environment server that exposes REST endpoints for agents and the UI.
- `frontend/` - Contains the HTML/JS web interface for real-time visualization.
- `baseline.py` & `heuristic_agent.py` - Your RL agent implementations.
- `openenv.yaml` - OpenEnv specification declaration for HuggingFace deployment.

## Step 8 — Run the Grader (Quickest Test)

The benchmark script is the fastest way to verify everything works. It calculates the leaderboard score across all difficulties:

```bash
python benchmark.py
```

**What to expect:** You'll see episodes running and the agent's performance being evaluated (score, completion rate, drop rate). The built-in `HeuristicAgent` keeps the drop rate low by making sensible, user-friendly adjustments.

## Step 9 — Start the FastAPI Local Server

The server exposes the environment locally so the visualizer and external agents can interact with it.

```bash
uvicorn server.app:app --reload
```

**Verify it's running:** Open a browser and go to `http://127.0.0.1:8000/` (Uvicorn's default port; a deployed Space serves on `7860`). You will see the interactive UI simulator.
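You can also drive the server from Python instead of the web UI. The `/run_episode` path and `{ "agent": ... }` payload below are taken from what `frontend/script.js` sends; the base URL, the agent name, and the response handling are assumptions to adjust for your setup, so treat this as a sketch rather than repo code:

```python
BASE_URL = "http://127.0.0.1:8000"  # assumption: local Uvicorn on the default port


def build_request(agent: str) -> dict:
    # Payload shape mirrors the frontend's api("/run_episode", "POST", { agent })
    return {"agent": agent}


def run_episode(agent: str = "heuristic") -> dict:
    """POST one full episode to the server and return the response JSON."""
    import httpx  # installed via requirements.txt; imported lazily here

    resp = httpx.post(f"{BASE_URL}/run_episode", json=build_request(agent), timeout=60.0)
    resp.raise_for_status()
    data = resp.json()
    # The web UI reads these same fields when it logs the episode result
    print(f"outcome={data['final_outcome']} steps={data['total_steps']}")
    return data


if __name__ == "__main__":
    run_episode()
```

Run it in a second terminal while the Uvicorn server is up.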
## Step 10 — Connect an Agent (Client Usage)

Now verify that the LLM baseline agent drives the environment correctly.

*(Open a second terminal and activate the `venv` first!)*

```bash
python baseline.py
```

You'll see step-by-step UI adjustments printed in the console as the agent reduces form complexity and nudges button sizes toward the ideal values.

## Step 11 — Understanding the Reward Function

This is the heart of the RL environment: the reward models how users react to the UI layout.

- **Completion Reward**: Big positive impact for moving the progress bar.
- **Drop Penalty (-1.0)**: Catastrophic penalty if the user abandons the cart out of frustration.
- **Distrust Penalty (-0.2)**: Small penalty if buttons look glitchy or fields are invasive.
- **Ideal States**: Optimal form length is ~3 fields, optimal steps ~2, optimal button size ~1.1x.
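As a rough mental model, the bullet points above can be sketched as a shaping function. The penalty constants and ideal values come from the list; the completion bonus (0.5) and the distance weights are invented for illustration, and the real formula in `env.py` will differ in its details:

```python
# Ideal layout values and penalties as listed above
IDEAL_FORM_LENGTH = 3
IDEAL_WIZARD_STEPS = 2
IDEAL_BUTTON_SCALE = 1.1

DROP_PENALTY = -1.0
DISTRUST_PENALTY = -0.2


def sketch_reward(form_length: int, wizard_steps: int, button_scale: float,
                  progressed: bool, dropped: bool, distrusted: bool) -> float:
    """Illustrative reward shape; not the actual env.py implementation."""
    if dropped:
        return DROP_PENALTY          # episode-ending abandonment dominates everything
    reward = 0.0
    if progressed:
        reward += 0.5                # hypothetical completion bonus (value is a guess)
    if distrusted:
        reward += DISTRUST_PENALTY   # glitchy buttons / invasive fields
    # Shaping term: penalize distance from the ideal layout (weights are guesses)
    reward -= 0.05 * abs(form_length - IDEAL_FORM_LENGTH)
    reward -= 0.05 * abs(wizard_steps - IDEAL_WIZARD_STEPS)
    reward -= 0.1 * abs(button_scale - IDEAL_BUTTON_SCALE)
    return reward
```

The key property to notice: a drop outweighs any single completion gain, so a good agent avoids risky, frustrating layouts rather than chasing short-term progress.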
## Step 12 — Common Errors and Fixes

- `KeyError: 'grader'`: Ensure your `openenv.yaml` contains `grader: "env:UIEnv.grade_easy"` for each task. (Already patched!)
- `TypeError: Cannot read properties of undefined (reading 'toFixed')`: Make sure you have the latest `frontend/script.js` with the safe `fmt()` helper.
- `ModuleNotFoundError: No module named 'openenv'`: Ensure your `venv` is active and the requirements are installed.
- `Connection refused`: Make sure the Uvicorn server is running in another terminal.

## Step 13 — Deploy to HuggingFace Spaces (Optional)

Your code is fully OpenEnv compliant and Docker-ready!

1. Create a new Space on HuggingFace.
2. Choose the **Docker** environment.
3. In the Space settings, add `HF_TOKEN` to your Secrets.
4. The deployment will automatically host the FastAPI instance and validation system globally so judges can score your submission.

frontend/script.js
CHANGED
```diff
@@ -280,7 +280,7 @@ async function resetEnv() {
     dom.metricOutcome.textContent = "--";
     dom.metricOutcome.className = "text-lg font-bold text-dark-400";

-    addLog(
+    addLog(`[START] task=default env=ui_layout_optimizer model=${dom.agentSelect.value}`, "system");
   } catch (err) {
     addLog("Error: " + err.message, "negative");
   }
@@ -319,8 +319,9 @@ async function stepAgent() {
     state.done = s.done;

     updateUI(s.observation, s.reward, s.info);
+    const errorStr = s.info.error ? s.info.error : "null";
     addLog(
-      `
+      `[STEP] step=${s.info.step_count} action=${s.action} reward=${fmt(s.reward, 2)} done=${s.done} error=${errorStr}`,
       s.reward >= 0 ? "reward" : "negative"
     );

@@ -330,7 +331,9 @@ async function stepAgent() {
     const outcome = s.info.outcome;
     setEpisodeStatus(outcome === "complete" ? "DONE" : "DROPPED", outcome);
     setControlsEnabled(false);
-
+    const success = outcome === "complete" ? "true" : "false";
+    const rewardsStr = state._cachedSteps.slice(0, state._cacheIdx).map(st => fmt(st.reward, 2)).join(",");
+    addLog(`[END] success=${success} steps=${s.info.step_count} rewards=${rewardsStr}`, "outcome");
     state._cachedSteps = null;
   }
 }
@@ -352,7 +355,7 @@ async function runEpisode() {
   dom.btnRun.textContent = "Running...";
   setControlsEnabled(false);

-  addLog(`
+  addLog(`[START] task=default env=ui_layout_optimizer model=${agent}`, "system");

   try {
     const data = await api("/run_episode", "POST", { agent });
@@ -368,8 +371,9 @@ async function runEpisode() {
       updateUI(s.observation, s.reward, s.info);

       const actionLabel = s.action + (s.action_value !== null ? `(${s.action_value})` : "");
+      const errorStr = s.info.error ? s.info.error : "null";
       addLog(
-        `
+        `[STEP] step=${s.info.step_count} action=${actionLabel} reward=${fmt(s.reward, 2)} done=${s.done} error=${errorStr}`,
        s.reward >= 0 ? "reward" : "negative"
       );

@@ -379,8 +383,10 @@ async function runEpisode() {

       const outcome = data.final_outcome;
       setEpisodeStatus(outcome === "complete" ? "DONE" : "DROPPED", outcome);
+      const success = outcome === "complete" ? "true" : "false";
+      const rewardsStr = data.steps.map(st => fmt(st.reward, 2)).join(",");
       addLog(
-        `
+        `[END] success=${success} steps=${data.total_steps} rewards=${rewardsStr}`,
         "outcome"
       );

```
inference.py
CHANGED
```diff
@@ -23,10 +23,10 @@ def log_step(step: int, action: str, reward: float, done: bool, error: Optional[
     flush=True,
 )

-def log_end(success: bool, steps: int,
+def log_end(success: bool, steps: int, rewards: List[float]) -> None:
     rewards_str = ",".join(f"{r:.2f}" for r in rewards)
     success_val = str(success).lower()
-    print(f"[END] success={success_val} steps={steps}
+    print(f"[END] success={success_val} steps={steps} rewards={rewards_str}", flush=True)

 def run_inference(task_id: str = "easy") -> None:
     """
@@ -89,8 +89,7 @@ def run_inference(task_id: str = "easy") -> None:

     # Enforce strict (0,1) bound
     score = clamp_score(score)
-
-    log_end(success=completed, steps=step_count, score=score, rewards=rewards)
+    log_end(success=completed, steps=step_count, rewards=rewards)

 if __name__ == "__main__":
     parser = argparse.ArgumentParser(description="Run UIEnv Inference")
```
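Because the standardized `[END]` line is a stable single-line format (`[END] success=... steps=... rewards=...`, as printed by `log_end`), a grader or script can recover the run summary with a few lines of parsing. A hypothetical example, not part of the repo:

```python
import re

# Matches the standardized end-of-run line, e.g.
# "[END] success=true steps=5 rewards=0.50,-1.00"
END_RE = re.compile(
    r"\[END\] success=(?P<success>true|false) steps=(?P<steps>\d+) rewards=(?P<rewards>[-0-9.,]*)"
)


def parse_end_line(line: str) -> dict:
    """Turn one [END] log line back into structured data."""
    m = END_RE.search(line)
    if m is None:
        raise ValueError(f"not an [END] line: {line!r}")
    rewards = [float(r) for r in m.group("rewards").split(",") if r]
    return {
        "success": m.group("success") == "true",
        "steps": int(m.group("steps")),
        "rewards": rewards,
    }
```

This is the practical payoff of standardizing the output format: both `inference.py` and the frontend log the same shape, so one parser handles both.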