Kolaps27 committed on
Commit 3ccdc7d · 1 Parent(s): c471af0

feat: standardizing inference output format and adding Setup Guide

Files changed (3)
  1. Setup_Guide.md +124 -0
  2. frontend/script.js +12 -6
  3. inference.py +3 -4
Setup_Guide.md ADDED
@@ -0,0 +1,124 @@
# UI LAYOUT OPTIMIZER OPENENV
**Complete Beginner's Setup & Run Guide**
*Meta x PyTorch OpenEnv Hackathon 2026*
Step-by-step from zero to running RL agent

---

## What You Will Build
This guide walks you through running the **UI Layout Optimizer Simulator** — a Reinforcement Learning environment built on Meta's OpenEnv framework. By the end, you'll have a real RL agent that manipulates digital checkout components (button sizes, form lengths, wizard steps) to maximize user conversion and minimize cart abandonment, using a HuggingFace LLM-driven agent as a baseline.

## Prerequisites — What You Need Before Starting
Check all of these before moving on to the next step:
- Python 3.10 or newer (3.11 recommended)
- `pip` (comes with Python)
- `git`
- A terminal / command prompt
- A HuggingFace account (free) for LLM API access
- A code editor (VS Code recommended)

---

## Step 1 — Check Your Python Version
Python 3.10+ is required. Verify first:
```bash
python --version
```
*You should see Python 3.10.x or newer.*

## Step 2 — Download the Project
Clone the repository to your computer:
```bash
git clone https://github.com/Prasannakolapkar/UI-Layout-optimizer.git
cd UI-Layout-optimizer
```

## Step 3 — Create a Virtual Environment
A virtual environment keeps this project's packages separate from everything else on your computer.

**Windows:**
```bash
python -m venv venv
.\venv\Scripts\activate
```

**Mac/Linux:**
```bash
python3 -m venv venv
source venv/bin/activate
```
*You will see `(venv)` at the start of your terminal prompt when it's active.* **Always activate before running project commands.**

## Step 4 — Install Dependencies
Install all the Python packages the project needs:
```bash
pip install -r requirements.txt
```
*(This installs FastAPI, Pydantic, Uvicorn, httpx, and other core libraries.)*

## Step 5 — Get Your HuggingFace API Token
Our baseline agent uses a HuggingFace model to route UI decisions. You need a free API token to call it.
1. Go to huggingface.co and sign up.
2. Go to **Settings > Access Tokens**.
3. Create a **New Token** with `Read` access.
4. Copy the token (it starts with `hf_`).

## Step 6 — Set Up Your Environment Variables
We use a `.env` file (or exported variables) to store your HuggingFace token securely.

Create a file named `.env` in the project root:
```ini
HF_TOKEN=hf_your_token_here
```
*If you don't use the LLM fallback agent, the purely mathematical `HeuristicAgent` works automatically without a token!*

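To confirm the token is actually visible to Python, you can use a small check. This is a hypothetical helper (the project may load the token differently); it tries `python-dotenv` if installed and falls back to the plain environment:

```python
# Hypothetical helper: confirm HF_TOKEN is visible to Python.
# Tries python-dotenv if installed; otherwise reads the exported environment.
import os

def get_hf_token():
    """Return the HuggingFace token from .env / the environment, or None."""
    try:
        from dotenv import load_dotenv  # optional dependency
        load_dotenv()  # reads .env into os.environ without overriding existing vars
    except ImportError:
        pass
    return os.environ.get("HF_TOKEN")
```

If `get_hf_token()` returns `None`, the `.env` file is missing or the variable isn't exported. Remember: the `HeuristicAgent` needs no token at all.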
## Step 7 — Understand the Project Structure
Before running anything, it helps to know what each file does:
- `env.py` - Core RL logic: `reset()`, `step()`, simulates user drops, computes rewards.
- `benchmark.py` - Evaluates agents over easy, medium, and hard tasks.
- `server/app.py` - The FastAPI environment server that exposes REST endpoints for agents and the UI.
- `frontend/` - Contains the HTML/JS web interface for real-time visualization.
- `baseline.py` & `heuristic_agent.py` - Your RL agent implementations.
- `openenv.yaml` - OpenEnv specification declaration for HuggingFace deployment.

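To make the `reset()`/`step()` contract concrete, here is a toy stand-in showing the loop an agent runs against `env.py`. The method names come from the file list above; the observation fields, reward values, and episode length are illustrative assumptions only:

```python
# Toy stand-in for env.py's reset()/step() loop. Field names and numbers
# are illustrative assumptions, not the real environment's values.
class ToyUIEnv:
    def reset(self):
        """Start a new episode and return the initial observation."""
        self.steps = 0
        return {"form_length": 5, "button_scale": 1.0}

    def step(self, action):
        """Apply one UI action; return (obs, reward, done, info)."""
        self.steps += 1
        obs = {"form_length": 4, "button_scale": 1.1}
        reward = 0.1            # small shaped reward (assumed)
        done = self.steps >= 3  # toy episodes last 3 steps
        return obs, reward, done, {"step_count": self.steps}

# The standard agent loop:
env = ToyUIEnv()
obs = env.reset()
done = False
while not done:
    obs, reward, done, info = env.step("shorten_form")
```

Every agent in this project, heuristic or LLM-driven, is ultimately a policy plugged into a loop of this shape.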
## Step 8 — Run the Grader (Quickest Test)
The benchmark script is the fastest way to verify everything works. It calculates the leaderboard score across all difficulties:
```bash
python benchmark.py
```
**What to expect:** You'll see episodes running and evaluating the agent's performance (score, completion rate, drop rate). The built-in `HeuristicAgent` minimizes user drop-off by keeping the layout close to its ideal values.

## Step 9 — Start the FastAPI Local Server
The server exposes the environment locally so the visualizer and external agents can interact with it.
```bash
uvicorn server.app:app --reload
```
**Verify it's running:** Open a browser and go to `http://127.0.0.1:7860/` (or `8000`, depending on your port settings). You will see the Interactive UI Simulator!

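You can also hit the server without the browser. `/run_episode` is the endpoint `frontend/script.js` calls; the sketch below drives it with only the standard library. The `heuristic` agent name and the response shape are assumptions here; match whatever your server accepts:

```python
# Sketch: call the local server's /run_episode endpoint (the same one the
# frontend uses). Agent name and response fields are assumptions.
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # or :7860, depending on your port settings

def build_request(agent="heuristic"):
    """Build the POST request the frontend would send."""
    body = json.dumps({"agent": agent}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/run_episode",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def run_episode(agent="heuristic"):
    """POST to /run_episode and return the decoded JSON reply."""
    with urllib.request.urlopen(build_request(agent), timeout=60) as resp:
        return json.loads(resp.read())
```

With the Uvicorn server running, `run_episode()` returns the same episode data the visualizer renders.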
## Step 10 — Connect an Agent (Client Usage)
Now let's verify that the LLM baseline router connects to the environment correctly.
*(Open a second terminal and activate `venv` first!)*
```bash
python baseline.py
```
You'll see step-by-step UI adjustments printed in the console as the agent reduces form complexity and resizes buttons toward the ideal layout.

## Step 11 — Understanding the Reward Function
This is the heart of the RL environment. The UI layout directly targets human psychology.
- **Completion Reward**: Big positive impact for moving the progress bar.
- **Drop Penalty (-1.0)**: Catastrophic penalty if the user abandons the cart due to frustration.
- **Distrust Penalty (-0.2)**: Small penalty if buttons look glitchy or fields are invasive.
- **Ideal States**: Optimal form length is ~3, optimal steps ~2, optimal button size ~1.1x.

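As a rough mental model, the bullet points above can be sketched as a single function. The exact formula lives in `env.py`; only the -1.0 drop and -0.2 distrust penalties come from the guide, while the completion magnitude and shaping coefficient below are invented for illustration:

```python
# Illustrative sketch of the Step 11 reward shaping. Only the -1.0 drop and
# -0.2 distrust penalties come from the guide; other numbers are invented.
IDEAL = {"form_length": 3, "steps": 2, "button_scale": 1.1}

def sketch_reward(outcome, layout, distrust=False):
    if outcome == "drop":
        return -1.0                      # catastrophic: cart abandoned
    reward = 0.0
    if outcome == "progress":
        reward += 0.5                    # progress bar moved (assumed magnitude)
    if distrust:
        reward -= 0.2                    # glitchy buttons / invasive fields
    # shaping term pulling the layout toward the ideal values
    distance = sum(abs(layout[k] - IDEAL[k]) for k in IDEAL if k in layout)
    return reward - 0.05 * distance
```

The key asymmetry to notice: a drop wipes out many steps of progress, so a good policy avoids frustrating changes even at the cost of slower completion.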
## Step 12 — Common Errors and Fixes
- `KeyError: 'grader'`: Ensure your `openenv.yaml` contains `grader: "env:UIEnv.grade_easy"` for each task. (Already patched!)
- `TypeError: Cannot read properties of undefined (reading 'toFixed')`: Make sure you have the latest `frontend/script.js` with the safe `fmt()` helper.
- `ModuleNotFoundError: No module named 'openenv'`: Ensure your `venv` is active and requirements are installed.
- `Connection refused`: Make sure the Uvicorn server is actively running in another tab.

## Step 13 — Deploy to HuggingFace Spaces (Optional)
Your code is fully OpenEnv compliant and Docker-ready!
1. Create a new Space on HuggingFace.
2. Choose the **Docker** environment.
3. In the Space settings, add `HF_TOKEN` to your Secrets.
4. The deployment will automatically host the `FastAPI` instance and validation system globally so judges can score you.
frontend/script.js CHANGED
@@ -280,7 +280,7 @@ async function resetEnv() {
     dom.metricOutcome.textContent = "--";
     dom.metricOutcome.className = "text-lg font-bold text-dark-400";

-    addLog("Environment reset. Episode started.", "system");
+    addLog(`[START] task=default env=ui_layout_optimizer model=${dom.agentSelect.value}`, "system");
   } catch (err) {
     addLog("Error: " + err.message, "negative");
   }
@@ -319,8 +319,9 @@ async function stepAgent() {
     state.done = s.done;

     updateUI(s.observation, s.reward, s.info);
+    const errorStr = s.info.error ? s.info.error : "null";
     addLog(
-      `Step ${s.info.step_count}: ${s.action} -> reward=${s.reward >= 0 ? "+" : ""}${fmt(s.reward, 3)} outcome=${s.info.outcome}`,
+      `[STEP] step=${s.info.step_count} action=${s.action} reward=${fmt(s.reward, 2)} done=${s.done} error=${errorStr}`,
       s.reward >= 0 ? "reward" : "negative"
     );

@@ -330,7 +331,9 @@ async function stepAgent() {
       const outcome = s.info.outcome;
       setEpisodeStatus(outcome === "complete" ? "DONE" : "DROPPED", outcome);
       setControlsEnabled(false);
-      addLog(`Episode ended: ${outcome}. Total reward: ${fmt(state.totalReward, 3)}`, "outcome");
+      const success = outcome === "complete" ? "true" : "false";
+      const rewardsStr = state._cachedSteps.slice(0, state._cacheIdx).map(st => fmt(st.reward, 2)).join(",");
+      addLog(`[END] success=${success} steps=${s.info.step_count} rewards=${rewardsStr}`, "outcome");
       state._cachedSteps = null;
     }
   }
@@ -352,7 +355,7 @@ async function runEpisode() {
   dom.btnRun.textContent = "Running...";
   setControlsEnabled(false);

-  addLog(`--- Running full episode with ${agent} agent ---`, "system");
+  addLog(`[START] task=default env=ui_layout_optimizer model=${agent}`, "system");

   try {
     const data = await api("/run_episode", "POST", { agent });
@@ -368,8 +371,9 @@ async function runEpisode() {
     updateUI(s.observation, s.reward, s.info);

     const actionLabel = s.action + (s.action_value !== null ? `(${s.action_value})` : "");
+    const errorStr = s.info.error ? s.info.error : "null";
     addLog(
-      `Step ${s.info.step_count}: ${actionLabel} -> R=${s.reward >= 0 ? "+" : ""}${fmt(s.reward, 3)} [${s.info.outcome}]`,
+      `[STEP] step=${s.info.step_count} action=${actionLabel} reward=${fmt(s.reward, 2)} done=${s.done} error=${errorStr}`,
       s.reward >= 0 ? "reward" : "negative"
     );

@@ -379,8 +383,10 @@ async function runEpisode() {

     const outcome = data.final_outcome;
     setEpisodeStatus(outcome === "complete" ? "DONE" : "DROPPED", outcome);
+    const success = outcome === "complete" ? "true" : "false";
+    const rewardsStr = data.steps.map(st => fmt(st.reward, 2)).join(",");
     addLog(
-      `Episode complete: ${outcome} | Total reward: ${fmt(state.totalReward, 3)} | Steps: ${data.total_steps}`,
+      `[END] success=${success} steps=${data.total_steps} rewards=${rewardsStr}`,
       "outcome"
     );
inference.py CHANGED
@@ -23,10 +23,10 @@ def log_step(step: int, action: str, reward: float, done: bool, error: Optional[
         flush=True,
     )

-def log_end(success: bool, steps: int, score: float, rewards: List[float]) -> None:
+def log_end(success: bool, steps: int, rewards: List[float]) -> None:
     rewards_str = ",".join(f"{r:.2f}" for r in rewards)
     success_val = str(success).lower()
-    print(f"[END] success={success_val} steps={steps} score={score:.3f} rewards={rewards_str}", flush=True)
+    print(f"[END] success={success_val} steps={steps} rewards={rewards_str}", flush=True)

 def run_inference(task_id: str = "easy") -> None:
     """
@@ -89,8 +89,7 @@ def run_inference(task_id: str = "easy") -> None:

     # Enforce strict (0,1) bound
     score = clamp_score(score)
-
-    log_end(success=completed, steps=step_count, score=score, rewards=rewards)
+    log_end(success=completed, steps=step_count, rewards=rewards)

 if __name__ == "__main__":
     parser = argparse.ArgumentParser(description="Run UIEnv Inference")