Upload folder using huggingface_hub
- BUILD_NOTES.md +18 -0
- Dockerfile +3 -1
- README.md +268 -500
- hars/forum.har +0 -0
- hars/shopping.har +2 -2
- hars/shopping_admin.har +2 -2
- hars/wikipedia.har +0 -0
- inference.py +433 -164
- openenv_harvestgym.egg-info/PKG-INFO +2 -0
- openenv_harvestgym.egg-info/SOURCES.txt +1 -0
- openenv_harvestgym.egg-info/requires.txt +2 -0
- parameter_pools.json +123 -519
- pyproject.toml +2 -0
- scripts/inspect_har_endpoints.py +240 -0
- server/judge.py +95 -12
- server/models.py +125 -31
- server/tools/browser_agent.py +104 -112
- server/tools/curl_exec.py +53 -16
- server/tools/html_distiller.py +485 -0
- server/tools/search_episode_data.py +290 -57
- uv.lock +150 -0
BUILD_NOTES.md
CHANGED

@@ -192,6 +192,24 @@ When in doubt: check the endpoint schema returned by search_endpoints() — it s

 ---

+### 10. HAR Is the Agent's Only API Knowledge Source — No Catalog Fallback
+
+**Status:** Design decision, locked
+
+**Detail:** The `browser_agent` tool uses **only the HAR file** to build the agent's endpoint index and embeddings. The API catalogs (`catalogs/*.json`) are used exclusively by the judge for parameter-sourcing grading — they play no role in the training loop.
+
+If a HAR yields very few endpoints, **the HAR recording needs to be improved**, not the code. The product does not patch sparse recordings by injecting catalog data into the agent's search corpus. This is intentional: the RL challenge is for the agent to discover and use APIs it has actually observed, not a curated ground-truth list.
+
+**What goes where:**
+
+| Data source | Who uses it | How |
+|---|---|---|
+| `hars/*.har` | Agent only | `browser_agent` → `search_endpoints` semantic search |
+| `catalogs/*.json` | Judge only | Parameter-sourcing grading (`judge.py`) |
+
+**Do not add catalog augmentation back** to `browser_agent.py` or `search_endpoints.py` under any circumstances. If the embed cache shows a large number of entries (e.g. 503 instead of 1), it means catalog entries leaked into the agent — clear the cache and fix the source.
+
+---

 ## Non-Issues (Resolved in Design)

 - ~~`store_finding` / `get_findings` tools~~ — **Removed**. Value threading happens through episode `history`.
Dockerfile
CHANGED

@@ -67,6 +67,9 @@ COPY --from=builder /app/env /app/env

 # Set PATH to use the virtual environment
 ENV PATH="/app/.venv/bin:$PATH"

+# Enable Gradio web UI for manual testing
+ENV ENABLE_WEB_INTERFACE=true
+
 # Set PYTHONPATH so imports work correctly
 ENV PYTHONPATH="/app/env:$PYTHONPATH"

@@ -76,5 +79,4 @@ HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \

 # Run the FastAPI server
 # The module path is constructed to work with the /app/env structure
-ENV ENABLE_WEB_INTERFACE=true
 CMD ["sh", "-c", "cd /app/env && uvicorn server.app:app --host 0.0.0.0 --port 8000"]
README.md
CHANGED

@@ -15,38 +15,47 @@ base_path: /web

# HARvestGym

Web applications are full of APIs. Every click in a browser triggers an HTTP call with a precise schema, a specific authentication header, an exact sequence of prerequisites. **HARvestGym trains a small model to do all of that directly** — given a task and a URL, it discovers the relevant endpoints, understands what each one needs, chains the calls in the right order, and completes the task without any browser.

The model starts with nothing: no schema, no documentation, no endpoint list. It uses tools to explore — issuing requests, inspecting responses, building up its own understanding of how the application works. This is what a developer does when they reverse-engineer an API. The model learns to do the same.

The model must learn:

- which of those endpoints are actually needed for this specific task
- in what order they must be called (you cannot add to a cart before the cart exists)
- where each required parameter value comes from
- how to re-authenticate if a session expires mid-episode

The model must learn to:

1. **Discover endpoints** — by using a browser agent tool that completes the same task in a real browser while recording all network traffic, then filtering that traffic to extract only the meaningful application API calls (stripping out CDN requests, analytics, static assets). The browser agent runs once and generates the raw discovery data; the model uses this as its starting context.
2. **Select the right endpoints** — from the browser agent's list, identify the subset relevant to the current task (not every observed endpoint is needed).
3. **Sequence calls correctly** — determine the prerequisite order (create cart → find product → add item), including calls that must happen before others even though the task description doesn't say so.
4. **Thread parameters** — this is the hardest part. APIs form a dependency graph:
   - Some values come from a previous response (`cart_id` from step 1 → path param in step 3).
   - Some values come from the authentication flow (`form_key`, `Bearer token` → header in every subsequent call).
   - Some values come from the task description (`product name` → search query → `sku` → body of add-item call).
   - The ground truth catalog defines these relationships precisely; the model learns to navigate them.
5. **Handle auth and errors** — detect 401 / session-expired responses, re-authenticate, and continue; interpret 4xx errors and adjust the next call accordingly.

---
@@ -61,275 +70,138 @@ The model must learn to:

```
│                                  ▼                                      │
│  ┌────────────────────────────────────────────────────────────────┐     │
│  │                   Policy Model (RL Agent)                      │     │
│  │          small model — no prior knowledge of the app           │     │
│  │                                                                │     │
│  │   Observation: task + history + session_state + last_result    │     │
│  │                                                                │     │
│  │   Step 1  ──► browser_agent(task, url)                         │     │
│  │   Step 2+ ──► search_endpoints(query)                          │     │
│  │           ──► curl_exec(command)                               │     │
│  │           ──► search_episode_data(query)                       │     │
│  │           ──► done(result)                                     │     │
│  └────────┬───────────────────────────────────────────────────────┘     │
│           │                                                             │
│    ┌──────┴──────────────────────────────┐                              │
│    │                                     │                              │
│    ▼                                     ▼                              │
│  ┌─────────────────────┐   ┌─────────────────────────────────────┐      │
│  │   Browser Agent     │   │            Environment              │      │
│  │   (step 1 only)     │   │                                     │      │
│  │                     │   │ • Executes curl_exec via subprocess │      │
│  │  Training:          │   │ • Auto-injects session cookies      │      │
│  │  Load pre-recorded  │   │ • Smart-truncates response bodies   │      │
│  │  cached HAR from    │   │ • Indexes full responses into       │      │
│  │  disk or launch     │   │   per-episode BM25 + GEMMA store    │      │
│  │  on real browser    │   │ • Manages session_state: cookies,   │      │
│  │                     │   │   CSRF tokens, auth headers         │      │
│  │  Inference:         │   └──────────────┬──────────────────────┘      │
│  │  Launch real browser│                  │                             │
│  │  via Playwright +   │                  │ HTTP calls (always live)    │
│  │  bu-30b-a3b-preview │                  ▼                             │
│  │                     │   ┌─────────────────────────────────────┐      │
│  │  Both paths produce:│   │      WebArena EC2 (live apps)       │      │
│  │  • Filtered HAR     │   │                                     │      │
│  │  • OpenAPI-like spec│   │  :7770 Shopping (Magento 2)         │      │
│  │  • GEMMA embeddings │   │  :7780 Shopping Admin               │      │
│  │    for search_      │   │  :9999 Forum (Postmill)             │      │
│  │    endpoints()      │   │  :8888 Wikipedia (Kiwix)            │      │
│  └─────────────────────┘   │  :3000 Map (OpenStreetMap)          │      │
│                            └──────────────┬──────────────────────┘      │
│                                           │                             │
│                                           │ episode trajectory          │
│                                           ▼                             │
│                            ┌─────────────────────────────────────┐      │
│                            │        Deterministic Judge          │      │
│                            │                                     │      │
│                            │  Per-template programmatic grader:  │      │
│                            │  • Inspects episode trajectory      │      │
│                            │  • Optionally probes live app state │      │
│                            │  • Verifies parameter sourcing      │      │
│                            │    (TASK_SPEC / PREV_CALL /         │      │
│                            │     AUTH_FLOW / STATIC / DERIVED)   │      │
│                            │  • Scores [0.0 → 1.0]               │      │
│                            └──────────────┬──────────────────────┘      │
│                                           │                             │
│                                           ▼                             │
│                            ┌─────────────────────────────────────┐      │
│                            │           Reward Signal             │      │
│                            │                                     │      │
│                            │  Per-step:                          │      │
│                            │   +0.2  valid API call (2xx)        │      │
│                            │   +0.1  new path explored           │      │
│                            │   +0.25 correct param sourcing      │      │
│                            │   −0.15 repeated identical call     │      │
│                            │   −0.3  browser_agent called again  │      │
│                            │                                     │      │
│                            │  Episode end:                       │      │
│                            │   +2.0–+5.0 task complete           │      │
│                            │             (easy→hard)             │      │
│                            │   −1.5      task failed             │      │
│                            └──────────────┬──────────────────────┘      │
│                                           │                             │
│                                           ▼                             │
│                            ┌─────────────────────────────────────┐      │
│                            │         GRPO (via HF TRL)           │      │
│                            │                                     │      │
│                            │  8 parallel rollouts per prompt     │      │
│                            │  Computes advantages without        │      │
│                            │  a value function                   │      │
│                            │  Updates policy weights             │      │
│                            └──────────────┬──────────────────────┘      │
│                                           │                             │
│                                           └──► updated Policy Model     │
└─────────────────────────────────────────────────────────────────────────┘
```

```
HAR File (cached using Browser Agent) ──► filter_har_entries()
                  │
                  ▼
         drop: CDN, analytics, static assets
         keep: {method, path, request_body,
                response_body, status_code}
                  │
                  ▼
         extract_openapi_spec()
         → structured endpoint catalog
           {path, method, params, auth, response_fields}
                  │
           ┌──────┴──────┐
           │             │
           ▼             ▼
  build_GEMMA_embeddings    return summary list
  (search_endpoints         to RL agent:
   index — full schemas)    [GET /products,
           │                 POST /guest-carts, ...]
           ▼
  search_endpoints("create guest cart")
  → top-3 endpoint schemas with:
    • path params + sources
    • body params + sources
    • auth requirements
    • response field names
```
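The drop/keep step in this pipeline might look roughly like the minimal sketch below. The host and extension blocklists are assumptions for illustration; the real `filter_har_entries()` may filter differently.

```python
from urllib.parse import urlparse

# Assumed noise lists; the real filter may be smarter than substring checks.
NOISE_HOSTS = ("googletagmanager.com", "google-analytics.com", "cdn.")
STATIC_EXT = (".js", ".css", ".png", ".jpg", ".svg", ".woff2", ".ico")

def filter_har_entries(entries: list[dict]) -> list[dict]:
    """Keep only {method, path, request_body, response_body, status_code}
    tuples for the app's own API calls."""
    kept = []
    for e in entries:
        url = urlparse(e["request"]["url"])
        if any(h in url.netloc for h in NOISE_HOSTS):
            continue  # analytics / tracking noise
        if url.path.lower().endswith(STATIC_EXT):
            continue  # static assets, fonts, bundles
        kept.append({
            "method": e["request"]["method"],
            "path": url.path,
            "request_body": e["request"].get("postData", {}).get("text"),
            "response_body": e["response"].get("content", {}).get("text"),
            "status_code": e["response"]["status"],
        })
    return kept

entries = [
    {"request": {"method": "GET", "url": "http://shop.local/static/app.js"},
     "response": {"status": 200, "content": {}}},
    {"request": {"method": "POST", "url": "http://shop.local/rest/V1/guest-carts"},
     "response": {"status": 200, "content": {"text": "\"cart-abc123\""}}},
]
print(filter_har_entries(entries)[0]["path"])  # → /rest/V1/guest-carts
```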

### Episode Response Indexing

```
curl_exec(command)
   │
   ├──► subprocess: execute against live EC2
   │
   ├──► index_full_response()
   │       BM25 index  ── keyword match (IDs, SKUs, tokens)
   │       GEMMA embed ── semantic match (paraphrases)
   │       (indexes BEFORE truncation — all items stored)
   │
   └──► smart_truncate()
           non-JSON HTML   → 3,000 chars
           JSON primitive  → never truncated
           error (4xx/5xx) → never truncated
           small JSON      → returned as-is
           large array     → first 2 items shown
                             + _list_truncated annotation
                             + hint to call search_episode_data()
```
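A toy version of the per-episode store, with simple keyword-overlap scoring standing in for BM25 (the GEMMA embedding half is omitted), shows why items dropped by truncation remain retrievable: they are indexed individually before the response is cut down.

```python
import json
import re
from collections import Counter

class EpisodeIndex:
    """Toy stand-in for the per-episode hybrid store (keyword half only)."""

    def __init__(self):
        self.docs: list[str] = []

    def index_full_response(self, body) -> None:
        # Index each array item separately so truncated items stay findable.
        if isinstance(body, dict):
            for v in body.values():
                if isinstance(v, list):
                    self.docs.extend(json.dumps(item) for item in v)
                    return
        self.docs.append(json.dumps(body))

    def search(self, query: str, k: int = 5) -> list[str]:
        q = Counter(re.findall(r"\w+", query.lower()))
        scored = sorted(
            self.docs,
            key=lambda d: sum(q[t] for t in re.findall(r"\w+", d.lower())),
            reverse=True,
        )
        return scored[:k]

idx = EpisodeIndex()
idx.index_full_response({"items": [
    {"sku": "MH01", "name": "Radiant Tee"},
    {"sku": "MH02", "name": "Breathe-Easy Tank"},
]})
print(idx.search("radiant tee sku", k=1))
```

Both items are indexed even though `smart_truncate` would only show the first two of a long list, so a later `search_episode_data("Radiant Tee sku")` can recover the exact object.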

### Parameter Dependency Graph (what the judge tracks)

```
Task: "Add 'Radiant Tee' to a guest cart"

┌─────────────────────────────────────────────────────────┐
│ TASK_SPEC ──────────────────────────────────────────┐   │
│   "Radiant Tee" (product name)                      │   │
│        │                                            │   │
│        ▼                                            │   │
│ GET /rest/V1/products?name=Radiant+Tee              │   │
│   → items[0].sku = "MH01"  (PREV_CALL) ──┐          │   │
│                                          │          │   │
│ POST /rest/V1/guest-carts                │          │   │
│   → body = "cart-abc123"   (PREV_CALL) ──┼──┐       │   │
│                                          │  │       │   │
│ POST /rest/V1/guest-carts/{cartId}/items │  │       │   │
│   path: cartId   ◄────── "cart-abc123" ──┼──┘       │   │
│   body: sku      ◄────── "MH01" ─────────┘          │   │
│   body: qty      ◄────── TASK_SPEC (quantity) ──────┘   │
│   body: quote_id ◄────── DERIVED (= cartId)             │
└─────────────────────────────────────────────────────────┘

Source types tracked by the judge:
  TASK_SPEC — value stated in the task string
  PREV_CALL — value from a prior curl response in this episode
  AUTH_FLOW — value from a session/token auth step
  STATIC    — fixed application constant (e.g. store_id = 1)
  DERIVED   — computed from another param (e.g. quote_id = cart_id)
```
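The same graph can be written down as plain data. The shape below is hypothetical, not the actual catalog schema: each parameter of the final call maps to a (source type, provenance) pair.

```python
# Hypothetical encoding of the dependency graph above.
PARAM_SOURCES = {
    "POST /rest/V1/guest-carts/{cartId}/items": {
        "cartId":            ("PREV_CALL", "POST /rest/V1/guest-carts → body"),
        "cartItem.sku":      ("PREV_CALL", "GET /rest/V1/products → items[].sku"),
        "cartItem.qty":      ("TASK_SPEC", "quantity stated in the task"),
        "cartItem.quote_id": ("DERIVED",   "= cartId"),
    },
}

# Every source type used must be one the judge tracks.
VALID = {"TASK_SPEC", "PREV_CALL", "AUTH_FLOW", "STATIC", "DERIVED"}
for params in PARAM_SOURCES.values():
    for source_type, _ in params.values():
        assert source_type in VALID
print("ok")
```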

### Curriculum: Complexity Tiers

```
Easy ──────────────────────────── graduate when P(success) > 0.7
│ Single call, no auth          │
│ Templates 1, 2                │
│ 1 API call required           │
                ▼
Medium ────────────────────────── graduate when P(success) > 0.7
│ Auth + 1–2 dependent calls    │
│ Templates 3, 4                │
│ 2–3 API calls required        │
                ▼
Hard ──────────────────────────── final tier
  Multi-step chain, full auth, ID threading
  Templates 5, 6, 7
  4–8+ API calls required
  Reward scaling: ×2.5 vs Easy
```
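The graduation rule can be sketched as a rolling success-rate gate. Only the `> 0.7` threshold comes from the tiers above; the window size is an assumption for illustration.

```python
from collections import deque

def make_gate(window: int = 50, threshold: float = 0.7):
    """Return a recorder that reports True once the rolling success rate
    over the last `window` episodes exceeds `threshold`."""
    results: deque = deque(maxlen=window)

    def record(success: bool) -> bool:
        results.append(success)
        full = len(results) == window  # don't graduate on tiny samples
        return full and sum(results) / window > threshold

    return record

gate = make_gate(window=10)
graduated = False
for i in range(10):
    graduated = gate(i >= 2)  # 8 successes out of 10 → rate 0.8 > 0.7
print(graduated)  # → True
```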

### The RL Agent's Tool: Browser Agent

The RL agent has access to a **browser agent tool** powered by [browser-use/bu-30b-a3b-preview](https://huggingface.co/browser-use/bu-30b-a3b-preview) — a 30B MoE vision-language model (3B active parameters) purpose-built for web task completion, served via the [browser-use](https://github.com/browser-use/browser-use) library on Playwright. When the RL agent calls this tool with a natural language task, the browser agent:

1. Opens the target application in a real browser
2. Completes the task by clicking, typing, and navigating — exactly as a human would
3. Intercepts all HTTP traffic via Playwright network events
4. Returns the intercepted traffic, filtered down to only the application's own API calls

The filtering step strips analytics pings, CDN requests, font loads, and JS/CSS bundles, returning only `{method, path, request_body, response_body, status_code}` tuples for the app's actual API endpoints.

**Training vs. inference — what gets cached:**

- The browser agent output (filtered endpoint list) is pre-computed once per task and cached. During training, the RL model receives this cached result instantly — no live browser session runs.
- The RL agent's own `curl_exec` calls **always hit the real live WebArena server** — during both training and inference. No API response is mocked or cached.
- At inference, the browser agent runs live to handle novel tasks or changed application state.

Full architecture and code: [BROWSER_AGENT.md](BROWSER_AGENT.md)
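The training/inference split can be sketched as a cache check. Here `record_live_har` is a placeholder for the Playwright + bu-30b-a3b-preview session, and the URL-to-app mapping is an assumption, not code from the repo.

```python
import tempfile
from pathlib import Path

def record_live_har(task: str, url: str, out: Path) -> Path:
    # Placeholder for the live Playwright + bu-30b-a3b-preview session.
    raise NotImplementedError("inference mode records a fresh HAR here")

def browser_agent_har(task: str, url: str, har_dir: Path, training: bool) -> Path:
    """Training loads a pre-recorded HAR from disk; inference records live."""
    app = "shopping" if ":7770" in url else "forum"  # assumed URL→app mapping
    cached = har_dir / f"{app}.har"
    if training and cached.exists():
        return cached  # cached result: no browser session runs
    return record_live_har(task, url, cached)

with tempfile.TemporaryDirectory() as d:
    (Path(d) / "shopping.har").write_text("{}")
    p = browser_agent_har("add to cart", "http://ec2-...:7770/", Path(d), training=True)
    print(p.name)  # → shopping.har
```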
### Ground Truth: From the Codebase, Not the Browser

```
cartItem.sku:      [string] the product's SKU, from: GET /rest/V1/products → items[].sku
cartItem.qty:      [number] quantity, from: task specification
cartItem.quote_id: [string] same as cartId
```

```
Task (natural language) + App URL
        │
        ▼
Policy Model (sees: task + history of all prior actions/results + session_state + findings)
        │ calls tools to explore and execute
        ├─► browser_agent(task, url)   → filtered API call list (cached during training)
        ├─► search_endpoints(query)    → full schema for a specific endpoint
        ├─► curl_exec(command)         → execute HTTP call, get {status, headers, body}
        ├─► search_episode_data(q)     → search prior response bodies in this episode
        └─► done(result)               → declare task complete
        │
        ▼
Live WebArena App (EC2) ←─── real HTTP responses (always live, never mocked)
        │
        ▼
Judge (compares against ground truth API catalog)
        │
        ▼
Reward Signal ──► GRPO ──► updated policy
```
---

@@ -343,170 +215,57 @@ What the model sees at each step:

```python
class Observation(BaseModel):
    task: str              # Natural language task
    app_base_url: str      # Root URL of the target application
    last_tool_result: Any  # Result of last tool call
                           # search_episode_data → list of matching JSON object strings
    history: list[dict]    # Full episode trajectory: list of {action, tool_result} pairs
                           # from all prior steps. The model sees what it already tried,
                           # enabling value threading (read a cart_id from step 2's response
                           # and use it in step 5's curl call) and loop avoidance.
    session_state: dict    # Auto-managed by environment: cookies, tokens, CSRF values
                           # extracted from all prior HTTP Set-Cookie and response bodies
                           # e.g. {"PHPSESSID": "abc", "form_key": "xyz", "cart_id": "123"}
    step_count: int
    max_steps: int         # 20
```

`session_state` is maintained by the environment. The model never parses `Set-Cookie` headers — the environment extracts tokens automatically and makes them available. The model decides *when* to authenticate and *which* session values to use; the environment handles *extraction*.

**curl execution:** The agent outputs a curl command string. The environment parses it and executes it via subprocess against the live EC2 server — the agent machine never has a direct network connection to WebArena. The environment also injects cookies from `session_state` automatically before each call.

**Response truncation — smart array truncation, not byte cutoff:** HTTP response bodies are processed by a pure Python function before being returned to the model. Rules applied in order:

1. **Non-JSON body** (HTML, CSS, JS, plain text): truncate to 3,000 characters. HTML from form-serving pages (login, post creation) is kept longer than pure prose because CSRF tokens and `<input>` fields are embedded inside the markup and the model needs to locate them. See the [HTML / Form-Submission Handling](#html--form-submission-handling) section below for how the model is expected to work with HTML responses.
2. **JSON primitive** (string, number, boolean): never truncated — these are tokens, IDs, confirmations.
3. **Error response (4xx / 5xx)**: never truncated — the model needs every word to self-correct.
4. **JSON object or array with no large arrays** (< 3 dict items per array): returned as-is.
5. **JSON with a large array field** (≥ 3 dict items): keep first 2 items, drop the rest, and add a `_list_truncated` annotation:

```json
{
  "items": [
    {"sku": "MH01", "name": "Radiant Tee", "price": 22.0},
    {"sku": "MH02", "name": "Breathe-Easy Tank", "price": 34.0}
  ],
  "_list_truncated": {
    "field": "items",
    "shown": 2,
    "total": 50,
    "note": "Showing 2 of 50 items. Use search_episode_data() to find a specific item from this response."
  }
}
```
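Applied in order, the five rules might look like the sketch below. This is an illustration of the stated rules, not the repo's actual implementation.

```python
import json

def smart_truncate(body: str, status_code: int, max_html: int = 3000):
    """Sketch of the five truncation rules, applied in order."""
    try:
        data = json.loads(body)
    except (ValueError, TypeError):
        return body[:max_html]              # 1. non-JSON → 3,000 chars
    if not isinstance(data, (dict, list)):
        return data                         # 2. JSON primitive → never truncated
    if status_code >= 400:
        return data                         # 3. error → never truncated
    if isinstance(data, dict):
        for field, val in data.items():
            if (isinstance(val, list) and len(val) >= 3
                    and all(isinstance(i, dict) for i in val)):
                out = dict(data)            # 5. large array → first 2 items
                out[field] = val[:2]
                out["_list_truncated"] = {
                    "field": field, "shown": 2, "total": len(val),
                    "note": f"Showing 2 of {len(val)} items. "
                            "Use search_episode_data() to find a specific item.",
                }
                return out
    return data                             # 4. small JSON → as-is

big = json.dumps({"items": [{"sku": f"MH{i:02d}"} for i in range(50)]})
print(smart_truncate(big, 200)["_list_truncated"]["total"])  # → 50
```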

**Episode response indexing:** Every `curl_exec` call indexes the full request and response bodies into a per-episode hybrid index (BM25 for keyword matching + GEMMA semantic embeddings for paraphrase handling). When a list is truncated, all items (not just the 2 shown) are indexed. The model can retrieve any specific object using `search_episode_data("keyword or natural language query")` without needing a filtered API endpoint to exist. See `TOOLS.md` for the full indexing algorithm.

### Action Space

The model outputs a single tool call per step. Full technical specifications for all tools (document construction, truncation implementation, index architecture, caveats) are in [TOOLS.md](./TOOLS.md).

| Tool | Input | What It Does | Output |
| --- | --- | --- | --- |
| `browser_agent(task, url)` | Task string + app base URL | Checks for a pre-recorded HAR; if found, processes it — otherwise launches a live browser to perform the task and record traffic. Extracts an OpenAPI-like spec, builds GEMMA embeddings for search. | Summary list of API endpoint names + methods (e.g. `GET /products`). No schemas/headers. Use `search_endpoints()` for details. |
| `search_endpoints(query)` | Natural language query | Semantic search over the GEMMA-embedded endpoint spec built by `browser_agent`. Returns full parameter details for matching endpoints. | Top-3 endpoint schemas (method, path, auth, params with sources, response fields) |
| `curl_exec(command)` | Full curl command string | Executes the HTTP call against the live EC2 server, indexes the full response into the episode BM25 store, returns a truncated observation. | `{status_code, headers, body}` — body smart-truncated; full body indexed to the episode store |
| `search_episode_data(query)` | Keyword or natural language query | Hybrid BM25 + GEMMA semantic search over all request/response bodies from prior `curl_exec` calls in this episode. | Top-5 JSON objects from this episode's request/response history |
| `done(result?)` | Optional result string | Signals task complete, triggers judge evaluation. | Ends episode |

`browser_agent` is called **exactly once per episode at step 1**. During training, it loads a cached pre-recorded HAR file (if available); at inference, it launches a live browser session. It returns the deduplicated list of API endpoint patterns observed in the network traffic. **If called again after step 1, the call executes normally but a −0.3 penalty is applied to the reward.** `search_endpoints` then provides the full schema for any specific endpoint the model wants to call — searching the GEMMA embeddings built by `browser_agent` from the HAR data.

`curl_exec` is the primary HTTP action — one string that encodes method, URL, headers, and body together, exactly as API documentation is written. This lets the model leverage its pretrained knowledge of `curl` syntax while producing calls that are self-documenting.

```bash
# Step 1 — Discover which endpoint creates a guest cart
# (model calls search_endpoints first, sees: POST /rest/V1/guest-carts)

# Step 2 — Create guest cart
curl -X POST 'http://ec2-.../rest/V1/guest-carts' -H 'Content-Type: application/json'
# → body: "cart-abc123" (plain string — never truncated)

# Step 3 — Find the product SKU (list response, truncated to 2 items + note)
curl 'http://ec2-.../rest/V1/products?searchCriteria[filter_groups][0][filters][0][field]=name&searchCriteria[filter_groups][0][filters][0][value]=Radiant+Tee'
# → body: {"items":[{"sku":"MH01","name":"Radiant Tee","price":22.0}],"total_count":1}
# (1 item — not truncated; if 200 items, all 200 indexed, 2 shown in context)

# Step 4 — Add item (model reads cart-abc123 from step 2, MH01 from step 3 — all in history)
curl -X POST 'http://ec2-.../rest/V1/guest-carts/cart-abc123/items' \
  -H 'Content-Type: application/json' \
  -d '{"cartItem":{"sku":"MH01","qty":1,"quote_id":"cart-abc123"}}'
```

Values from prior responses (cart IDs, SKUs, tokens) are threaded directly from the growing episode history. `session_state` tokens (cookies, CSRF values) are auto-injected by the environment. If a list response was truncated and the model needs a specific item not shown in the 2-item sample, it calls `search_episode_data("Radiant Tee sku")` — all 200 items are indexed, even though only 2 were shown in context.
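The environment's parse-and-inject step can be sketched with `shlex`: split the agent's curl string under shell quoting rules, then append the session cookies before handing the argv to a subprocess. The real `curl_exec.py` may handle this differently.

```python
import shlex

def inject_session_cookies(curl_cmd: str, session_state: dict) -> list[str]:
    """Split a curl command string and append session cookies as a header."""
    argv = shlex.split(curl_cmd)
    if session_state:
        cookie = "; ".join(f"{k}={v}" for k, v in session_state.items())
        argv += ["-H", f"Cookie: {cookie}"]
    return argv  # e.g. handed to subprocess.run(argv, capture_output=True)

argv = inject_session_cookies(
    "curl -X POST 'http://ec2-.../rest/V1/guest-carts' -H 'Content-Type: application/json'",
    {"PHPSESSID": "abc", "form_key": "xyz"},
)
print(argv[-1])  # → Cookie: PHPSESSID=abc; form_key=xyz
```

Because the string is parsed and re-executed by the environment, the agent machine never needs its own network path to WebArena, as described above.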
### Prompt Structure

```
SYSTEM: You are an API agent. Complete the task using only the tools available:
        browser_agent, search_endpoints, curl_exec, search_episode_data, done.
        When a response is HTML, look for JSON data embedded in <script> tags or
        extract values from <input> fields. CSRF tokens appear as hidden inputs:
        <input type="hidden" name="_csrf_token" value="XYZ">

TASK: Add "Radiant Tee" to a guest cart at http://ec2-16-59-2-56.../

[session_state: {}]

STEP 1 ACTION: browser_agent("Add Radiant Tee to a guest cart", "http://ec2-...:7770/")
STEP 1 RESULT: {"app": "shopping", "endpoints": [
    "POST /rest/V1/guest-carts",
    "GET /rest/V1/products",
    "POST /rest/V1/guest-carts/{id}/items",
    ...
  ], "note": "Use search_endpoints() to get full schema for any of these."}

STEP 2 ACTION: search_endpoints("create guest cart")
STEP 2 RESULT: ["endpoint: POST /rest/V1/guest-carts | auth: none | returns: string (cartId)", ...]

STEP 3 ACTION: curl_exec("curl -X POST 'http://ec2-.../rest/V1/guest-carts' -H 'Content-Type: application/json'")
STEP 3 RESULT: {status_code: 200, body: "cart-abc123"}

STEP 4 RESULT: ["endpoint: GET /rest/V1/products | query: searchCriteria filters | returns: .items[].sku .items[].name", ...]

STEP 6 RESULT: ["endpoint: POST /rest/V1/guest-carts/{cartId}/items | path: cartId from POST /rest/V1/guest-carts | body: cartItem.sku, cartItem.qty, cartItem.quote_id (same as cartId)", ...]

STEP 7 RESULT: {status_code: 200, body: {"item_id": 5, "sku": "MH01", "qty": 1}}

→ generate STEP 8: done("Radiant Tee added to cart")
```

```
Parameter Source Types:
  TASK_SPEC — value given directly in the task (e.g., "product #42")
  PREV_CALL — value from a prior API response in this episode
  AUTH_FLOW — value obtained during authentication (session token, CSRF key)
  STATIC    — fixed value known from the application (e.g., store_id = 1)
  DERIVED   — computed from another value (e.g., cart_id = quote_id)
```

For each task, the judge knows which parameters fall into which category, and whether the model correctly sourced each value. This is how partial credit works — the model gets reward for correctly threading a `cart_id` even if the final call had a wrong field elsewhere.
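Partial credit over these source types might be checked like this. The data shapes are illustrative, not `judge.py`'s real interface; only the +0.25 per correctly sourced parameter comes from the reward table.

```python
def check_param_sourcing(call_params: dict, episode_values: dict, expected: dict) -> float:
    """Award +0.25 per parameter whose value matches where the spec says
    it must come from (toy shapes, not the real judge interface)."""
    reward = 0.0
    for name, (source_type, provenance) in expected.items():
        got = call_params.get(name)
        if got is None:
            continue
        if source_type == "PREV_CALL" and got in episode_values.get("responses", []):
            reward += 0.25  # value was actually read from a prior response
        elif source_type == "TASK_SPEC" and str(got) in episode_values.get("task", ""):
            reward += 0.25  # value was stated in the task string
        elif source_type == "DERIVED" and got == call_params.get(provenance):
            reward += 0.25  # value equals the parameter it derives from
    return reward

expected = {
    "cartId":   ("PREV_CALL", "POST /rest/V1/guest-carts"),
    "sku":      ("PREV_CALL", "GET /rest/V1/products"),
    "quote_id": ("DERIVED",   "cartId"),
}
reward = check_param_sourcing(
    {"cartId": "cart-abc123", "sku": "MH01", "quote_id": "cart-abc123"},
    {"responses": ["cart-abc123", "MH01"], "task": "Add Radiant Tee"},
    expected,
)
print(reward)  # → 0.75
```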

### Reward Space

**Per-step:**

| Event | Reward | Note |
| --- | --- | --- |
| Valid API call (2xx) | +0.2 | |
| New path explored | +0.1 | |
| Correct parameter sourcing | +0.25 | |
| 4xx response (recoverable) | −0.05 | call failed but episode continues |
| Repeated identical call | −0.15 | |
| `browser_agent` called again after step 1 | −0.3 | |

Note: `search_endpoints`, `search_episode_data`, and `done` carry no direct per-step reward. Using `search_endpoints` to find the correct schema is indirectly rewarded by enabling correct parameter sourcing (+0.25) in the curl call that follows. `search_episode_data` is indirectly rewarded by allowing the model to retrieve the correct value to place in the next curl command.

**Episode end:**

| Outcome | Reward |
| --- | --- |
| Task completed correctly | +2.0 to +5.0 (scales with difficulty tier) |
| Authentication correctly obtained (even if task fails) | +0.3 |
| Timeout / task failed entirely | −1.5 |

Target signal separation: successful episodes `+3` to `+7`, failed episodes `−2` to `−1`. Required for GRPO.

> **Reward design notes:**
>
> 1. **Hard ceiling on step rewards per episode.** The maximum achievable per-step reward over 20 steps is bounded: `20 × (0.2 + 0.1 + 0.25 + 0.1) = 13`. But a failed episode still ends at `−1.5`, so any correct episode completion still produces a substantially better total.
> 2. **Curriculum learning as the primary defense.** Easy tasks (Template 1: single GET, no auth) have a trivially short optimal path (2 steps). There is no room to accumulate "fake" exploration reward when the optimal episode only needs 2 calls. The model learns that the terminal reward is the only thing that matters before it encounters tasks long enough to be gamed. Medium and Hard tiers are introduced only after the model reliably solves Easy — by then the behavior pattern is already anchored. This mirrors how SWE-gym-style environments scale difficulty: start simple enough that the reward signal is unambiguous, then broaden.
>
> **Premature `done()` penalty:** If the judge scores the final state as incorrect (task not completed), the episode ends at `−1.5`. There is no bonus for calling `done()` early — it is strictly worse than continuing to make correct API calls. The model only benefits from calling `done()` when the task is actually complete.

**Reset behavior:** `reset()` clears session state, episode history, the episode BM25 index, and the step counter. It does not reset the remote application database. The judge evaluates relative state (did the cart contain the item?), not absolute state (is the DB row count exactly N?).
---
|
| 530 |
|
| 531 |
-
##
|
| 532 |
|
| 533 |
-
|
| 534 |
|
| 535 |
-
|
| 536 |
|
| 537 |
-
|
| 538 |
|
| 539 |
-
|
| 540 |
|
| 541 |
-
|
|
|
|
|
|
|
| 542 |
|
| 543 |
-
```
|
| 544 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 545 |
```
|
| 546 |
|
| 547 |
-
**
|
| 548 |
|
| 549 |
-
|
| 550 |
-
2. The environment returns the HTML body, truncated to 3,000 characters (raised from 1,000 specifically to ensure hidden input fields near the end of small forms are included).
|
| 551 |
-
3. The model reads the `value` attribute of `input[name="_csrf_token"]` directly from the returned HTML string. HTML parsing is not required — the token appears as a predictable plain-text pattern in the markup.
|
| 552 |
-
4. The model places the extracted token into the subsequent POST body or form field.
|
| 553 |
-
5. The environment auto-extracts any `Set-Cookie` header from the login response into `session_state`, so subsequent requests are automatically authenticated.
|
| 554 |
|
| 555 |
-
|
| 556 |
|
| 557 |
-
|
| 558 |
-
# Forum login flow
|
| 559 |
-
curl -X POST 'http://ec2-.../login' \
|
| 560 |
-
-H 'Content-Type: application/x-www-form-urlencoded' \
|
| 561 |
-
-d '_csrf_token=abc123XYZ&_username=user&_password=pass'
|
| 562 |
-
# → 302 redirect + Set-Cookie: PHPSESSID=... (auto-injected into session_state)
|
| 563 |
-
|
| 564 |
-
# Forum post creation
|
| 565 |
-
curl -X POST 'http://ec2-.../f/general/submit' \
|
| 566 |
-
-H 'Content-Type: application/x-www-form-urlencoded' \
|
| 567 |
-
-d '_csrf_token=abc123XYZ&title=My+Post&body=Hello+World'
|
| 568 |
-
```
|
| 569 |
|
| 570 |
-
|
|
|
|
| 571 |
|
| 572 |
-
|
| 573 |
|
| 574 |
-
|
| 575 |
|
| 576 |
-
```json
|
| 577 |
-
{
|
| 578 |
-
"status_code": 200,
|
| 579 |
-
"headers": {"Content-Type": "text/html"},
|
| 580 |
-
"body": "<html>...<ul class='mw-search-results'><li><a href='/wiki/Mars'>Mars</a>...</ul>..."
|
| 581 |
-
}
|
| 582 |
```
|
|
|
|
| 583 |
|
| 584 |
-
|
|
|
|
|
|
|
| 585 |
|
| 586 |
-
|
|
|
|
| 587 |
|
| 588 |
-
|
|
|
|
| 589 |
|
| 590 |
-
|
| 591 |
-
|
| 592 |
-
- No `Content-Type` (GET requests) → response indexed based on `Content-Type` of the response
|
| 593 |
|
| 594 |
-
|
|
|
|
| 595 |
|
| 596 |
-
|
|
|
|
| 597 |
|
| 598 |
-
|
|
|
|
| 599 |
|
| 600 |
-
|
|
|
|
| 601 |
|
| 602 |
-
|
| 603 |
|
| 604 |
-
|
| 605 |
|
|
|
|
| 606 |
|
| 607 |
-
|
| 608 |
-
| ----------------------------- | --------------------------------------------------------------- |
|
| 609 |
-
| `{category_name}` | `GET /rest/V1/categories` — all leaf category names |
|
| 610 |
-
| `{product_name}` | `GET /rest/V1/products?pageSize=200` — all product names + SKUs |
|
| 611 |
-
| `{forum_category}` | Forum's category listing API |
|
| 612 |
-
| `{title}`, `{sku}`, `{price}` | Generated or sampled from existing product names |
|
| 613 |
|
|
|
|
|
|
|
|
|
|
| 614 |
|
| 615 |
-
|
| 616 |
|
| 617 |
-
|
|
|
|
|
|
|
|
|
|
| 618 |
|
| 619 |
-
|
|
|
|
| 620 |
|
|
|
|
|
|
|
| 621 |
|
| 622 |
-
|
| 623 |
-
|
| 624 |
-
|
| 625 |
-
| Medium | Auth + 1–2 dependent calls | 2–3 |
|
| 626 |
-
| Hard | Multi-step chain with ID threading, full auth | 4–8+ |
|
| 627 |
|
|
|
|
|
|
|
|
|
|
| 628 |
|
| 629 |
-
###
|
| 630 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 631 |
|
| 632 |
-
|
| 633 |
-
| --- | ------ | -------------- | ------------------------------------------------------ | ------------------------------------------------------- |
|
| 634 |
-
| 1 | Easy | Shopping | List products in category `{category_name}` | Single GET with query params |
|
| 635 |
-
| 2 | Easy | Wikipedia | Retrieve article summary for `{title}` | Single GET, path parameter resolution |
|
| 636 |
-
| 3 | Medium | Shopping | Add `{product_name}` to a guest cart | 2 calls: create cart → add item; ID threading |
|
| 637 |
-
| 4 | Medium | Forum | Retrieve all posts in `{forum_category}` (authed) | Login → extract session → GET |
|
| 638 |
-
| 5 | Hard | Forum | Create a post titled `{title}` in `{category}` | Login → extract CSRF `form_key` → POST with full schema |
|
| 639 |
-
| 6 | Hard | Shopping | Guest checkout for `{product_name}` | 5+ chained calls; cart → item → shipping → payment |
|
| 640 |
-
| 7 | Hard | Shopping Admin | Create a new product with SKU `{sku}`, price `{price}` | Admin bearer token → full Magento product schema |
|
| 641 |
|
|
|
|
|
|
|
|
|
|
| 642 |
|
| 643 |
-
|
| 644 |
-
|
| 645 |
-
-
|
| 646 |
-
-
|
| 647 |
-
- **Hard graders**: verify multi-step state change in the application (e.g., post exists, checkout created)
|
| 648 |
|
| 649 |
-
|
|
|
|
|
|
|
| 650 |
|
| 651 |
---
|
| 652 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 15 |
|
| 16 |
# HARvestGym

### Can a small model learn to reverse-engineer any web application's API — and complete real tasks through those APIs, without ever opening a browser?

Web applications are full of APIs. Every click in a browser triggers an HTTP call with a precise schema, a specific authentication header, an exact sequence of prerequisites. **HARvestGym trains a small model to do all of that directly** — given a task and a URL, it discovers the relevant endpoints, figures out what each one needs, chains the calls in the right order, and completes the task without any browser.

The model starts with nothing: no schema, no documentation, no endpoint list. It uses tools to explore — issuing requests, inspecting responses, building up its own understanding of how the application works. This is what a developer does when they reverse-engineer an API. The model learns to do the same.

---

## How It Works

```
Task + App URL
      │
      ▼
Policy Model (RL Agent)
  small model — no prior knowledge of the app
      │
  Step 1  ──► browser_agent(task, url)     → filtered API endpoint list
  Step 2+ ──► search_endpoints(query)      → full schema for a specific endpoint
          ──► curl_exec(command)           → execute HTTP call, get response
          ──► search_episode_data(query)   → search prior response bodies
          ──► done(result)                 → declare task complete
      │
      ▼
Live WebArena Apps (EC2)  ←── real HTTP responses (always live, never mocked)
      │
      ▼
Deterministic Judge (compares against ground truth API catalog)
      │
      ▼
Reward Signal ──► GRPO ──► updated policy
```

The agent calls `browser_agent` once at the start — this runs a real browser to complete the same task while recording all network traffic, then returns the filtered list of API endpoints observed. The agent now has a map of what endpoints exist. What it does *not* know:

- which of those endpoints are actually needed for this specific task
- in what order they must be called (you cannot add to a cart before the cart exists)
- where each required parameter value comes from
- how to re-authenticate if a session expires mid-episode

The model must learn to discover all of this on its own.
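Concretely, an episode is a bounded tool-dispatch loop. The sketch below is illustrative only (the tool functions are stand-in stubs, and names like `run_episode` and `stub_policy` are not from the project):

```python
# Illustrative episode loop: one tool call per step, at most MAX_STEPS steps.
MAX_STEPS = 20

def run_episode(policy, tools, task, url):
    history = []
    for _step in range(MAX_STEPS):
        action = policy(task, history)            # -> {"tool": ..., "args": {...}}
        if action["tool"] == "done":
            return history, action["args"].get("result")
        result = tools[action["tool"]](**action["args"])
        history.append({"action": action, "tool_result": result})
    return history, None                          # hit the step limit

# Stub policy: call browser_agent once, then declare done.
def stub_policy(task, history):
    if not history:
        return {"tool": "browser_agent", "args": {"task": task, "url": "http://app"}}
    return {"tool": "done", "args": {"result": "ok"}}

# Stub tool returning a canned endpoint list.
tools = {"browser_agent": lambda task, url: {"endpoints": ["GET /products"]}}
history, result = run_episode(stub_policy, tools, "list products", "http://app")
```

The real environment additionally maintains `session_state` and indexes every response; the skeleton above only shows the control flow.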

---

## Architecture

```
│                                  ▼                                      │
│ ┌────────────────────────────────────────────────────────────────┐     │
│ │                    Policy Model (RL Agent)                     │     │
│ │          small model — no prior knowledge of the app           │     │
│ │                                                                │     │
│ │   Observation: task + history + session_state + last_result   │     │
│ │                                                                │     │
│ │   Step 1  ──► browser_agent(task, url)                        │     │
│ │   Step 2+ ──► search_endpoints(query)                         │     │
│ │           ──► curl_exec(command)                              │     │
│ │           ──► search_episode_data(query)                      │     │
│ └────────┬───────────────────────────────────────────────────────┘     │
│          │                                                             │
│    ┌─────┴────────────────────────────────┐                            │
│    │                                      │                            │
│    ▼                                      ▼                            │
│  ┌─────────────────────┐   ┌─────────────────────────────────────┐     │
│  │   Browser Agent     │   │          Environment                │     │
│  │   (step 1 only)     │   │                                     │     │
│  │                     │   │ • Executes curl_exec via subprocess │     │
│  │  Training:          │   │ • Auto-injects session cookies      │     │
│  │  Load pre-recorded  │   │ • Smart-truncates response bodies   │     │
│  │  cached HAR from    │   │ • Indexes full responses into       │     │
│  │  disk or launch     │   │   per-episode BM25 + GEMMA store    │     │
│  │  on real browser    │   │ • Manages session_state: cookies,   │     │
│  │                     │   │   CSRF tokens, auth headers         │     │
│  │  Inference:         │   └──────────────┬──────────────────────┘     │
│  │  Launch real browser│                  │                            │
│  │  via Playwright +   │     HTTP calls (always live)                  │
│  │  bu-30b-a3b-preview │                  ▼                            │
│  │                     │   ┌─────────────────────────────────────┐     │
│  │  Both paths produce:│   │     WebArena EC2 (live apps)        │     │
│  │  • Filtered HAR     │   │                                     │     │
│  │  • OpenAPI-like spec│   │  :7770  Shopping (Magento 2)        │     │
│  │  • GEMMA embeddings │   │  :7780  Shopping Admin              │     │
│  │    for search_      │   │  :9999  Forum (Postmill)            │     │
│  │    endpoints()      │   │  :8888  Wikipedia (Kiwix)           │     │
│  └─────────────────────┘   │  :3000  Map (OpenStreetMap)         │     │
│                            └──────────────┬──────────────────────┘     │
│                                           │                            │
│                                           │ episode trajectory         │
│                                           ▼                            │
│                            ┌─────────────────────────────────────┐     │
│                            │        Deterministic Judge          │     │
│                            │                                     │     │
│                            │  Per-template programmatic grader:  │     │
│                            │  • Inspects episode trajectory      │     │
│                            │  • Optionally probes live app state │     │
│                            │  • Verifies parameter sourcing      │     │
│                            │    (TASK_SPEC / PREV_CALL /         │     │
│                            │     AUTH_FLOW / STATIC / DERIVED)   │     │
│                            │  • Scores [0.0 → 1.0]               │     │
│                            └──────────────┬──────────────────────┘     │
│                                           │                            │
│                                           ▼                            │
│                            ┌─────────────────────────────────────┐     │
│                            │           Reward Signal             │     │
│                            │                                     │     │
│                            │  Per-step:                          │     │
│                            │   +0.2  valid API call (2xx)        │     │
│                            │   +0.1  new path explored           │     │
│                            │   +0.25 correct param sourcing      │     │
│                            │   −0.15 repeated identical call     │     │
│                            │   −0.3  browser_agent called again  │     │
│                            │                                     │     │
│                            │  Episode end:                       │     │
│                            │  +2.0–+5.0 task complete (easy→hard)│     │
│                            │  −1.5      task failed              │     │
│                            └──────────────┬──────────────────────┘     │
│                                           │                            │
│                                           ▼                            │
│                            ┌─────────────────────────────────────┐     │
│                            │         GRPO (via HF TRL)           │     │
│                            │                                     │     │
│                            │  8 parallel rollouts per prompt     │     │
│                            │  Computes advantages without        │     │
│                            │  a value function                   │     │
│                            │  Updates policy weights             │     │
│                            └─────────────────────────────────────┘     │
│                                           │                            │
│                                           └──► updated Policy Model    │
└─────────────────────────────────────────────────────────────────────────┘
```

---
## Target Applications

All running on a single AWS EC2 instance — real production software, no simulation.

| App            | Port | Software                                           |
| -------------- | ---- | -------------------------------------------------- |
| Shopping       | 7770 | Magento 2 — open-source e-commerce platform        |
| Shopping Admin | 7780 | Magento 2 Admin — backend panel for the same store |
| Forum          | 9999 | Postmill — open-source Reddit-like forum           |
| Wikipedia      | 8888 | Kiwix — read-only offline mirror of Wikipedia      |
| Map            | 3000 | OpenStreetMap — collaborative mapping platform     |

Source: [WebArena environment_docker](https://github.com/web-arena-x/webarena/tree/main/environment_docker)

---

## Tasks

HARvestGym trains on **7 task templates** across three complexity tiers. Each template is a parameterized scenario: one reward function, one ground truth catalog entry, one grader — but potentially hundreds of distinct episode variations produced by substituting different values for the template slots (`{product_name}`, `{category_name}`, etc.).
### Complexity Tiers

| Tier   | Characteristic                                | API calls required |
| ------ | --------------------------------------------- | ------------------ |
| Easy   | Single call, no auth                          | 1                  |
| Medium | Auth + 1–2 dependent calls                    | 2–3                |
| Hard   | Multi-step chain with ID threading, full auth | 4–8+               |

The model only graduates to harder templates once it reliably solves easier ones.
### Task Templates

| #   | Tier   | App            | Template                                               | Key Challenge                                           |
| --- | ------ | -------------- | ------------------------------------------------------ | ------------------------------------------------------- |
| 1   | Easy   | Shopping       | List products in category `{category_name}`            | Single GET with query params                             |
| 2   | Easy   | Wikipedia      | Retrieve article summary for `{title}`                 | Single GET, path parameter resolution                    |
| 3   | Medium | Shopping       | Add `{product_name}` to a guest cart                   | 2 calls: create cart → add item; ID threading            |
| 4   | Medium | Forum          | Retrieve all posts in `{forum_category}` (authed)      | Login → extract session → GET                            |
| 5   | Hard   | Forum          | Create a post titled `{title}` in `{category}`         | Login → extract CSRF `form_key` → POST with full schema  |
| 6   | Hard   | Shopping       | Guest checkout for `{product_name}`                    | 5+ chained calls; cart → item → shipping → payment       |
| 7   | Hard   | Shopping Admin | Create a new product with SKU `{sku}`, price `{price}` | Admin bearer token → full Magento product schema         |

**Template parameters** are populated from a static parameter pool built by querying the live applications before training (see `parameter_pools.json`, refreshed via `scripts/build_parameter_pools.py`). Each episode samples randomly from its pool — the model never sees the pool directly; it must discover the correct values through its own API calls.
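Template instantiation can be pictured as a simple slot-filling step. A minimal sketch (the pool contents and the `instantiate` helper are illustrative, not the project's actual schema):

```python
import random

# Hypothetical pool; the real values live in parameter_pools.json.
pools = {
    "category_name": ["Pants", "Jackets", "Tees"],
    "product_name": ["Radiant Tee", "Camera Backpack"],
}

def instantiate(template: str, pools: dict, rng: random.Random) -> str:
    # Replace each {slot} with a randomly sampled value from its pool.
    out = template
    for slot, values in pools.items():
        token = "{" + slot + "}"
        if token in out:
            out = out.replace(token, rng.choice(values))
    return out

task = instantiate("List products in category {category_name}", pools, random.Random(0))
```

Each rollout gets a fresh sample, so the same template yields many distinct episodes.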
Each task has a deterministic programmatic grader (score in `[0.0, 1.0]`):

- **Easy graders**: check HTTP response body for expected values
- **Medium graders**: probe application state after episode (e.g., fetch the cart, verify item is present)
- **Hard graders**: verify multi-step state change in the application (e.g., post exists, checkout created)
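An easy-tier grader can be sketched as a scan over the episode's `curl_exec` responses. This is an assumption-laden toy (the real graders are per-template and may probe live state; `grade_easy_list` is a hypothetical name):

```python
# Toy "easy" grader: fraction of expected names found in any response body.
def grade_easy_list(history, expected_names) -> float:
    bodies = [
        str(h["tool_result"].get("body", ""))
        for h in history
        if h["action"]["tool"] == "curl_exec"
    ]
    hits = sum(any(name in body for body in bodies) for name in expected_names)
    return hits / len(expected_names)

history = [{"action": {"tool": "curl_exec"},
            "tool_result": {"status_code": 200,
                            "body": '{"items":[{"name":"Radiant Tee"}]}'}}]
score = grade_easy_list(history, ["Radiant Tee", "Cargo Pants"])  # 0.5
```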
---
### Observation Space

```python
class Observation(BaseModel):
    task: str              # Natural language task
    app_base_url: str      # Root URL of the target application
    last_tool_result: Any  # Result of last tool call
    history: list[dict]    # Full episode trajectory: [{action, tool_result}, ...]
    session_state: dict    # Auto-managed: cookies, tokens, CSRF values
    step_count: int
    max_steps: int         # 20
```
`session_state` is maintained by the environment — the model decides *when* to authenticate and *which* session values to use; the environment handles *extraction* from `Set-Cookie` headers and response bodies.
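The environment-side bookkeeping can be sketched as a small header scan. This is an illustrative stand-in, not the project's actual implementation (`update_session_state` and the `{"cookies": {...}}` shape are assumptions):

```python
# Pull cookie pairs out of Set-Cookie headers and merge them into session_state.
def update_session_state(session_state: dict, response_headers: list) -> dict:
    for name, value in response_headers:
        if name.lower() == "set-cookie":
            cookie = value.split(";", 1)[0]           # e.g. "PHPSESSID=abc123"
            key, _, val = cookie.partition("=")
            session_state.setdefault("cookies", {})[key.strip()] = val.strip()
    return session_state

state = update_session_state(
    {}, [("Set-Cookie", "PHPSESSID=abc123; Path=/; HttpOnly")]
)
```

Subsequent `curl_exec` calls can then be auto-decorated with the stored cookie.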
**Response truncation** rules, applied in order:

1. Non-JSON body (HTML, CSS): truncated to 3,000 characters
2. JSON primitive (string, number): never truncated — these are tokens, IDs
3. Error response (4xx/5xx): never truncated — the model needs every word to self-correct
4. Small JSON (no large arrays): returned as-is
5. Large JSON array (≥ 3 items): first 2 items shown + `_list_truncated` annotation + hint to call `search_episode_data()`

Every `curl_exec` call indexes the *full* response into a per-episode hybrid index (BM25 + GEMMA embeddings) *before* truncation — so all items are always retrievable even when only 2 were shown.
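The five rules above can be sketched as a single dispatch function. A simplified illustration (it only handles top-level arrays, whereas the real environment also annotates arrays nested inside objects; the function name and annotation string are assumptions):

```python
import json

TRUNC_CHARS = 3000
MAX_LIST_ITEMS = 2

def smart_truncate(status_code: int, body: str) -> str:
    if status_code >= 400:
        return body                    # Rule 3: errors pass through whole
    try:
        data = json.loads(body)
    except (json.JSONDecodeError, TypeError):
        return body[:TRUNC_CHARS]      # Rule 1: non-JSON gets cut at 3,000 chars
    if not isinstance(data, (dict, list)):
        return body                    # Rule 2: primitives (tokens, IDs) kept whole
    if isinstance(data, list) and len(data) >= 3:
        # Rule 5: show the first 2 items plus a retrieval hint
        return json.dumps(data[:MAX_LIST_ITEMS]) + \
            ' /* _list_truncated: use search_episode_data() */'
    return body                        # Rule 4: small JSON as-is
```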
### Action Space

The model outputs a single tool call per step.

| Tool                         | Input                             | Output                                                                            |
| ---------------------------- | --------------------------------- | --------------------------------------------------------------------------------- |
| `browser_agent(task, url)`   | Task string + app base URL        | Summary list of API endpoint names + methods (e.g. `GET /products`)               |
| `search_endpoints(query)`    | Natural language query            | Top-3 endpoint schemas (method, path, auth, params with sources, response fields) |
| `curl_exec(command)`         | Full curl command string          | `{status_code, headers, body}` — body smart-truncated; full body indexed          |
| `search_episode_data(query)` | Keyword or natural language query | Top-5 JSON objects from this episode's request/response history                   |
| `done(result?)`              | Optional result string            | Ends episode, triggers judge evaluation                                           |

`browser_agent` is called **exactly once per episode, at step 1**. Calling it again applies a −0.3 penalty. During training, it loads a cached HAR file; at inference, it launches a live browser session.

Full technical specifications for all tools: [`TOOLS.md`](./TOOLS.md)
### Reward Space

**Per-step:**

| Signal                       | Value | Trigger                                                  |
| ---------------------------- | ----- | -------------------------------------------------------- |
| Valid API call (2xx)         | +0.2  | `curl_exec` returns 2xx status                            |
| New path called this episode | +0.1  | Normalized path not called before — discourages looping   |
| Correct parameter sourcing   | +0.25 | Judge: value came from the correct source type            |
| Session value correctly used | +0.1  | Auth token/cookie present and correct in curl call        |
| Repeated identical call      | −0.15 | Exact duplicate curl command issued twice                 |
| browser_agent called again   | −0.3  | `browser_agent` called after step 1                       |
| Malformed curl command       | −0.1  | curl cannot be parsed or executed                         |
| 4xx response (recoverable)   | −0.05 | Call failed but episode continues                         |

Note: `search_endpoints`, `search_episode_data`, and `done` carry no direct per-step reward.
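The per-step shaping in the table above is a plain sum of signed terms. A sketch with the README's magnitudes (the event-flag dictionary is an assumed input format, not the environment's real interface):

```python
# Sum the per-step shaping terms; each flag corresponds to one table row.
def step_reward(ev: dict) -> float:
    r = 0.0
    if ev.get("status_2xx"):          r += 0.2
    if ev.get("new_path"):            r += 0.1
    if ev.get("param_source_ok"):     r += 0.25
    if ev.get("session_value_ok"):    r += 0.1
    if ev.get("duplicate_call"):      r -= 0.15
    if ev.get("browser_agent_again"): r -= 0.3
    if ev.get("malformed_curl"):      r -= 0.1
    if ev.get("recoverable_4xx"):     r -= 0.05
    return round(r, 4)

# A fresh, correctly sourced 2xx call:
r = step_reward({"status_2xx": True, "new_path": True, "param_source_ok": True})  # 0.55
```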
**Episode end:**

| Outcome                                                 | Reward                                     |
| ------------------------------------------------------- | ------------------------------------------ |
| Task completed correctly                                | +2.0 to +5.0 (scales with difficulty tier) |
| Authentication correctly obtained (even if task fails)  | +0.3                                       |
| Timeout / task failed entirely                          | −1.5                                       |

Target signal separation: successful episodes `+3` to `+7`, failed episodes `−2` to `−1`. Required for GRPO.
> **Reward design note:** Pure step-level rewards can teach a model to "look busy" — accumulating exploration rewards while never completing the task. The terminal outcome reward is designed to dominate the sum of all per-step rewards. The curriculum is the primary defense: Easy tasks have a trivially short optimal path (2 steps), so there's no room to accumulate fake exploration reward before the model learns that the terminal reward is what matters.
---

## Key Design Decisions

### Browser Agent as a Discovery Tool

The RL agent has access to a **browser agent tool** powered by [`bu-30b-a3b-preview`](https://huggingface.co/browser-use/bu-30b-a3b-preview) — a 30B MoE vision-language model (3B active parameters) served via the [browser-use](https://github.com/browser-use/browser-use) library on Playwright. When called, it completes the task in a real browser while intercepting all network traffic, then returns the filtered API call list.

**Training vs. inference:** The browser agent output is pre-computed and cached per task during training — the RL model receives it instantly; no live browser session runs. At inference, the browser agent runs live to handle novel tasks.

Full details: [`BROWSER_AGENT.md`](BROWSER_AGENT.md)
### Ground Truth from the Codebase, Not the Browser

The browser agent shows *what* API calls happen. It does not explain *why* — where each parameter comes from or what field constraints exist. That comes from a one-time static analysis of each WebArena application's Docker image source, producing a **ground truth API catalog**:

```
endpoint: POST /rest/V1/guest-carts/{cartId}/items
path_params:
  cartId: obtained from: POST /rest/V1/guest-carts → response body
body:
  cartItem.sku: the product's SKU, from: GET /rest/V1/products → items[].sku
  cartItem.qty: quantity, from: task specification
  cartItem.quote_id: same as cartId
```

The judge uses this to verify not just *what* the model called, but *where each parameter value came from*. Source types: `TASK_SPEC`, `PREV_CALL`, `AUTH_FLOW`, `STATIC`, `DERIVED`. This is how partial credit works — the model gets reward for correctly threading a `cart_id` even if the final call had a wrong field elsewhere.

Full extraction process: [`GROUND_TRUTH_EXTRACTION.md`](GROUND_TRUTH_EXTRACTION.md)
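A sourcing check of this kind reduces to asking where a value could have been observed. A simplified sketch (the precedence order is an assumption, `DERIVED` is omitted, and `classify_source` is a hypothetical name, not the judge's real API):

```python
# Decide which source type could explain a parameter value used in a call.
def classify_source(value, task_spec, prior_responses, session_values):
    if value in session_values:                          # came from an auth step
        return "AUTH_FLOW"
    if any(value in body for body in prior_responses):   # threaded from a prior call
        return "PREV_CALL"
    if value in task_spec:                               # stated in the task itself
        return "TASK_SPEC"
    return "STATIC"                                      # hard-coded / constant

src = classify_source(
    "cart-abc123",
    task_spec="Add Radiant Tee to a guest cart",
    prior_responses=['"cart-abc123"'],
    session_values=set(),
)
```

Here `cart-abc123` classifies as `PREV_CALL`, which is exactly the threading behavior the +0.25 per-step reward pays for.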
### HTML and Form-Based Applications

Not every endpoint returns JSON. The Forum (Postmill) relies on HTML form submissions with CSRF tokens; Wikipedia (Kiwix) serves static HTML pages. The agent handles both:

- **CSRF tokens**: The model GETs the form page, reads the `value` attribute of `input[name="_csrf_token"]` from the returned HTML, and places it in the subsequent POST. If the token is beyond the 3,000-character truncation point, it calls `search_episode_data("_csrf_token")` — the full HTML is indexed before truncation.
- **HTML-only responses**: Wikipedia responses are returned in the standard `{status_code, headers, body}` envelope. Search results appear in `<a href>` tags; article content in `<p>` tags.
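The token pickup is a plain string-pattern read, no HTML parser required. A regex sketch (simplified; it assumes the `name` attribute appears before `value`, which is how the hidden input is typically rendered):

```python
import re

# Read the value attribute of input[name="_csrf_token"] from raw HTML.
def extract_csrf(html: str):
    m = re.search(
        r'<input[^>]*name=["\']_csrf_token["\'][^>]*value=["\']([^"\']+)["\']',
        html,
    )
    return m.group(1) if m else None

html = '<form><input type="hidden" name="_csrf_token" value="abc123XYZ"></form>'
token = extract_csrf(html)  # "abc123XYZ"
```

The extracted token then goes into the POST body (e.g. `_csrf_token=abc123XYZ&title=...`), alongside the session cookie the environment captured at login.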
---

## Example Episode
```
TASK: Add "Radiant Tee" to a guest cart at http://ec2-16-59-2-56.../

STEP 1: browser_agent("Add Radiant Tee to a guest cart", "http://ec2-...:7770/")
        → {"endpoints": ["POST /rest/V1/guest-carts", "GET /rest/V1/products",
                         "POST /rest/V1/guest-carts/{id}/items", ...]}

STEP 2: search_endpoints("create guest cart")
        → ["endpoint: POST /rest/V1/guest-carts | auth: none | returns: string (cartId)"]

STEP 3: curl_exec("curl -X POST 'http://ec2-.../rest/V1/guest-carts' -H 'Content-Type: application/json'")
        → {status_code: 200, body: "cart-abc123"}

STEP 4: search_endpoints("find product by name get sku")
        → ["endpoint: GET /rest/V1/products | query: searchCriteria filters | returns: .items[].sku"]

STEP 5: curl_exec("curl 'http://ec2-.../rest/V1/products?searchCriteria[filter_groups][0][filters][0][field]=name&...[value]=Radiant+Tee'")
        → {status_code: 200, body: {"items":[{"sku":"MH01","name":"Radiant Tee","price":22.0}]}}

STEP 6: search_endpoints("add item to guest cart cartId")
        → ["endpoint: POST /rest/V1/guest-carts/{cartId}/items | path: cartId from POST /rest/V1/guest-carts | body: cartItem.sku, cartItem.qty, cartItem.quote_id"]

STEP 7: curl_exec("curl -X POST 'http://ec2-.../rest/V1/guest-carts/cart-abc123/items' -H 'Content-Type: application/json' -d '{\"cartItem\":{\"sku\":\"MH01\",\"qty\":1,\"quote_id\":\"cart-abc123\"}}'")
        → {status_code: 200, body: {"item_id": 5, "sku": "MH01", "qty": 1}}

STEP 8: done("Radiant Tee added to cart")
```

Values from prior responses (`cart-abc123`, `MH01`) are threaded directly from the growing episode history. If step 5 had returned 200 products truncated to 2, the model would call `search_episode_data("Radiant Tee sku")` to retrieve `MH01` from the episode index.
---

## Setup

### Prerequisites

- Docker installed and running
- Python 3.11+ with [`uv`](https://github.com/astral-sh/uv)
- A Hugging Face token with read access

### Local Development

```bash
# Clone and enter the project
git clone <your-hf-space-url>
cd HARvestGym

# Install dependencies
uv sync

# Validate the OpenEnv spec
openenv validate

# Build and run the Docker image
docker build -t harvgym .
docker run -p 8000:8000 harvgym

# Run the inference script
HF_TOKEN=hf_xxx uv run inference.py
```
### Environment Variables

| Variable       | Default                            | Required | Purpose                    |
| -------------- | ---------------------------------- | -------- | -------------------------- |
| `HF_TOKEN`     | —                                  | **Yes**  | HuggingFace auth token     |
| `API_BASE_URL` | `https://router.huggingface.co/v1` | No       | LLM API endpoint           |
| `MODEL_NAME`   | `google/gemma-4-31B-it`            | No       | Model for inference        |
| `HARVGYM_TASK` | `har_classify_easy`                | No       | Override which task to run |

### API Endpoints
```bash
# Reset episode
curl -X POST http://localhost:8000/reset

# Execute a step
curl -X POST http://localhost:8000/step \
  -H "Content-Type: application/json" \
  -d '{"tool": "browser_agent", "args": {"task": "...", "url": "..."}}'

# Get current state
curl http://localhost:8000/state
```
---

## Baseline Performance

Scores generated by running `uv run inference.py` with `google/gemma-4-31B-it` via the HuggingFace Router.

| Task                          | Difficulty | Score    | Steps | Result         | Notes                                |
| ----------------------------- | ---------- | -------- | ----- | -------------- | ------------------------------------ |
| `easy_list_pants`             | Easy       | **0.74** | 6     | PASS           | List products in 'Pants' category    |
| `medium_cart_camera_backpack` | Medium     | **0.56** | 20    | PASS           | Add Camera Backpack to guest cart    |
| `medium_cart_flannel_jacket`  | Medium     | **0.60** | 20    | PASS           | Add Flannel Jacket to guest cart     |
| `hard_checkout_ripstop_pants` | Hard       | **0.22** | 20    | FAIL           | Full guest checkout (hit step limit) |
| **Overall**                   | —          | **0.53** | —     | **3/4 passed** |                                      |

> **To regenerate:** `HF_TOKEN=hf_xxx uv run inference.py`
hars/forum.har
CHANGED

The diff for this file is too large to render. See raw diff.

hars/shopping.har
CHANGED

```diff
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:878c65126d999ef91d6b75438431f7c1b9164ac580140bd7ca61ef693cacd76c
+size 115555293
```

hars/shopping_admin.har
CHANGED

```diff
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:ce2209be9f3265b0a1682935171fb932c0056bc67b7517419b3ef5239c2ba2be
+size 148077790
```

hars/wikipedia.har
CHANGED

The diff for this file is too large to render. See raw diff.
inference.py
CHANGED

@@ -29,39 +29,65 @@ Usage:

```python
import asyncio
import json
import os
import sys
import textwrap
from typing import Any, List, Optional

from openai import OpenAI

# ---------------------------------------------------------------------------
# Configuration — auto-detect provider from env vars
# ---------------------------------------------------------------------------

_OPENROUTER_KEY = os.getenv("OPENROUTER_API_KEY", "")
_HF_TOKEN = os.getenv("HF_TOKEN", "")

if _OPENROUTER_KEY:
    # OpenRouter mode — ...
    API_BASE_URL = os.getenv("API_BASE_URL", "https://openrouter.ai/api/v1")
    API_KEY = _OPENROUTER_KEY
    MODEL_NAME = os.getenv("MODEL_NAME", "google/gemma-4-31b-it")
    HF_TOKEN = _HF_TOKEN  # still needed for the env server itself
    print(f"[INFO] Provider: OpenRouter | Model: {MODEL_NAME}", flush=True)
elif _HF_TOKEN:
    # HuggingFace Inference Router — final submission target
    API_BASE_URL = os.getenv("API_BASE_URL", "https://router.huggingface.co/v1")
    API_KEY = _HF_TOKEN
    MODEL_NAME = os.getenv("MODEL_NAME", "Qwen/Qwen2.5-72B-Instruct")
    print(f"[INFO] Provider: HuggingFace | Model: {MODEL_NAME}", flush=True)
else:
    raise ValueError(
        "No API key found. Set either:\n"
        "  OPENROUTER_API_KEY=sk-or-xxx  (for OpenRouter testing)\n"
        "  HF_TOKEN=hf_xxx               (for HuggingFace submission)"
    )

# ---------------------------------------------------------------------------
# Tool definitions — proper OpenAI function-calling format.
```

@@ -79,23 +105,22 @@ TOOLS = [

```python
        "function": {
            "name": "browser_agent",
            "description": (
                "Discovers ... "
                "Do NOT call again after step 1."
            ),
            "parameters": {
                "type": "object",
                "properties": {
                    "task": {
                        "type": "string",
                        "description": "The ...",
                    },
                    "url": {
                        "type": "string",
                        "description": "Base URL of the target application ...",
                    },
                },
                "required": ["task", "url"],
```

@@ -109,20 +134,19 @@ TOOLS = [

```python
        "function": {
            "name": "search_endpoints",
            "description": (
                "... Returns matching endpoint schemas ... "
                "Use this ... before making a curl_exec call. "
                "Examples: 'create guest cart', 'add item to cart', 'set shipping address', "
                "'place order', 'get products by category'."
            ),
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "Natural language description of the ...",
                    },
                },
                "required": ["query"],
```

@@ -136,14 +160,18 @@ TOOLS = [

```python
        "function": {
            "name": "curl_exec",
            "description": (
                "Execute an HTTP request against the live application. "
```
-
"
|
| 144 |
-
"
|
| 145 |
-
"
|
| 146 |
-
"
|
|
|
|
| 147 |
),
|
| 148 |
"parameters": {
|
| 149 |
"type": "object",
|
|
@@ -151,12 +179,10 @@ TOOLS = [
|
|
| 151 |
"command": {
|
| 152 |
"type": "string",
|
| 153 |
"description": (
|
| 154 |
-
"Full curl command string.
|
| 155 |
-
"
|
| 156 |
-
"
|
| 157 |
-
"
|
| 158 |
-
"-H 'Content-Type: application/json' "
|
| 159 |
-
"-d '{\"cartItem\":{\"sku\":\"MH01-XS-Black\",\"qty\":1,\"quote_id\":\"CART_ID\"}}'"
|
| 160 |
),
|
| 161 |
},
|
| 162 |
},
|
|
@@ -171,18 +197,21 @@ TOOLS = [
|
|
| 171 |
"function": {
|
| 172 |
"name": "search_episode_data",
|
| 173 |
"description": (
|
| 174 |
-
"
|
| 175 |
-
"
|
| 176 |
-
"
|
| 177 |
-
"
|
| 178 |
-
"
|
|
|
|
|
|
|
| 179 |
),
|
| 180 |
"parameters": {
|
| 181 |
"type": "object",
|
| 182 |
"properties": {
|
| 183 |
"query": {
|
| 184 |
"type": "string",
|
| 185 |
-
"description": "What
|
|
|
|
| 186 |
},
|
| 187 |
},
|
| 188 |
"required": ["query"],
|
|
@@ -196,58 +225,197 @@ TOOLS = [
|
|
| 196 |
"function": {
|
| 197 |
"name": "done",
|
| 198 |
"description": (
|
| 199 |
-
"Signal that the task is
|
| 200 |
-
"
|
| 201 |
-
"
|
| 202 |
-
"Do NOT call done() as a fallback or when uncertain — it triggers final scoring."
|
| 203 |
),
|
| 204 |
"parameters": {
|
| 205 |
"type": "object",
|
| 206 |
"properties": {
|
| 207 |
"result": {
|
| 208 |
"type": "string",
|
| 209 |
-
"description": "
|
| 210 |
},
|
| 211 |
},
|
| 212 |
"additionalProperties": False,
|
| 213 |
},
|
| 214 |
-
"strict": False,
|
| 215 |
},
|
| 216 |
},
|
| 217 |
]
|
| 218 |
|
| 219 |
BENCHMARK = "harvgym"
|
| 220 |
MAX_STEPS = 20
|
| 221 |
-
TEMPERATURE = 0.2
|
| 222 |
-
MAX_TOKENS =
|
| 223 |
SUCCESS_SCORE_THRESHOLD = 0.5
|
| 224 |
|
| 225 |
-
#
|
| 226 |
-
#
|
| 227 |
-
|
| 228 |
-
|
| 229 |
-
|
|
|
|
|
|
| 230 |
"template_id": 1,
|
| 231 |
-
"description": "List products in the 'Gear' category",
|
| 232 |
-
"app_base_url": "http://ec2-16-59-2-56.us-east-2.compute.amazonaws.com:7770/",
|
| 233 |
"difficulty": "easy",
|
| 234 |
-
|
| 235 |
-
|
| 236 |
-
"
|
| 237 |
-
|
| 238 |
-
|
| 239 |
-
|
| 240 |
-
|
| 241 |
-
|
| 242 |
-
{
|
| 243 |
-
"task_name":
|
| 244 |
-
"template_id":
|
| 245 |
-
"
|
| 246 |
-
"
|
| 247 |
-
"
|
| 248 |
-
|
|
|
|
| 249 |
]
|
| 250 |
|
|
|
| 251 |
# ---------------------------------------------------------------------------
|
| 252 |
# Logging helpers (hackathon format)
|
| 253 |
# ---------------------------------------------------------------------------
|
|
@@ -279,27 +447,35 @@ def log_end(success: bool, steps: int, score: float, rewards: List[float]) -> No
|
|
| 279 |
# ---------------------------------------------------------------------------
|
| 280 |
|
| 281 |
SYSTEM_PROMPT = textwrap.dedent("""
|
| 282 |
-
You are an API agent
|
| 283 |
-
by calling HTTP APIs in the correct
|
| 284 |
|
| 285 |
WORKFLOW:
|
| 286 |
-
1. Call browser_agent
|
| 287 |
-
2.
|
| 288 |
-
3.
|
| 289 |
-
|
| 290 |
-
|
| 291 |
-
|
| 292 |
-
|
| 293 |
-
|
| 294 |
-
|
| 295 |
-
-
|
| 296 |
-
-
|
| 297 |
-
|
| 298 |
-
|
| 299 |
-
-
|
| 300 |
-
-
|
| 301 |
-
|
| 302 |
-
|
|
|
|
| 303 |
""").strip()
|
| 304 |
|
| 305 |
|
|
@@ -334,12 +510,10 @@ def build_user_prompt(task_desc: str, app_base_url: str, step: int,
|
|
| 334 |
"""Build the user prompt for each step."""
|
| 335 |
history_lines = []
|
| 336 |
if history:
|
| 337 |
-
|
| 338 |
-
for h in history[-8:]:
|
| 339 |
result = h.get("result", {})
|
| 340 |
-
# For curl results: show status_code + first 200 chars of body
|
| 341 |
if isinstance(result, dict) and "status_code" in result:
|
| 342 |
-
body_preview = str(result.get("body", ""))[:
|
| 343 |
result_summary = f'status={result["status_code"]} body={body_preview}'
|
| 344 |
else:
|
| 345 |
result_summary = str(result)[:300]
|
|
@@ -351,18 +525,23 @@ def build_user_prompt(task_desc: str, app_base_url: str, step: int,
|
|
| 351 |
session_str = json.dumps(session_state, indent=2)[:500] if session_state else "{}"
|
| 352 |
last_result_str = _format_result_for_context(last_result)
|
| 353 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 354 |
return textwrap.dedent(f"""
|
| 355 |
TASK: {task_desc}
|
| 356 |
APP URL: {app_base_url}
|
| 357 |
STEP: {step}/{MAX_STEPS}
|
| 358 |
|
| 359 |
-
SESSION STATE (cookies/tokens auto-managed):
|
| 360 |
{session_str}
|
| 361 |
|
| 362 |
LAST TOOL RESULT:
|
| 363 |
{last_result_str}
|
| 364 |
|
| 365 |
-
HISTORY (
|
| 366 |
{chr(10).join(history_lines) if history_lines else " (none yet)"}
|
| 367 |
|
| 368 |
What is your next tool call? Output ONLY the JSON object.
|
|
@@ -384,51 +563,86 @@ def get_model_action(client: OpenAI, task_desc: str, app_base_url: str,
|
|
| 384 |
"X-Title": "HARvestGym",
|
| 385 |
}
|
| 386 |
|
| 387 |
-
|
| 388 |
-
|
| 389 |
-
|
| 390 |
-
|
| 391 |
-
model=MODEL_NAME,
|
| 392 |
-
messages=[
|
| 393 |
-
{"role": "system", "content": SYSTEM_PROMPT},
|
| 394 |
-
{"role": "user", "content": user_prompt},
|
| 395 |
-
],
|
| 396 |
-
tools=TOOLS,
|
| 397 |
-
tool_choice="required",
|
| 398 |
-
temperature=TEMPERATURE,
|
| 399 |
-
max_tokens=MAX_TOKENS,
|
| 400 |
-
stream=False,
|
| 401 |
-
extra_headers=extra_headers if extra_headers else None,
|
| 402 |
-
)
|
| 403 |
|
| 404 |
-
|
| 405 |
-
|
| 406 |
-
|
|
| 407 |
if step == 1:
|
| 408 |
return {"tool": "browser_agent", "args": {"task": task_desc, "url": app_base_url}}
|
| 409 |
-
return {"tool": "
|
| 410 |
-
|
| 411 |
-
# Native tool call response (preferred — gives us structured args directly)
|
| 412 |
-
if choice.message.tool_calls:
|
| 413 |
-
tc = choice.message.tool_calls[0]
|
| 414 |
-
tool_name = tc.function.name
|
| 415 |
-
try:
|
| 416 |
-
args = json.loads(tc.function.arguments)
|
| 417 |
-
except json.JSONDecodeError:
|
| 418 |
-
args = {}
|
| 419 |
-
print(f"[DEBUG] Tool call: {tool_name}({list(args.keys())})", flush=True)
|
| 420 |
-
return {"tool": tool_name, "args": args}
|
| 421 |
|
| 422 |
-
|
| 423 |
-
|
| 424 |
-
|
| 425 |
-
return
|
| 426 |
-
|
| 427 |
-
except Exception as exc:
|
| 428 |
-
print(f"[DEBUG] LLM call failed at step {step}: {exc}", flush=True)
|
| 429 |
-
if step == 1:
|
| 430 |
-
return {"tool": "browser_agent", "args": {"task": task_desc, "url": app_base_url}}
|
| 431 |
-
return {"tool": "done", "args": {"result": f"LLM error: {exc}"}}
|
| 432 |
|
| 433 |
|
| 434 |
def _parse_text_fallback(text: str, step: int, task_desc: str, app_base_url: str) -> dict:
|
|
@@ -451,9 +665,12 @@ def _parse_text_fallback(text: str, step: int, task_desc: str, app_base_url: str
|
|
| 451 |
print(f"[DEBUG] Text fallback failed: {text[:200]}", flush=True)
|
| 452 |
if step == 1:
|
| 453 |
return {"tool": "browser_agent", "args": {"task": task_desc, "url": app_base_url}}
|
| 454 |
-
|
|
|
|
|
|
|
| 455 |
return {"tool": "done", "args": {}}
|
| 456 |
-
|
|
|
|
| 457 |
|
| 458 |
|
| 459 |
# ---------------------------------------------------------------------------
|
|
@@ -471,9 +688,20 @@ async def run_episode(task_config: dict, client: OpenAI) -> dict:
|
|
| 471 |
template_id = task_config["template_id"]
|
| 472 |
task_description = task_config["description"]
|
| 473 |
app_base_url = task_config["app_base_url"]
|
| 474 |
-
|
| 475 |
-
|
| 476 |
-
|
|
|
| 477 |
|
| 478 |
env = HARvestGymEnvironment()
|
| 479 |
|
|
@@ -489,11 +717,15 @@ async def run_episode(task_config: dict, client: OpenAI) -> dict:
|
|
| 489 |
|
| 490 |
try:
|
| 491 |
obs = env.reset()
|
| 492 |
-
#
|
| 493 |
-
#
|
| 494 |
task_desc = obs.task or task_description
|
| 495 |
base_url = obs.app_base_url or app_base_url
|
| 496 |
|
|
|
| 497 |
for step in range(1, MAX_STEPS + 1):
|
| 498 |
if getattr(obs, "done", False):
|
| 499 |
break
|
|
@@ -523,6 +755,12 @@ async def run_episode(task_config: dict, client: OpenAI) -> dict:
|
|
| 523 |
last_result = obs.last_tool_result
|
| 524 |
session_state = dict(obs.session_state or {})
|
| 525 |
|
|
| 526 |
history.append({
|
| 527 |
"step": step,
|
| 528 |
"tool": tool,
|
|
@@ -534,6 +772,7 @@ async def run_episode(task_config: dict, client: OpenAI) -> dict:
|
|
| 534 |
reward = -0.1
|
| 535 |
done = False
|
| 536 |
error_str = str(exc)[:200]
|
|
|
|
| 537 |
|
| 538 |
rewards.append(reward)
|
| 539 |
steps_taken = step
|
|
@@ -546,19 +785,32 @@ async def run_episode(task_config: dict, client: OpenAI) -> dict:
|
|
| 546 |
# Reward range by design: terminal success = +2 to +5, terminal fail = -1.5
|
| 547 |
# Use a generous baseline so partial credit shows up.
|
| 548 |
total_reward = sum(rewards)
|
| 549 |
-
#
|
| 550 |
-
# Template 1 max=2, Template 3 max=3.5, Template 6 max=5
|
| 551 |
-
|
|
|
|
|
|
|
|
|
|
| 552 |
success = total_reward >= 0.5 # any positive terminal reward = success
|
| 553 |
|
|
|
|
|
|
|
|
|
|
|
|
|
| 554 |
except Exception as exc:
|
| 555 |
error_str = str(exc)[:200]
|
| 556 |
print(f"[DEBUG] Episode error: {error_str}", flush=True)
|
| 557 |
finally:
|
|
|
|
|
|
|
|
|
|
|
|
|
| 558 |
log_end(success=success, steps=steps_taken, score=score, rewards=rewards)
|
| 559 |
|
| 560 |
return {
|
| 561 |
"task_name": task_name,
|
|
|
|
|
|
|
| 562 |
"success": success,
|
| 563 |
"steps": steps_taken,
|
| 564 |
"score": score,
|
|
@@ -574,21 +826,38 @@ async def main() -> None:
|
|
| 574 |
client = OpenAI(base_url=API_BASE_URL, api_key=API_KEY)
|
| 575 |
|
| 576 |
results = []
|
| 577 |
-
for task_config in TASKS:
|
|
| 578 |
result = await run_episode(task_config, client)
|
| 579 |
results.append(result)
|
| 580 |
-
|
| 581 |
-
# Summary
|
| 582 |
-
print("\n[SUMMARY]", flush=True)
|
| 583 |
-
for r in results:
|
| 584 |
-
status = "PASS" if r["success"] else "FAIL"
|
| 585 |
print(
|
| 586 |
-
f" [{status}]
|
| 587 |
flush=True,
|
| 588 |
)
|
| 589 |
|
|
|
|
|
| 590 |
overall_score = sum(r["score"] for r in results) / len(results) if results else 0.0
|
| 591 |
-
print(f"\n
|
|
|
|
| 592 |
|
| 593 |
|
| 594 |
if __name__ == "__main__":
|
|
|
|
| 29 |
import asyncio
|
| 30 |
import json
|
| 31 |
import os
|
| 32 |
+
import re
|
| 33 |
import sys
|
| 34 |
import textwrap
|
| 35 |
+
from pathlib import Path
|
| 36 |
from typing import Any, List, Optional
|
| 37 |
|
| 38 |
from openai import OpenAI
|
| 39 |
|
| 40 |
+
# ---------------------------------------------------------------------------
|
| 41 |
+
# Verbose mode — set VERBOSE=1 for detailed per-step debugging.
|
| 42 |
+
# Keep disabled (default) for hackathon submission to avoid stdout noise.
|
| 43 |
+
# ---------------------------------------------------------------------------
|
| 44 |
+
|
| 45 |
+
VERBOSE = os.getenv("VERBOSE", "0").strip() == "1"
|
| 46 |
+
|
| 47 |
+
|
| 48 |
+
def vprint(*args) -> None:
|
| 49 |
+
"""Print only when VERBOSE=1."""
|
| 50 |
+
if VERBOSE:
|
| 51 |
+
print(*args, flush=True)
|
| 52 |
+
|
| 53 |
+
|
| 54 |
+
def vdump(label: str, obj: Any, max_chars: int = 2000) -> None:
|
| 55 |
+
"""Pretty-print a labelled object when verbose."""
|
| 56 |
+
if not VERBOSE:
|
| 57 |
+
return
|
| 58 |
+
try:
|
| 59 |
+
text = json.dumps(obj, indent=2)
|
| 60 |
+
except Exception:
|
| 61 |
+
text = str(obj)
|
| 62 |
+
if len(text) > max_chars:
|
| 63 |
+
text = text[:max_chars] + f"\n... [truncated {len(text)-max_chars} chars]"
|
| 64 |
+
print(f"\n{'─'*60}\n[VERBOSE] {label}\n{'─'*60}\n{text}\n", flush=True)
|
| 65 |
+
|
| 66 |
+
|
| 67 |
# ---------------------------------------------------------------------------
|
| 68 |
# Configuration — auto-detect provider from env vars
|
| 69 |
# ---------------------------------------------------------------------------
|
| 70 |
|
| 71 |
+
HF_TOKEN = os.getenv("HF_TOKEN")
|
| 72 |
+
if not HF_TOKEN:
|
| 73 |
+
raise ValueError(
|
| 74 |
+
"HF_TOKEN is required but not set.\n"
|
| 75 |
+
"Usage: HF_TOKEN=hf_xxx uv run inference.py"
|
| 76 |
+
)
|
| 77 |
|
| 78 |
+
_OPENROUTER_KEY = os.getenv("OPENROUTER_API_KEY")
|
| 79 |
if _OPENROUTER_KEY:
|
| 80 |
+
# OpenRouter mode — useful for local testing with alternative models
|
| 81 |
API_BASE_URL = os.getenv("API_BASE_URL", "https://openrouter.ai/api/v1")
|
| 82 |
API_KEY = _OPENROUTER_KEY
|
| 83 |
MODEL_NAME = os.getenv("MODEL_NAME", "google/gemma-4-31b-it")
|
|
|
|
| 84 |
print(f"[INFO] Provider: OpenRouter | Model: {MODEL_NAME}", flush=True)
|
| 85 |
+
else:
|
| 86 |
# HuggingFace Inference Router — final submission target
|
| 87 |
API_BASE_URL = os.getenv("API_BASE_URL", "https://router.huggingface.co/v1")
|
| 88 |
+
API_KEY = HF_TOKEN
|
| 89 |
+
MODEL_NAME = os.getenv("MODEL_NAME", "google/gemma-4-31b-it")
|
|
|
|
| 90 |
print(f"[INFO] Provider: HuggingFace | Model: {MODEL_NAME}", flush=True)
|
|
|
| 91 |
|
| 92 |
# ---------------------------------------------------------------------------
|
| 93 |
# Tool definitions — proper OpenAI function-calling format.
|
|
|
|
| 105 |
"function": {
|
| 106 |
"name": "browser_agent",
|
| 107 |
"description": (
|
| 108 |
+
"Discovers API endpoints available on the target web application by "
|
| 109 |
+
"replaying real browser traffic recorded in HAR files. Returns a "
|
| 110 |
+
"structured index of observed endpoints with HTTP methods, paths, "
|
| 111 |
+
"request/response schemas, and headers (including any auth headers seen). "
|
| 112 |
+
"Call this ONCE at step 1 to build the endpoint index. Do not call again."
|
|
|
|
| 113 |
),
|
| 114 |
"parameters": {
|
| 115 |
"type": "object",
|
| 116 |
"properties": {
|
| 117 |
"task": {
|
| 118 |
"type": "string",
|
| 119 |
+
"description": "The task you need to accomplish (used to prioritise relevant endpoints)",
|
| 120 |
},
|
| 121 |
"url": {
|
| 122 |
"type": "string",
|
| 123 |
+
"description": "Base URL of the target application",
|
| 124 |
},
|
| 125 |
},
|
| 126 |
"required": ["task", "url"],
|
|
|
|
| 134 |
"function": {
|
| 135 |
"name": "search_endpoints",
|
| 136 |
"description": (
|
| 137 |
+
"Semantic search over the endpoints and their details found by the browser_agent. "
|
| 138 |
+
"Returns matching endpoint schemas: HTTP method, full path, required parameters, "
|
| 139 |
+
"authentication requirements (bearer token, cookie, etc.), and example payloads. "
|
| 140 |
+
"Use this before every curl_exec call to confirm the correct endpoint shape. "
|
|
|
|
|
|
|
|
|
|
| 141 |
),
|
| 142 |
"parameters": {
|
| 143 |
"type": "object",
|
| 144 |
"properties": {
|
| 145 |
"query": {
|
| 146 |
"type": "string",
|
| 147 |
+
"description": "Natural language description of the operation you need "
|
| 148 |
+
"(e.g. 'authenticate user', 'list products in category', "
|
| 149 |
+
"'add item to cart', 'place order')",
|
| 150 |
},
|
| 151 |
},
|
| 152 |
"required": ["query"],
|
|
|
|
| 160 |
"function": {
|
| 161 |
"name": "curl_exec",
|
| 162 |
"description": (
|
| 163 |
+
"Execute an HTTP request against the live application and return the response. "
|
| 164 |
+
"Response contains: status_code, headers, body. "
|
| 165 |
+
"For HTML pages, body is a structured summary: page title, forms with action URLs "
|
| 166 |
+
"and field values (product IDs, form_key, etc.), and visible text. "
|
| 167 |
+
"IMPORTANT: When the body shows '[Forms — N found]' with POST actions containing "
|
| 168 |
+
"'/checkout/cart/add/...', the 'product' field IS the product ID and the action "
|
| 169 |
+
"URL IS the add-to-cart endpoint — use these directly without calling "
|
| 170 |
+
"search_episode_data again. "
|
| 171 |
+
"Session state (cookies, auth tokens) is automatically managed — previously "
|
| 172 |
+
"obtained tokens are injected into subsequent requests automatically. "
|
| 173 |
+
"If the response is truncated or you need a value from an earlier response, "
|
| 174 |
+
"use search_episode_data."
|
| 175 |
),
|
| 176 |
"parameters": {
|
| 177 |
"type": "object",
|
|
|
|
| 179 |
"command": {
|
| 180 |
"type": "string",
|
| 181 |
"description": (
|
| 182 |
+
"Full curl command string (use -s for silent mode). "
|
| 183 |
+
"Include -H 'Content-Type: application/json' for POST/PUT/PATCH. "
|
| 184 |
+
"Example: curl -s -X POST 'http://host/api/endpoint' "
|
| 185 |
+
"-H 'Content-Type: application/json' -d '{\"key\":\"value\"}'"
|
|
|
|
|
|
|
| 186 |
),
|
| 187 |
},
|
| 188 |
},
|
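The curl command shape described in the tool schema above can be sketched as a small builder. This is an illustrative helper, not part of the repo; the endpoint URL and cartItem payload are the example values from the schema:

```python
import json
import shlex

def build_curl_command(method, url, payload=None):
    # Assemble a curl string in the shape the curl_exec tool expects:
    # silent mode, explicit method, JSON content type, shell-quoted body.
    parts = ["curl", "-s", "-X", method, shlex.quote(url)]
    if payload is not None:
        parts += ["-H", shlex.quote("Content-Type: application/json")]
        parts += ["-d", shlex.quote(json.dumps(payload))]
    return " ".join(parts)

cmd = build_curl_command(
    "POST",
    "http://host/api/endpoint",
    {"cartItem": {"sku": "MH01-XS-Black", "qty": 1}},
)
```

`shlex.quote` keeps the URL bare (no unsafe characters) while single-quoting the header and JSON body, matching the example string in the schema.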
|
|
|
| 197 |
"function": {
|
| 198 |
"name": "search_episode_data",
|
| 199 |
"description": (
|
| 200 |
+
"Semantic search over all API responses collected during this episode. "
|
| 201 |
+
"Full response bodies are stored untruncated — this tool finds the right "
|
| 202 |
+
"response and returns a compact preview with a note showing the total "
|
| 203 |
+
"number of matching objects (e.g. '47 items total — showing first 3'). "
|
| 204 |
+
"Use more specific queries to drill into a particular value. "
|
| 205 |
+
"Examples: 'id for category Gear', 'SKU for Radiant Tee', "
|
| 206 |
+
"'cart id', 'authentication token', 'order id after checkout'."
|
| 207 |
),
|
| 208 |
"parameters": {
|
| 209 |
"type": "object",
|
| 210 |
"properties": {
|
| 211 |
"query": {
|
| 212 |
"type": "string",
|
| 213 |
+
"description": "What you are looking for in the responses of the curl commands executed so far "
|
| 214 |
+
"(e.g. 'category id for Pants', 'cart id', 'token')",
|
| 215 |
},
|
| 216 |
},
|
| 217 |
"required": ["query"],
|
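As a rough illustration of what searching stored episode responses does, here is a toy keyword-overlap ranker. The real tool presumably uses embeddings; the `{url, body}` record shape and sample data are assumptions for the sketch:

```python
def rank_episode_responses(store, query, top_k=3):
    # Score each stored {url, body} record by how many query terms it contains,
    # then return the highest-scoring records (non-matching ones are dropped).
    terms = set(query.lower().split())

    def score(rec):
        text = (rec.get("url", "") + " " + str(rec.get("body", ""))).lower()
        return sum(1 for t in terms if t in text)

    ranked = sorted(store, key=score, reverse=True)
    return [r for r in ranked if score(r) > 0][:top_k]

store = [
    {"url": "/rest/V1/categories", "body": {"name": "Gear", "id": 3}},
    {"url": "/rest/V1/guest-carts", "body": "abc123cartid"},
]
hits = rank_episode_responses(store, "id for category Gear")
```

With the query "id for category Gear" the categories response ranks first, mirroring the drill-down behaviour described in the tool description.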
|
|
|
| 225 |
"function": {
|
| 226 |
"name": "done",
|
| 227 |
"description": (
|
| 228 |
+
"Signal that the task is complete and trigger final scoring. "
|
| 229 |
+
"Call this immediately after the response that fulfills the task objective. "
|
| 230 |
+
"Do not make further API calls once the goal is met — call done() next."
|
|
|
|
| 231 |
),
|
| 232 |
"parameters": {
|
| 233 |
"type": "object",
|
| 234 |
"properties": {
|
| 235 |
"result": {
|
| 236 |
"type": "string",
|
| 237 |
+
"description": "Brief summary of what was accomplished",
|
| 238 |
},
|
| 239 |
},
|
| 240 |
"additionalProperties": False,
|
| 241 |
},
|
| 242 |
+
"strict": False,
|
| 243 |
},
|
| 244 |
},
|
| 245 |
]
|
| 246 |
|
| 247 |
BENCHMARK = "harvgym"
|
| 248 |
MAX_STEPS = 20
|
| 249 |
+
TEMPERATURE = 0.2
|
| 250 |
+
MAX_TOKENS = 64000
|
| 251 |
SUCCESS_SCORE_THRESHOLD = 0.5
|
| 252 |
|
| 253 |
+
# ---------------------------------------------------------------------------
|
| 254 |
+
# Task bank — 5 easy (T1: list products), 5 medium (T3: add to cart),
|
| 255 |
+
# 5 hard (T6: guest checkout).
|
| 256 |
+
#
|
| 257 |
+
# For hackathon submission only a small default subset is run.
|
| 258 |
+
# Full evaluation runs all 15 sequentially to measure generalisation.
|
| 259 |
+
# ---------------------------------------------------------------------------
|
| 260 |
+
_SHOP = "http://ec2-16-59-2-56.us-east-2.compute.amazonaws.com:7770/"
|
| 261 |
+
|
| 262 |
+
def _load_parameter_pools_for_tasks() -> dict:
|
| 263 |
+
pools_path = Path(__file__).with_name("parameter_pools.json")
|
| 264 |
+
with open(pools_path) as f:
|
| 265 |
+
return json.load(f)
|
| 266 |
+
|
| 267 |
+
|
| 268 |
+
_TASK_PARAMETER_POOLS = _load_parameter_pools_for_tasks()
|
| 269 |
+
|
| 270 |
+
|
| 271 |
+
def _lookup_category_params(category_name: str) -> dict:
|
| 272 |
+
categories = _TASK_PARAMETER_POOLS.get("template_1", {}).get("pool", {}).get("category_name", [])
|
| 273 |
+
for item in categories:
|
| 274 |
+
if item.get("name") == category_name:
|
| 275 |
+
return {
|
| 276 |
+
"category_name": item["name"],
|
| 277 |
+
"category_id": item.get("category_id"),
|
| 278 |
+
}
|
| 279 |
+
raise ValueError(f"Unknown category in parameter_pools.json: {category_name}")
|
| 280 |
+
|
| 281 |
+
|
| 282 |
+
def _lookup_product_params(product_name: str, template_id: int) -> dict:
|
| 283 |
+
products = _TASK_PARAMETER_POOLS.get(f"template_{template_id}", {}).get("pool", {}).get("product_name", [])
|
| 284 |
+
for item in products:
|
| 285 |
+
if item.get("name") == product_name:
|
| 286 |
+
return {
|
| 287 |
+
"product_name": item["name"],
|
| 288 |
+
"sku": item.get("sku", ""),
|
| 289 |
+
"product_id": item.get("product_id"),
|
| 290 |
+
}
|
| 291 |
+
raise ValueError(
|
| 292 |
+
f"Unknown product in parameter_pools.json for template {template_id}: {product_name}"
|
| 293 |
+
)
|
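The lookup helpers above follow a simple name-to-params pattern over `parameter_pools.json`. A self-contained miniature, assuming the same pool shape (the sample category IDs are made up for illustration):

```python
# Hypothetical miniature of parameter_pools.json — IDs are illustrative only.
SAMPLE_POOLS = {
    "template_1": {
        "pool": {
            "category_name": [
                {"name": "Pants", "category_id": 27},
                {"name": "Bags", "category_id": 14},
            ]
        }
    }
}

def lookup_category(pools, category_name):
    # Linear scan over the pool, raising on an unknown name
    # (same contract as _lookup_category_params).
    for item in pools["template_1"]["pool"]["category_name"]:
        if item["name"] == category_name:
            return {"category_name": item["name"], "category_id": item["category_id"]}
    raise ValueError(f"Unknown category: {category_name}")

params = lookup_category(SAMPLE_POOLS, "Pants")
```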
| 294 |
+
|
| 295 |
+
|
| 296 |
+
def _make_easy_task(task_name: str, category_name: str) -> dict:
|
| 297 |
+
return {
|
| 298 |
+
"task_name": task_name,
|
| 299 |
"template_id": 1,
|
|
|
|
|
|
|
| 300 |
"difficulty": "easy",
|
| 301 |
+
"description": f"List products in the '{category_name}' category",
|
| 302 |
+
"app_base_url": _SHOP,
|
| 303 |
+
"task_params": _lookup_category_params(category_name),
|
| 304 |
+
}
|
| 305 |
+
|
| 306 |
+
|
| 307 |
+
def _make_product_task(task_name: str, template_id: int, difficulty: str,
|
| 308 |
+
description: str, product_name: str) -> dict:
|
| 309 |
+
return {
|
| 310 |
+
"task_name": task_name,
|
| 311 |
+
"template_id": template_id,
|
| 312 |
+
"difficulty": difficulty,
|
| 313 |
+
"description": description,
|
| 314 |
+
"app_base_url": _SHOP,
|
| 315 |
+
"task_params": _lookup_product_params(product_name, template_id),
|
| 316 |
+
}
|
| 317 |
+
|
| 318 |
+
|
| 319 |
+
TASKS_EASY = [
|
| 320 |
+
_make_easy_task("easy_list_pants", "Pants"),
|
| 321 |
+
_make_easy_task("easy_list_bags", "Bags"),
|
| 322 |
+
_make_easy_task("easy_list_jackets", "Jackets"),
|
| 323 |
+
_make_easy_task("easy_list_hoodies", "Hoodies"),
|
| 324 |
+
_make_easy_task("easy_list_shoes", "Shoes"),
|
| 325 |
]
|
| 326 |
|
| 327 |
+
TASKS_MEDIUM = [
|
| 328 |
+
_make_product_task(
|
| 329 |
+
"medium_cart_camera_backpack",
|
| 330 |
+
3,
|
| 331 |
+
"medium",
|
| 332 |
+
"Add 'Camera Backpack Bagsmar DSLR Waterproof' to a guest cart",
|
| 333 |
+
"Camera Backpack Bagsmar DSLR Waterproof",
|
| 334 |
+
),
|
| 335 |
+
_make_product_task(
|
| 336 |
+
"medium_cart_flannel_jacket",
|
| 337 |
+
3,
|
| 338 |
+
"medium",
|
| 339 |
+
"Add 'Noldares Flannel Jacket For Men Plaid' to a guest cart",
|
| 340 |
+
"Noldares Flannel Jacket For Men Plaid",
|
| 341 |
+
),
|
| 342 |
+
_make_product_task(
|
| 343 |
+
"medium_cart_champion_hoodie",
|
| 344 |
+
3,
|
| 345 |
+
"medium",
|
| 346 |
+
"Add 'Champion Hoodie Big And Tall Zip Up' to a guest cart",
|
| 347 |
+
"Champion Hoodie Big And Tall Zip Up",
|
| 348 |
+
),
|
| 349 |
+
_make_product_task(
|
| 350 |
+
"medium_cart_cargo_pants",
|
| 351 |
+
3,
|
| 352 |
+
"medium",
|
| 353 |
+
"Add 'Mens Slim Fit Cargo Pants Athletic' to a guest cart",
|
| 354 |
+
"Mens Slim Fit Cargo Pants Athletic",
|
| 355 |
+
),
|
| 356 |
+
_make_product_task(
|
| 357 |
+
"medium_cart_leather_jacket",
|
| 358 |
+
3,
|
| 359 |
+
"medium",
|
| 360 |
+
"Add 'Inesver Womens Leather Jacket Open Front' to a guest cart",
|
| 361 |
+
"Inesver Womens Leather Jacket Open Front",
|
| 362 |
+
),
|
| 363 |
+
]
|
| 364 |
+
|
| 365 |
+
TASKS_HARD = [
|
| 366 |
+
_make_product_task(
|
| 367 |
+
"hard_checkout_ripstop_pants",
|
| 368 |
+
6,
|
| 369 |
+
"hard",
|
| 370 |
+
"Complete a full guest checkout for 'Mens Ripstop Cargo Pants Tactical Hiking'",
|
| 371 |
+
"Mens Ripstop Cargo Pants Tactical Hiking",
|
| 372 |
+
),
|
| 373 |
+
_make_product_task(
|
| 374 |
+
"hard_checkout_flannel_jacket",
|
| 375 |
+
6,
|
| 376 |
+
"hard",
|
| 377 |
+
"Complete a full guest checkout for 'Noldares Flannel Jacket For Men Plaid'",
|
| 378 |
+
"Noldares Flannel Jacket For Men Plaid",
|
| 379 |
+
),
|
| 380 |
+
_make_product_task(
|
| 381 |
+
"hard_checkout_champion_hoodie",
|
| 382 |
+
6,
|
| 383 |
+
"hard",
|
| 384 |
+
"Complete a full guest checkout for 'Champion Hoodie Big And Tall Zip Up'",
|
| 385 |
+
"Champion Hoodie Big And Tall Zip Up",
|
| 386 |
+
),
|
| 387 |
+
_make_product_task(
|
| 388 |
+
"hard_checkout_fleece_jacket",
|
| 389 |
+
6,
|
| 390 |
+
"hard",
|
| 391 |
+
"Complete a full guest checkout for 'Womens Fleece Jacket With Hood Winter'",
|
| 392 |
+
"Womens Fleece Jacket With Hood Winter",
|
| 393 |
+
),
|
| 394 |
+
_make_product_task(
|
| 395 |
+
"hard_checkout_totes_boots",
|
| 396 |
+
6,
|
| 397 |
+
"hard",
|
| 398 |
+
"Complete a full guest checkout for 'Totes Womens Cold Weather Boots Nicole'",
|
| 399 |
+
"Totes Womens Cold Weather Boots Nicole",
|
| 400 |
+
),
|
| 401 |
+
]
|
| 402 |
+
|
| 403 |
+
# Default: a small subset across tiers (hackathon submission format)
|
| 404 |
+
TASKS = [TASKS_EASY[0], TASKS_MEDIUM[0], TASKS_MEDIUM[1], TASKS_HARD[0]]
|
| 405 |
+
|
| 406 |
+
# Set EVAL_MODE=full to run all 15; EVAL_MODE=easy/medium/hard to run only that tier
|
| 407 |
+
_EVAL_MODE = os.getenv("EVAL_MODE", "").strip().lower()
|
| 408 |
+
if _EVAL_MODE == "full":
|
| 409 |
+
TASKS = TASKS_EASY + TASKS_MEDIUM + TASKS_HARD
|
| 410 |
+
elif _EVAL_MODE == "easy":
|
| 411 |
+
TASKS = TASKS_EASY
|
| 412 |
+
elif _EVAL_MODE == "one":
|
| 413 |
+
TASKS = [TASKS_MEDIUM[1]]
|
| 414 |
+
elif _EVAL_MODE == "medium":
|
| 415 |
+
TASKS = TASKS_MEDIUM
|
| 416 |
+
elif _EVAL_MODE == "hard":
|
| 417 |
+
TASKS = TASKS_HARD
|
| 418 |
+
|
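The EVAL_MODE chain above can equivalently be expressed as a table lookup. A sketch with placeholder task lists (the helper name and lists are illustrative, not from the repo):

```python
import os

def select_tasks(eval_mode, easy, medium, hard):
    # Same dispatch as the if/elif chain: an unknown or empty mode
    # falls back to the default hackathon subset.
    modes = {
        "full": easy + medium + hard,
        "easy": easy,
        "medium": medium,
        "hard": hard,
        "one": [medium[1]],
    }
    default = [easy[0], medium[0], medium[1], hard[0]]
    return modes.get(eval_mode.strip().lower(), default)

tasks = select_tasks(os.getenv("EVAL_MODE", ""), ["e1"], ["m1", "m2"], ["h1"])
```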
| 419 |
# ---------------------------------------------------------------------------
|
| 420 |
# Logging helpers (hackathon format)
|
| 421 |
# ---------------------------------------------------------------------------
|
|
|
|
| 447 |
# ---------------------------------------------------------------------------
|
| 448 |
|
| 449 |
SYSTEM_PROMPT = textwrap.dedent("""
|
| 450 |
+
You are an API agent. Your goal is to complete a real-world task on a live web application
|
| 451 |
+
by calling its HTTP APIs in the correct order using the tools provided.
|
| 452 |
|
| 453 |
WORKFLOW:
|
| 454 |
+
1. Call browser_agent once at step 1 to build an index of the application's endpoints.
|
| 455 |
+
2. Use search_endpoints before each API call to find the correct path, method, and required parameters.
|
| 456 |
+
3. Execute HTTP requests with curl_exec in the correct dependency order. Read every response
|
| 457 |
+
carefully — IDs, tokens, and error messages in responses are required inputs for (or
|
| 458 |
+
corrective signals for) subsequent calls.
|
| 459 |
+
4. If a prior response contains a value you need now, use search_episode_data to retrieve it.
|
| 460 |
+
5. Call done() as soon as the task objective is met.
|
| 461 |
+
|
| 462 |
+
PRINCIPLES:
|
| 463 |
+
- Always discover before you act: browser_agent first, then search_endpoints.
|
| 464 |
+
- Extract every ID, token, and key from API responses and use them in subsequent calls.
|
| 465 |
+
- If a request returns an auth error, find and call the auth endpoint first, then retry.
|
| 466 |
+
- Never fabricate IDs or values — they must come from actual API responses.
|
| 467 |
+
- Once the task is done, call done() immediately — do not make additional calls.
|
| 468 |
+
- Some tasks require a sequence of dependent API calls where the output of one call
|
| 469 |
+
(an ID, token, or key) is the required input to the next. Identify these dependencies
|
| 470 |
+
before acting: plan the call sequence, then execute step by step.
|
| 471 |
+
- Never call the same endpoint repeatedly hoping for a different result. If a call already
|
| 472 |
+
succeeded, move on to the next step. Repeating the same call wastes steps and incurs a
|
| 473 |
+
penalty.
|
| 474 |
+
- Do not brute-force or vary parameters at random. If a call fails, read the error message
|
| 475 |
+
in LAST TOOL RESULT, diagnose the cause logically, and use that understanding to form the
|
| 476 |
+
correct next request.
|
| 477 |
+
- If you are partway through a multi-step task and a required ID or token is missing, use
|
| 478 |
+
search_episode_data to retrieve it from an earlier response before making a new call.
|
| 479 |
""").strip()
|
| 480 |
|
| 481 |
|
|
|
|
      """Build the user prompt for each step."""
      history_lines = []
      if history:
+         for h in history:
              result = h.get("result", {})
              if isinstance(result, dict) and "status_code" in result:
+                 body_preview = str(result.get("body", ""))[:800]
                  result_summary = f'status={result["status_code"]} body={body_preview}'
              else:
                  result_summary = str(result)[:300]

      session_str = json.dumps(session_state, indent=2)[:500] if session_state else "{}"
      last_result_str = _format_result_for_context(last_result)

+     # Highlight form_key if available — it's needed for HTML form POSTs
+     form_key_hint = ""
+     if session_state.get("form_key"):
+         form_key_hint = f"\nFORM_KEY (auto-extracted, use in POST body): {session_state['form_key']}"
+
      return textwrap.dedent(f"""
      TASK: {task_desc}
      APP URL: {app_base_url}
      STEP: {step}/{MAX_STEPS}

+     SESSION STATE (cookies/tokens auto-managed):{form_key_hint}
      {session_str}

      LAST TOOL RESULT:
      {last_result_str}

+     HISTORY (all {len(history_lines)} steps so far):
      {chr(10).join(history_lines) if history_lines else "  (none yet)"}

      What is your next tool call? Output ONLY the JSON object.
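The FORM_KEY hint added above exists because Magento-style HTML form POSTs reject bodies that lack the session's form_key. A minimal sketch of what a well-formed POST body looks like; the field names and the product id are illustrative assumptions, not values taken from this diff.

```python
import urllib.parse

# Hypothetical form_key, standing in for session_state["form_key"].
form_key = "AbC123XyZ"

# Example add-to-cart body; "product" and "qty" are illustrative field names.
body = urllib.parse.urlencode({"product": 89940, "qty": 1, "form_key": form_key})
print(body)  # product=89940&qty=1&form_key=AbC123XyZ
```

Without the `form_key` pair, such a POST would typically be redirected or rejected, which is why the prompt surfaces the value explicitly.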
          "X-Title": "HARvestGym",
      }

+     vprint(f"\n{'═'*60}")
+     vprint(f"[VERBOSE] === LLM CALL — step {step} ===")
+     vdump("SYSTEM PROMPT", SYSTEM_PROMPT)
+     vdump("USER PROMPT", user_prompt)

+     # Retry loop — backs off on 429 rate limits, never calls done() on a transient error
+     _MAX_RETRIES = 3
+     _BASE_DELAY = 3  # seconds before first retry
+     for _attempt in range(_MAX_RETRIES):
+         try:
+             completion = client.chat.completions.create(
+                 model=MODEL_NAME,
+                 messages=[
+                     {"role": "system", "content": SYSTEM_PROMPT},
+                     {"role": "user", "content": user_prompt},
+                 ],
+                 tools=TOOLS,
+                 tool_choice="required",
+                 temperature=TEMPERATURE,
+                 max_tokens=MAX_TOKENS,
+                 stream=False,
+                 extra_headers=extra_headers if extra_headers else None,
+             )
+
+             choice = completion.choices[0] if completion.choices else None
+
+             vdump(f"RAW COMPLETION (step {step}, attempt {_attempt+1})", {
+                 "finish_reason": choice.finish_reason if choice else None,
+                 "usage": dict(completion.usage) if hasattr(completion, "usage") and completion.usage else None,
+                 "message_content": choice.message.content if choice else None,
+                 "tool_calls_count": len(choice.message.tool_calls or []) if choice else 0,
+             })
+
+             # Detect null/empty completion (upstream rate limit without a 429 status)
+             if choice is None or (
+                 choice.finish_reason is None
+                 and not (choice.message.tool_calls or (choice.message.content or "").strip())
+             ):
+                 wait = _BASE_DELAY * (2 ** _attempt)
+                 print(f"[DEBUG] Null completion at step {step} (attempt {_attempt+1}/{_MAX_RETRIES}) — waiting {wait}s", flush=True)
+                 import time; time.sleep(wait)
+                 continue  # retry
+
+             # Native tool call (preferred)
+             if choice.message.tool_calls:
+                 tc = choice.message.tool_calls[0]
+                 tool_name = tc.function.name
+                 try:
+                     args = json.loads(tc.function.arguments)
+                 except json.JSONDecodeError:
+                     args = {}
+                 print(f"[DEBUG] Tool call: {tool_name}({list(args.keys())})", flush=True)
+                 vdump(f"TOOL CALL ARGS — {tool_name}", args)
+                 return {"tool": tool_name, "args": args}
+
+             # Plain-text fallback (some providers ignore tool_choice="required")
+             text = (choice.message.content or "").strip()
+             print(f"[DEBUG] No tool_calls in response, trying text parse: {text[:100]}", flush=True)
+             vprint(f"[VERBOSE] Full text response: {text}")
+             return _parse_text_fallback(text, step, task_desc, app_base_url)
+
+         except Exception as exc:
+             exc_str = str(exc)
+             is_rate_limit = "429" in exc_str or "rate" in exc_str.lower() or "Rate" in exc_str
+             if is_rate_limit and _attempt < _MAX_RETRIES - 1:
+                 wait = _BASE_DELAY * (2 ** _attempt)
+                 print(f"[DEBUG] Rate-limited at step {step} (attempt {_attempt+1}/{_MAX_RETRIES}) — waiting {wait}s then retrying", flush=True)
+                 import time; time.sleep(wait)
+                 continue  # retry
+             # Non-rate-limit error or exhausted retries — don't call done(), keep episode alive
+             print(f"[DEBUG] LLM call failed at step {step} (attempt {_attempt+1}): {exc}", flush=True)
              if step == 1:
                  return {"tool": "browser_agent", "args": {"task": task_desc, "url": app_base_url}}
+             return {"tool": "search_endpoints", "args": {"query": "available API endpoints"}}

+     # Exhausted all retries — nudge forward without ending the episode
+     print(f"[DEBUG] All {_MAX_RETRIES} retries exhausted at step {step} — nudging with search_endpoints", flush=True)
+     if step == 1:
+         return {"tool": "browser_agent", "args": {"task": task_desc, "url": app_base_url}}
+     return {"tool": "search_endpoints", "args": {"query": "available API endpoints"}}


  def _parse_text_fallback(text: str, step: int, task_desc: str, app_base_url: str) -> dict:

      print(f"[DEBUG] Text fallback failed: {text[:200]}", flush=True)
      if step == 1:
          return {"tool": "browser_agent", "args": {"task": task_desc, "url": app_base_url}}
+     # If the model explicitly says done, honour it — but only if text clearly indicates it.
+     # A bare parse error should NEVER call done() because that would trigger the judge early.
+     if re.search(r"\bdone\b", text.lower()) and len(text.strip()) < 80:
          return {"tool": "done", "args": {}}
+     # Keep episode alive — nudge the model rather than punishing with a premature judge call.
+     return {"tool": "search_endpoints", "args": {"query": "available REST API endpoints"}}


  # ---------------------------------------------------------------------------
      template_id = task_config["template_id"]
      task_description = task_config["description"]
      app_base_url = task_config["app_base_url"]
+     task_params = dict(task_config.get("task_params") or {})
+
+     # Pin the exact task so env.reset() uses the intended category/product instead
+     # of sampling a random item from the template pool.
+     os.environ["HARVGYM_TASK"] = str(template_id)
+     os.environ["HARVGYM_TASK_SPEC_JSON"] = json.dumps(
+         {
+             "template_id": template_id,
+             "description": task_description,
+             "params": task_params,
+             "base_url": app_base_url,
+             "difficulty": task_config.get("difficulty", ""),
+         }
+     )

      env = HARvestGymEnvironment()

      try:
          obs = env.reset()
+         # Use the env-provided task description, which now matches the exact task spec
+         # passed in above.
          task_desc = obs.task or task_description
          base_url = obs.app_base_url or app_base_url

+         vprint(f"\n{'═'*60}")
+         vprint(f"[VERBOSE] EPISODE START — {task_name}")
+         vdump("INITIAL OBSERVATION (from env.reset)", obs.__dict__ if hasattr(obs, "__dict__") else str(obs))
+
          for step in range(1, MAX_STEPS + 1):
              if getattr(obs, "done", False):
                  break
              last_result = obs.last_tool_result
              session_state = dict(obs.session_state or {})

+             vprint(f"\n[VERBOSE] ── step {step} result ──")
+             vdump(f"TOOL RESULT — {tool}", last_result)
+             vprint(f"[VERBOSE] reward={reward:.3f} done={done}")
+             if done:
+                 vdump("FINAL OBS (done=True)", obs.__dict__ if hasattr(obs, "__dict__") else str(obs))
+
              history.append({
                  "step": step,
                  "tool": tool,

              reward = -0.1
              done = False
              error_str = str(exc)[:200]
+             vprint(f"[VERBOSE] Step {step} EXCEPTION: {exc}")

          rewards.append(reward)
          steps_taken = step
          # Reward range by design: terminal success = +2 to +5, terminal fail = -1.5
          # Use a generous baseline so partial credit shows up.
          total_reward = sum(rewards)
+         # Score: normalize to [0, 1] using per-template terminal-reward ceiling.
+         # Template 1 (easy) max=2.0, Template 3 (medium) max=3.5, Template 6 (hard) max=5.0.
+         # Shift by +1.5 so that the fail reward (-1.5) maps to 0 and max maps to 1.
+         _TEMPLATE_REWARD_CEIL = {1: 2.0, 3: 3.5, 6: 5.0}
+         _reward_ceil = _TEMPLATE_REWARD_CEIL.get(task_config.get("template_id"), 5.0)
+         score = max(0.0, min(1.0, (total_reward + 1.5) / (_reward_ceil + 1.5)))
          success = total_reward >= 0.5  # any positive terminal reward = success

+         vprint(f"\n[VERBOSE] ── episode end — {task_name} ──")
+         vprint(f"[VERBOSE] total_reward={total_reward:.3f} score={score:.3f} success={success}")
+         vprint(f"[VERBOSE] rewards per step: {[f'{r:.2f}' for r in rewards]}")
+
      except Exception as exc:
          error_str = str(exc)[:200]
          print(f"[DEBUG] Episode error: {error_str}", flush=True)
      finally:
+         try:
+             env.close()
+         except Exception as e:
+             print(f"[DEBUG] env.close() error: {e}", flush=True)
          log_end(success=success, steps=steps_taken, score=score, rewards=rewards)

      return {
          "task_name": task_name,
+         "difficulty": task_config.get("difficulty", "unknown"),
+         "description": task_config.get("description", ""),
          "success": success,
          "steps": steps_taken,
          "score": score,
      client = OpenAI(base_url=API_BASE_URL, api_key=API_KEY)

      results = []
+     for i, task_config in enumerate(TASKS, 1):
+         difficulty = task_config.get("difficulty", "")
+         desc = task_config.get("description", "")
+         print(
+             f"\n{'='*70}\n[TASK {i}/{len(TASKS)}] ({difficulty.upper()}) {desc}\n{'='*70}",
+             flush=True,
+         )
          result = await run_episode(task_config, client)
          results.append(result)
+         status = "PASS" if result["success"] else "FAIL"
          print(
+             f"  → [{status}] score={result['score']:.2f} steps={result['steps']}",
              flush=True,
          )

+     # Summary grouped by difficulty tier
+     print("\n" + "="*70, flush=True)
+     print("[SUMMARY]", flush=True)
+     for tier in ["easy", "medium", "hard"]:
+         tier_results = [r for r in results if r.get("difficulty") == tier]
+         if not tier_results:
+             continue
+         avg = sum(r["score"] for r in tier_results) / len(tier_results)
+         passes = sum(1 for r in tier_results if r["success"])
+         print(f"\n  {tier.upper()} ({passes}/{len(tier_results)} passed, avg score={avg:.2f}):", flush=True)
+         for r in tier_results:
+             status = "PASS" if r["success"] else "FAIL"
+             print(f"    [{status}] {r['task_name']} — score={r['score']:.2f} steps={r['steps']}", flush=True)
+
      overall_score = sum(r["score"] for r in results) / len(results) if results else 0.0
+     print(f"\n  OVERALL score={overall_score:.2f} ({sum(1 for r in results if r['success'])}/{len(results)} passed)",
+           flush=True)


  if __name__ == "__main__":
openenv_harvestgym.egg-info/PKG-INFO
CHANGED

@@ -11,6 +11,8 @@ Requires-Dist: requests>=2.31.0
 Requires-Dist: rank-bm25>=0.2.2
 Requires-Dist: openai>=1.0.0
 Requires-Dist: numpy>=1.24.0
+Requires-Dist: beautifulsoup4>=4.14.3
+Requires-Dist: lxml>=6.0.2
 Provides-Extra: dev
 Requires-Dist: pytest>=8.0.0; extra == "dev"
 Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
openenv_harvestgym.egg-info/SOURCES.txt
CHANGED

@@ -14,6 +14,7 @@ server/models.py
 server/tools/__init__.py
 server/tools/browser_agent.py
 server/tools/curl_exec.py
+server/tools/embed_cache.py
 server/tools/search_endpoints.py
 server/tools/search_episode_data.py
 tests/test_e2e_episode.py
openenv_harvestgym.egg-info/requires.txt
CHANGED

@@ -6,6 +6,8 @@ requests>=2.31.0
 rank-bm25>=0.2.2
 openai>=1.0.0
 numpy>=1.24.0
+beautifulsoup4>=4.14.3
+lxml>=6.0.2

 [dev]
 pytest>=8.0.0
parameter_pools.json
CHANGED
|
@@ -4,7 +4,7 @@
|
|
| 4 |
"generated_at": "2026-04-08",
|
| 5 |
"source": {
|
| 6 |
"categories": "GET /rest/V1/categories/list (live EC2, port 7780)",
|
| 7 |
-
"products": "
|
| 8 |
"forums": "HTML scrape of /forums page (live EC2, port 9999) + HTTP 200 verification per slug",
|
| 9 |
"wikipedia": "Well-known Wikipedia titles \u2014 verified by grader at runtime via HEAD /wikipedia_en.../A/{slug}",
|
| 10 |
"admin_skus": "Generated (HAR-TEST-NNN namespace, no collision with existing catalog)",
|
|
@@ -13,10 +13,10 @@
|
|
| 13 |
"grader_matching_notes": {
|
| 14 |
"template_1": "category_id stored for grader; category_name is what appears in task string",
|
| 15 |
"template_2": "expected_slug stored for grader (verifies HTTP 200); display title is in task string",
|
| 16 |
-
"template_3": "
|
| 17 |
"template_4": "forum_name must exist and return posts; no exact value matching needed",
|
| 18 |
"template_5": "title is free-form generated; grader only checks post was created in that forum",
|
| 19 |
-
"template_6": "
|
| 20 |
"template_7": "sku+price are exact \u2014 grader calls GET /rest/V1/products/{sku} to verify creation"
|
| 21 |
}
|
| 22 |
},
|
|
@@ -29,149 +29,37 @@
|
|
| 29 |
],
|
| 30 |
"pool": {
|
| 31 |
"category_name": [
|
| 32 |
-
{
|
| 33 |
-
"name": "Gear",
|
| 34 |
-
"category_id": 3
|
| 35 |
-
},
|
| 36 |
{
|
| 37 |
"name": "Bags",
|
| 38 |
"category_id": 4
|
| 39 |
},
|
| 40 |
{
|
| 41 |
-
"name": "
|
| 42 |
-
"category_id":
|
| 43 |
-
},
|
| 44 |
-
{
|
| 45 |
-
"name": "Watches",
|
| 46 |
-
"category_id": 6
|
| 47 |
-
},
|
| 48 |
-
{
|
| 49 |
-
"name": "New Luma Yoga Collection",
|
| 50 |
-
"category_id": 8
|
| 51 |
-
},
|
| 52 |
-
{
|
| 53 |
-
"name": "Training",
|
| 54 |
-
"category_id": 9
|
| 55 |
-
},
|
| 56 |
-
{
|
| 57 |
-
"name": "Video Download",
|
| 58 |
-
"category_id": 10
|
| 59 |
-
},
|
| 60 |
-
{
|
| 61 |
-
"name": "Men",
|
| 62 |
-
"category_id": 11
|
| 63 |
-
},
|
| 64 |
-
{
|
| 65 |
-
"name": "Tops",
|
| 66 |
-
"category_id": 12
|
| 67 |
-
},
|
| 68 |
-
{
|
| 69 |
-
"name": "Bottoms",
|
| 70 |
-
"category_id": 13
|
| 71 |
-
},
|
| 72 |
-
{
|
| 73 |
-
"name": "Jackets",
|
| 74 |
-
"category_id": 14
|
| 75 |
-
},
|
| 76 |
-
{
|
| 77 |
-
"name": "Hoodies & Sweatshirts",
|
| 78 |
-
"category_id": 15
|
| 79 |
-
},
|
| 80 |
-
{
|
| 81 |
-
"name": "Tees",
|
| 82 |
-
"category_id": 16
|
| 83 |
-
},
|
| 84 |
-
{
|
| 85 |
-
"name": "Tanks",
|
| 86 |
-
"category_id": 17
|
| 87 |
-
},
|
| 88 |
-
{
|
| 89 |
-
"name": "Pants",
|
| 90 |
-
"category_id": 18
|
| 91 |
-
},
|
| 92 |
-
{
|
| 93 |
-
"name": "Shorts",
|
| 94 |
-
"category_id": 19
|
| 95 |
-
},
|
| 96 |
-
{
|
| 97 |
-
"name": "Women",
|
| 98 |
-
"category_id": 20
|
| 99 |
-
},
|
| 100 |
-
{
|
| 101 |
-
"name": "Tops",
|
| 102 |
-
"category_id": 21
|
| 103 |
-
},
|
| 104 |
-
{
|
| 105 |
-
"name": "Bottoms",
|
| 106 |
-
"category_id": 22
|
| 107 |
},
|
| 108 |
{
|
| 109 |
"name": "Jackets",
|
| 110 |
-
"category_id":
|
| 111 |
-
},
|
| 112 |
-
{
|
| 113 |
-
"name": "Hoodies & Sweatshirts",
|
| 114 |
-
"category_id": 24
|
| 115 |
-
},
|
| 116 |
-
{
|
| 117 |
-
"name": "Tees",
|
| 118 |
-
"category_id": 25
|
| 119 |
-
},
|
| 120 |
-
{
|
| 121 |
-
"name": "Bras & Tanks",
|
| 122 |
-
"category_id": 26
|
| 123 |
-
},
|
| 124 |
-
{
|
| 125 |
-
"name": "Pants",
|
| 126 |
-
"category_id": 27
|
| 127 |
-
},
|
| 128 |
-
{
|
| 129 |
-
"name": "Shorts",
|
| 130 |
-
"category_id": 28
|
| 131 |
-
},
|
| 132 |
-
{
|
| 133 |
-
"name": "Women Sale",
|
| 134 |
-
"category_id": 30
|
| 135 |
},
|
| 136 |
{
|
| 137 |
-
"name": "
|
| 138 |
-
"category_id":
|
| 139 |
},
|
| 140 |
{
|
| 141 |
"name": "Pants",
|
| 142 |
-
"category_id":
|
| 143 |
-
},
|
| 144 |
-
{
|
| 145 |
-
"name": "Tees",
|
| 146 |
-
"category_id": 33
|
| 147 |
-
},
|
| 148 |
-
{
|
| 149 |
-
"name": "Erin Recommends",
|
| 150 |
-
"category_id": 34
|
| 151 |
-
},
|
| 152 |
-
{
|
| 153 |
-
"name": "Performance Fabrics",
|
| 154 |
-
"category_id": 35
|
| 155 |
-
},
|
| 156 |
-
{
|
| 157 |
-
"name": "Eco Friendly",
|
| 158 |
-
"category_id": 36
|
| 159 |
-
},
|
| 160 |
-
{
|
| 161 |
-
"name": "Sale",
|
| 162 |
-
"category_id": 37
|
| 163 |
},
|
| 164 |
{
|
| 165 |
-
"name": "
|
| 166 |
-
"category_id":
|
| 167 |
},
|
| 168 |
{
|
| 169 |
-
"name": "
|
| 170 |
-
"category_id":
|
| 171 |
},
|
| 172 |
{
|
| 173 |
-
"name": "
|
| 174 |
-
"category_id":
|
| 175 |
}
|
| 176 |
]
|
| 177 |
}
|
|
@@ -298,236 +186,94 @@
|
|
| 298 |
"pool": {
|
| 299 |
"product_name": [
|
| 300 |
{
|
| 301 |
-
"name": "
|
| 302 |
-
"sku": "
|
| 303 |
-
|
| 304 |
-
{
|
| 305 |
-
"name": "Strive Shoulder Pack",
|
| 306 |
-
"sku": "24-MB04"
|
| 307 |
-
},
|
| 308 |
-
{
|
| 309 |
-
"name": "Crown Summit Backpack",
|
| 310 |
-
"sku": "24-MB03"
|
| 311 |
-
},
|
| 312 |
-
{
|
| 313 |
-
"name": "Wayfarer Messenger Bag",
|
| 314 |
-
"sku": "24-MB05"
|
| 315 |
-
},
|
| 316 |
-
{
|
| 317 |
-
"name": "Rival Field Messenger",
|
| 318 |
-
"sku": "24-MB06"
|
| 319 |
-
},
|
| 320 |
-
{
|
| 321 |
-
"name": "Fusion Backpack",
|
| 322 |
-
"sku": "24-MB02"
|
| 323 |
-
},
|
| 324 |
-
{
|
| 325 |
-
"name": "Impulse Duffle",
|
| 326 |
-
"sku": "24-UB02"
|
| 327 |
-
},
|
| 328 |
-
{
|
| 329 |
-
"name": "Voyage Yoga Bag",
|
| 330 |
-
"sku": "24-WB01"
|
| 331 |
-
},
|
| 332 |
-
{
|
| 333 |
-
"name": "Compete Track Tote",
|
| 334 |
-
"sku": "24-WB02"
|
| 335 |
-
},
|
| 336 |
-
{
|
| 337 |
-
"name": "Savvy Shoulder Tote",
|
| 338 |
-
"sku": "24-WB05"
|
| 339 |
-
},
|
| 340 |
-
{
|
| 341 |
-
"name": "Endeavor Daytrip Backpack",
|
| 342 |
-
"sku": "24-WB06"
|
| 343 |
-
},
|
| 344 |
-
{
|
| 345 |
-
"name": "Driven Backpack",
|
| 346 |
-
"sku": "24-WB03"
|
| 347 |
-
},
|
| 348 |
-
{
|
| 349 |
-
"name": "Overnight Duffle",
|
| 350 |
-
"sku": "24-WB07"
|
| 351 |
-
},
|
| 352 |
-
{
|
| 353 |
-
"name": "Push It Messenger Bag",
|
| 354 |
-
"sku": "24-WB04"
|
| 355 |
-
},
|
| 356 |
-
{
|
| 357 |
-
"name": "Affirm Water Bottle",
|
| 358 |
-
"sku": "24-UG06"
|
| 359 |
-
},
|
| 360 |
-
{
|
| 361 |
-
"name": "Dual Handle Cardio Ball",
|
| 362 |
-
"sku": "24-UG07"
|
| 363 |
-
},
|
| 364 |
-
{
|
| 365 |
-
"name": "Zing Jump Rope",
|
| 366 |
-
"sku": "24-UG04"
|
| 367 |
-
},
|
| 368 |
-
{
|
| 369 |
-
"name": "Pursuit Lumaflex™ Tone Band",
|
| 370 |
-
"sku": "24-UG02"
|
| 371 |
-
},
|
| 372 |
-
{
|
| 373 |
-
"name": "Go-Get'r Pushup Grips",
|
| 374 |
-
"sku": "24-UG05"
|
| 375 |
-
},
|
| 376 |
-
{
|
| 377 |
-
"name": "Quest Lumaflex™ Band",
|
| 378 |
-
"sku": "24-UG01"
|
| 379 |
-
},
|
| 380 |
-
{
|
| 381 |
-
"name": "Sprite Foam Yoga Brick",
|
| 382 |
-
"sku": "24-WG084"
|
| 383 |
-
},
|
| 384 |
-
{
|
| 385 |
-
"name": "Sprite Foam Roller",
|
| 386 |
-
"sku": "24-WG088"
|
| 387 |
-
},
|
| 388 |
-
{
|
| 389 |
-
"name": "Harmony Lumaflex™ Strength Band Kit",
|
| 390 |
-
"sku": "24-UG03"
|
| 391 |
-
},
|
| 392 |
-
{
|
| 393 |
-
"name": "Sprite Stasis Ball 55 cm",
|
| 394 |
-
"sku": "24-WG081-gray"
|
| 395 |
-
},
|
| 396 |
-
{
|
| 397 |
-
"name": "Sprite Stasis Ball 65 cm",
|
| 398 |
-
"sku": "24-WG082-gray"
|
| 399 |
-
},
|
| 400 |
-
{
|
| 401 |
-
"name": "Sprite Stasis Ball 75 cm",
|
| 402 |
-
"sku": "24-WG083-gray"
|
| 403 |
-
},
|
| 404 |
-
{
|
| 405 |
-
"name": "Sprite Yoga Strap 6 foot",
|
| 406 |
-
"sku": "24-WG085"
|
| 407 |
-
},
|
| 408 |
-
{
|
| 409 |
-
"name": "Sprite Yoga Strap 8 foot",
|
| 410 |
-
"sku": "24-WG086"
|
| 411 |
-
},
|
| 412 |
-
{
|
| 413 |
-
"name": "Sprite Yoga Strap 10 foot",
|
| 414 |
-
"sku": "24-WG087"
|
| 415 |
-
},
|
| 416 |
-
{
|
| 417 |
-
"name": "Aim Analog Watch",
|
| 418 |
-
"sku": "24-MG04"
|
| 419 |
-
},
|
| 420 |
-
{
|
| 421 |
-
"name": "Endurance Watch",
|
| 422 |
-
"sku": "24-MG01"
|
| 423 |
-
},
|
| 424 |
-
{
|
| 425 |
-
"name": "Summit Watch",
|
| 426 |
-
"sku": "24-MG03"
|
| 427 |
-
},
|
| 428 |
-
{
|
| 429 |
-
"name": "Cruise Dual Analog Watch",
|
| 430 |
-
"sku": "24-MG05"
|
| 431 |
-
},
|
| 432 |
-
{
|
| 433 |
-
"name": "Dash Digital Watch",
|
| 434 |
-
"sku": "24-MG02"
|
| 435 |
-
},
|
| 436 |
-
{
|
| 437 |
-
"name": "Luma Analog Watch",
|
| 438 |
-
"sku": "24-WG09"
|
| 439 |
-
},
|
| 440 |
-
{
|
| 441 |
-
"name": "Bolo Sport Watch",
|
| 442 |
-
"sku": "24-WG01"
|
| 443 |
-
},
|
| 444 |
-
{
|
| 445 |
-
"name": "Clamber Watch",
|
| 446 |
-
"sku": "24-WG03"
|
| 447 |
-
},
|
| 448 |
-
{
|
| 449 |
-
"name": "Didi Sport Watch",
|
| 450 |
-
"sku": "24-WG02"
|
| 451 |
},
|
| 452 |
{
|
| 453 |
-
"name": "
|
| 454 |
-
"sku": "
|
|
|
|
| 455 |
},
|
| 456 |
{
|
| 457 |
-
"name": "
|
| 458 |
-
"sku": "
|
|
|
|
| 459 |
},
|
| 460 |
{
|
| 461 |
-
"name": "
|
| 462 |
-
"sku": "
|
|
|
|
| 463 |
},
|
| 464 |
{
|
| 465 |
-
"name": "
|
| 466 |
-
"sku": "
|
|
|
|
| 467 |
},
|
| 468 |
{
|
| 469 |
-
"name": "
|
| 470 |
-
"sku": "
|
|
|
|
| 471 |
},
|
| 472 |
{
|
| 473 |
-
"name": "
|
| 474 |
-
"sku": "
|
|
|
|
| 475 |
},
|
| 476 |
{
|
| 477 |
-
"name": "
|
| 478 |
-
"sku": "
|
|
|
|
| 479 |
},
|
| 480 |
{
|
| 481 |
-
"name": "
|
| 482 |
-
"sku": "
|
|
|
|
| 483 |
},
|
| 484 |
{
|
| 485 |
-
"name": "
|
| 486 |
-
"sku": "
|
|
|
|
| 487 |
},
|
| 488 |
{
|
| 489 |
-
"name": "
|
| 490 |
-
"sku": "
|
|
|
|
| 491 |
},
|
| 492 |
{
|
| 493 |
-
"name": "
|
| 494 |
-
"sku": "
|
|
|
|
| 495 |
},
|
| 496 |
{
|
| 497 |
-
"name": "
|
| 498 |
-
"sku": "
|
|
|
|
| 499 |
},
|
| 500 |
{
|
| 501 |
-
"name": "
|
| 502 |
-
"sku": "
|
|
|
|
| 503 |
},
|
| 504 |
{
|
| 505 |
-
"name": "
|
| 506 |
-
"sku": "
|
|
|
|
| 507 |
},
|
| 508 |
{
|
| 509 |
-
"name": "
|
| 510 |
-
"sku": "
|
|
|
|
| 511 |
},
|
| 512 |
{
|
| 513 |
-
"name": "
|
| 514 |
-
"sku": "
|
|
|
|
| 515 |
},
|
| 516 |
{
|
| 517 |
-
"name": "
|
| 518 |
-
"sku": "
|
| 519 |
-
|
| 520 |
-
{
|
| 521 |
-
"name": "Stark Fundamental Hoodie",
|
| 522 |
-
"sku": "MH06"
|
| 523 |
-
},
|
| 524 |
-
{
|
| 525 |
-
"name": "Hero Hoodie",
|
| 526 |
-
"sku": "MH07"
|
| 527 |
-
},
|
| 528 |
-
{
|
| 529 |
-
"name": "Oslo Trek Hoodie",
|
| 530 |
-
"sku": "MH08"
|
| 531 |
}
|
| 532 |
]
|
| 533 |
}
|
|
@@ -739,236 +485,94 @@
|
|
| 739 |
"pool": {
|
| 740 |
"product_name": [
|
| 741 |
{
|
| 742 |
-
"name": "
|
| 743 |
-
"sku": "
|
| 744 |
-
|
| 745 |
-
{
|
| 746 |
-
"name": "Strive Shoulder Pack",
|
| 747 |
-
"sku": "24-MB04"
|
| 748 |
-
},
|
| 749 |
-
{
|
| 750 |
-
"name": "Crown Summit Backpack",
|
| 751 |
-
"sku": "24-MB03"
|
| 752 |
-
},
|
| 753 |
-
{
|
| 754 |
-
"name": "Wayfarer Messenger Bag",
|
| 755 |
-
"sku": "24-MB05"
|
| 756 |
-
},
|
| 757 |
-
{
|
| 758 |
-
"name": "Rival Field Messenger",
|
| 759 |
-
"sku": "24-MB06"
|
| 760 |
-
},
|
| 761 |
-
{
|
| 762 |
-
"name": "Fusion Backpack",
|
| 763 |
-
"sku": "24-MB02"
|
| 764 |
-
},
|
| 765 |
-
{
|
| 766 |
-
"name": "Impulse Duffle",
|
| 767 |
-
"sku": "24-UB02"
|
| 768 |
-
},
|
| 769 |
-
{
|
| 770 |
-
"name": "Voyage Yoga Bag",
|
| 771 |
-
"sku": "24-WB01"
|
| 772 |
-
},
|
| 773 |
-
{
|
| 774 |
-
"name": "Compete Track Tote",
|
| 775 |
-
"sku": "24-WB02"
|
| 776 |
-
},
|
| 777 |
-
{
|
| 778 |
-
"name": "Savvy Shoulder Tote",
|
| 779 |
-
"sku": "24-WB05"
|
| 780 |
-
},
|
| 781 |
-
{
|
| 782 |
-
"name": "Endeavor Daytrip Backpack",
|
| 783 |
-
"sku": "24-WB06"
|
| 784 |
-
},
|
| 785 |
-
{
|
| 786 |
-
"name": "Driven Backpack",
|
| 787 |
-
"sku": "24-WB03"
|
| 788 |
-
},
|
| 789 |
-
{
|
| 790 |
-
"name": "Overnight Duffle",
|
| 791 |
-
"sku": "24-WB07"
|
| 792 |
-
},
|
| 793 |
-
{
|
| 794 |
-
"name": "Push It Messenger Bag",
|
| 795 |
-
"sku": "24-WB04"
|
| 796 |
-
},
|
| 797 |
-
{
|
| 798 |
-
"name": "Affirm Water Bottle",
|
| 799 |
-
"sku": "24-UG06"
|
| 800 |
-
},
|
| 801 |
-
{
|
| 802 |
-
"name": "Dual Handle Cardio Ball",
|
| 803 |
-
"sku": "24-UG07"
|
| 804 |
-
},
|
| 805 |
-
{
|
| 806 |
-
"name": "Zing Jump Rope",
|
| 807 |
-
"sku": "24-UG04"
|
| 808 |
-
},
|
| 809 |
-
{
|
| 810 |
-
"name": "Pursuit Lumaflex™ Tone Band",
|
| 811 |
-
"sku": "24-UG02"
|
| 812 |
-
},
|
| 813 |
-
{
|
| 814 |
-
"name": "Go-Get'r Pushup Grips",
|
| 815 |
-
"sku": "24-UG05"
|
| 816 |
-
},
|
| 817 |
-
{
|
| 818 |
-
"name": "Quest Lumaflex™ Band",
|
| 819 |
-
"sku": "24-UG01"
|
| 820 |
-
},
|
| 821 |
-
{
|
| 822 |
-
"name": "Sprite Foam Yoga Brick",
|
| 823 |
-
"sku": "24-WG084"
|
| 824 |
-
},
|
| 825 |
-
{
|
| 826 |
-
"name": "Sprite Foam Roller",
|
| 827 |
-
"sku": "24-WG088"
|
| 828 |
-
},
|
| 829 |
-
{
|
| 830 |
-
"name": "Harmony Lumaflex™ Strength Band Kit",
|
| 831 |
-
"sku": "24-UG03"
|
| 832 |
-
},
|
| 833 |
-
{
|
| 834 |
-
"name": "Sprite Stasis Ball 55 cm",
|
| 835 |
-
"sku": "24-WG081-gray"
|
| 836 |
-
},
|
| 837 |
-
{
|
| 838 |
-
"name": "Sprite Stasis Ball 65 cm",
|
| 839 |
-
"sku": "24-WG082-gray"
|
| 840 |
-
},
|
| 841 |
-
{
|
| 842 |
-
"name": "Sprite Stasis Ball 75 cm",
|
| 843 |
-
"sku": "24-WG083-gray"
|
| 844 |
-
},
|
| 845 |
-
{
|
| 846 |
-
"name": "Sprite Yoga Strap 6 foot",
|
| 847 |
-
"sku": "24-WG085"
|
| 848 |
-
},
|
| 849 |
-
{
|
| 850 |
-
"name": "Sprite Yoga Strap 8 foot",
|
| 851 |
-
"sku": "24-WG086"
|
| 852 |
-
},
|
| 853 |
-
{
|
| 854 |
-
"name": "Sprite Yoga Strap 10 foot",
|
| 855 |
-
"sku": "24-WG087"
|
| 856 |
-
},
|
| 857 |
-
{
|
| 858 |
-
"name": "Aim Analog Watch",
|
| 859 |
-
"sku": "24-MG04"
|
| 860 |
-
},
|
| 861 |
-
{
|
| 862 |
-
"name": "Endurance Watch",
|
| 863 |
-
"sku": "24-MG01"
|
| 864 |
-
},
|
| 865 |
-
{
|
| 866 |
-
"name": "Summit Watch",
|
| 867 |
-
"sku": "24-MG03"
|
| 868 |
-
},
|
| 869 |
-
{
|
| 870 |
-
"name": "Cruise Dual Analog Watch",
|
| 871 |
-
"sku": "24-MG05"
|
| 872 |
-
},
|
| 873 |
-
{
|
| 874 |
-
"name": "Dash Digital Watch",
|
| 875 |
-
"sku": "24-MG02"
|
| 876 |
-
},
|
| 877 |
-
{
|
| 878 |
-
"name": "Luma Analog Watch",
|
| 879 |
-
"sku": "24-WG09"
|
| 880 |
-
},
|
| 881 |
-
{
|
| 882 |
-
"name": "Bolo Sport Watch",
|
| 883 |
-
"sku": "24-WG01"
|
| 884 |
-
},
|
| 885 |
-
{
|
| 886 |
-
"name": "Clamber Watch",
|
| 887 |
-
"sku": "24-WG03"
|
| 888 |
-
},
|
| 889 |
-
{
|
| 890 |
-
"name": "Didi Sport Watch",
|
| 891 |
-
"sku": "24-WG02"
|
| 892 |
-
},
|
| 893 |
-
{
|
| 894 |
-
"name": "Stellar Solar Jacket",
|
| 895 |
-
"sku": "WJ01"
|
| 896 |
-
},
|
| 897 |
-
{
|
| 898 |
-
"name": "Josie Yoga Jacket",
|
| 899 |
-
"sku": "WJ02"
|
| 900 |
-
},
|
| 901 |
-
{
|
| 902 |
-
"name": "Augusta Pullover Jacket",
|
| 903 |
-
"sku": "WJ03"
|
| 904 |
},
|
| 905 |
{
|
| 906 |
-
"name": "
|
| 907 |
-
"sku": "
|
|
|
|
| 908 |
},
|
| 909 |
{
|
| 910 |
-
"name": "
|
| 911 |
-
"sku": "
|
|
|
|
| 912 |
},
|
| 913 |
{
|
| 914 |
-
"name": "
|
| 915 |
-
"sku": "
|
|
|
|
| 916 |
},
|
| 917 |
{
|
| 918 |
-
"name": "
|
| 919 |
-
"sku": "
|
|
|
|
| 920 |
},
|
| 921 |
{
|
| 922 |
-
"name": "
|
| 923 |
-
"sku": "
|
|
|
|
| 924 |
},
|
| 925 |
{
|
| 926 |
-
"name": "
|
| 927 |
-
"sku": "
|
|
|
|
| 928 |
},
|
| 929 |
{
|
| 930 |
-
"name": "
|
| 931 |
-
"sku": "
|
|
|
|
| 932 |
},
|
| 933 |
{
|
| 934 |
-
"name": "
|
| 935 |
-
"sku": "
|
|
|
|
| 936 |
},
|
| 937 |
{
|
| 938 |
-
"name": "
|
| 939 |
-
"sku": "
|
|
|
|
| 940 |
},
|
| 941 |
{
|
| 942 |
-
"name": "
|
| 943 |
-
"sku": "
|
|
|
|
| 944 |
},
|
| 945 |
{
|
| 946 |
-
"name": "
|
| 947 |
-
"sku": "
|
|
|
|
| 948 |
},
|
| 949 |
{
|
| 950 |
-
"name": "
|
| 951 |
-
"sku": "
|
|
|
|
| 952 |
},
|
| 953 |
{
|
| 954 |
-
"name": "
|
| 955 |
-
"sku": "
|
|
|
|
| 956 |
},
|
| 957 |
{
|
| 958 |
-
"name": "
|
| 959 |
-
"sku": "
|
|
|
|
| 960 |
},
|
| 961 |
{
|
| 962 |
-
"name": "
|
| 963 |
-
"sku": "
|
|
|
|
| 964 |
},
|
| 965 |
{
|
| 966 |
-
"name": "
|
| 967 |
-
"sku": "
|
|
|
|
| 968 |
},
|
| 969 |
{
|
| 970 |
-
"name": "
|
| 971 |
-
"sku": "
|
|
|
|
| 972 |
}
|
| 973 |
]
|
| 974 |
}
|
|
|
|
| 4 |
"generated_at": "2026-04-08",
|
| 5 |
"source": {
|
| 6 |
"categories": "GET /rest/V1/categories/list (live EC2, port 7780)",
|
| 7 |
+
"products": "HTML scrape of search results page on live EC2 store (port 7770) \u2014 product_id is the Magento entity ID used in add-to-cart forms; sku is PROD-{product_id} as the store REST API is auth-gated",
|
| 8 |
"forums": "HTML scrape of /forums page (live EC2, port 9999) + HTTP 200 verification per slug",
|
| 9 |
"wikipedia": "Well-known Wikipedia titles \u2014 verified by grader at runtime via HEAD /wikipedia_en.../A/{slug}",
|
| 10 |
"admin_skus": "Generated (HAR-TEST-NNN namespace, no collision with existing catalog)",
|
|
|
|
| 13 |
"grader_matching_notes": {
|
| 14 |
"template_1": "category_id stored for grader; category_name is what appears in task string",
|
| 15 |
"template_2": "expected_slug stored for grader (verifies HTTP 200); display title is in task string",
|
| 16 |
+
"template_3": "product_id stored for grader (checks POST /checkout/cart/add + cart probe); product name is in task string for HTML search flow",
|
| 17 |
"template_4": "forum_name must exist and return posts; no exact value matching needed",
|
| 18 |
"template_5": "title is free-form generated; grader only checks post was created in that forum",
|
| 19 |
+
"template_6": "product_id stored for grader; name is in task string; checkout grader checks REST guest-cart stages OR HTML checkout flow",
|
| 20 |
"template_7": "sku+price are exact \u2014 grader calls GET /rest/V1/products/{sku} to verify creation"
|
| 21 |
}
|
| 22 |
},
|
|
|
|
| 29 |
],
|
| 30 |
"pool": {
|
| 31 |
"category_name": [
|
|
|
|
|
|
|
|
|
|
|
|
|
| 32 |
{
|
| 33 |
"name": "Bags",
|
| 34 |
"category_id": 4
|
| 35 |
},
|
| 36 |
{
|
| 37 |
+
"name": "Backpack",
|
| 38 |
+
"category_id": 4
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 39 |
},
|
| 40 |
{
|
| 41 |
"name": "Jackets",
|
| 42 |
+
"category_id": 11
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 43 |
},
|
| 44 |
{
|
| 45 |
+
"name": "Hoodies",
|
| 46 |
+
"category_id": 9
|
| 47 |
},
|
| 48 |
{
|
| 49 |
"name": "Pants",
|
| 50 |
+
"category_id": 13
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 51 |
},
|
| 52 |
{
|
| 53 |
+
"name": "Shoes",
|
| 54 |
+
"category_id": 3
|
| 55 |
},
|
| 56 |
{
|
| 57 |
+
"name": "Boots",
|
| 58 |
+
"category_id": 3
|
| 59 |
},
|
| 60 |
{
|
| 61 |
+
"name": "Slippers",
|
| 62 |
+
"category_id": 3
|
| 63 |
}
|
| 64 |
]
|
| 65 |
}
|
|
|
|
| 186 |   "pool": {
| 187 |     "product_name": [
| 188 |       {
| 189 | +       "name": "Camera Backpack Bagsmar DSLR Waterproof",
| 190 | +       "sku": "PROD-89940",
| 191 | +       "product_id": 89940
| 192 |       },
| 193 |       {
| 194 | +       "name": "Totes Womens Cold Weather Boots Nicole",
| 195 | +       "sku": "PROD-29409",
| 196 | +       "product_id": 29409
| 197 |       },
| 198 |       {
| 199 | +       "name": "Totes Womens Snow Boots Jami Lace Up",
| 200 | +       "sku": "PROD-83651",
| 201 | +       "product_id": 83651
| 202 |       },
| 203 |       {
| 204 | +       "name": "Noldares Flannel Jacket For Men Plaid",
| 205 | +       "sku": "PROD-59237",
| 206 | +       "product_id": 59237
| 207 |       },
| 208 |       {
| 209 | +       "name": "Inesver Womens Leather Jacket Open Front",
| 210 | +       "sku": "PROD-30743",
| 211 | +       "product_id": 30743
| 212 |       },
| 213 |       {
| 214 | +       "name": "Womens Corduroy Coat Plaid Hoodie Long Jacket",
| 215 | +       "sku": "PROD-13227",
| 216 | +       "product_id": 13227
| 217 |       },
| 218 |       {
| 219 | +       "name": "Womens Fleece Jacket With Hood Winter",
| 220 | +       "sku": "PROD-60773",
| 221 | +       "product_id": 60773
| 222 |       },
| 223 |       {
| 224 | +       "name": "Champion Hoodie Big And Tall Zip Up",
| 225 | +       "sku": "PROD-64850",
| 226 | +       "product_id": 64850
| 227 |       },
| 228 |       {
| 229 | +       "name": "Matching Couples Hoodie Set",
| 230 | +       "sku": "PROD-60915",
| 231 | +       "product_id": 60915
| 232 |       },
| 233 |       {
| 234 | +       "name": "Mens Novelty 3D Printed Pullover Hoodie",
| 235 | +       "sku": "PROD-62228",
| 236 | +       "product_id": 62228
| 237 |       },
| 238 |       {
| 239 | +       "name": "Mens Slim Fit Cargo Pants Athletic",
| 240 | +       "sku": "PROD-65987",
| 241 | +       "product_id": 65987
| 242 |       },
| 243 |       {
| 244 | +       "name": "Mens Ripstop Cargo Pants Tactical Hiking",
| 245 | +       "sku": "PROD-10245",
| 246 | +       "product_id": 10245
| 247 |       },
| 248 |       {
| 249 | +       "name": "Womens Flowy Boho Harem Pants Yoga",
| 250 | +       "sku": "PROD-64374",
| 251 | +       "product_id": 64374
| 252 |       },
| 253 |       {
| 254 | +       "name": "Womens High Waist Harem Pants Stripe",
| 255 | +       "sku": "PROD-61333",
| 256 | +       "product_id": 61333
| 257 |       },
| 258 |       {
| 259 | +       "name": "Shoeslocker Womens Cozy Memory Foam Slippers",
| 260 | +       "sku": "PROD-94779",
| 261 | +       "product_id": 94779
| 262 |       },
| 263 |       {
| 264 | +       "name": "Mens Canvas Korean Fashion Casual Shoes",
| 265 | +       "sku": "PROD-60868",
| 266 | +       "product_id": 60868
| 267 |       },
| 268 |       {
| 269 | +       "name": "Unisex Diving Shoes Ultralight Anti Slip",
| 270 | +       "sku": "PROD-12364",
| 271 | +       "product_id": 12364
| 272 |       },
| 273 |       {
| 274 | +       "name": "Womens Loafers Fashion Retro Single Shoes",
| 275 | +       "sku": "PROD-63738",
| 276 | +       "product_id": 63738
| 277 |       }
| 278 |     ]
| 279 |   }
| 485 |   "pool": {
| 486 |     "product_name": [
| 487 |       {
| 488 | +       "name": "Camera Backpack Bagsmar DSLR Waterproof",
| 489 | +       "sku": "PROD-89940",
| 490 | +       "product_id": 89940
| 491 |       },
| 492 |       {
| 493 | +       "name": "Totes Womens Cold Weather Boots Nicole",
| 494 | +       "sku": "PROD-29409",
| 495 | +       "product_id": 29409
| 496 |       },
| 497 |       {
| 498 | +       "name": "Totes Womens Snow Boots Jami Lace Up",
| 499 | +       "sku": "PROD-83651",
| 500 | +       "product_id": 83651
| 501 |       },
| 502 |       {
| 503 | +       "name": "Noldares Flannel Jacket For Men Plaid",
| 504 | +       "sku": "PROD-59237",
| 505 | +       "product_id": 59237
| 506 |       },
| 507 |       {
| 508 | +       "name": "Inesver Womens Leather Jacket Open Front",
| 509 | +       "sku": "PROD-30743",
| 510 | +       "product_id": 30743
| 511 |       },
| 512 |       {
| 513 | +       "name": "Womens Corduroy Coat Plaid Hoodie Long Jacket",
| 514 | +       "sku": "PROD-13227",
| 515 | +       "product_id": 13227
| 516 |       },
| 517 |       {
| 518 | +       "name": "Womens Fleece Jacket With Hood Winter",
| 519 | +       "sku": "PROD-60773",
| 520 | +       "product_id": 60773
| 521 |       },
| 522 |       {
| 523 | +       "name": "Champion Hoodie Big And Tall Zip Up",
| 524 | +       "sku": "PROD-64850",
| 525 | +       "product_id": 64850
| 526 |       },
| 527 |       {
| 528 | +       "name": "Matching Couples Hoodie Set",
| 529 | +       "sku": "PROD-60915",
| 530 | +       "product_id": 60915
| 531 |       },
| 532 |       {
| 533 | +       "name": "Mens Novelty 3D Printed Pullover Hoodie",
| 534 | +       "sku": "PROD-62228",
| 535 | +       "product_id": 62228
| 536 |       },
| 537 |       {
| 538 | +       "name": "Mens Slim Fit Cargo Pants Athletic",
| 539 | +       "sku": "PROD-65987",
| 540 | +       "product_id": 65987
| 541 |       },
| 542 |       {
| 543 | +       "name": "Mens Ripstop Cargo Pants Tactical Hiking",
| 544 | +       "sku": "PROD-10245",
| 545 | +       "product_id": 10245
| 546 |       },
| 547 |       {
| 548 | +       "name": "Womens Flowy Boho Harem Pants Yoga",
| 549 | +       "sku": "PROD-64374",
| 550 | +       "product_id": 64374
| 551 |       },
| 552 |       {
| 553 | +       "name": "Womens High Waist Harem Pants Stripe",
| 554 | +       "sku": "PROD-61333",
| 555 | +       "product_id": 61333
| 556 |       },
| 557 |       {
| 558 | +       "name": "Shoeslocker Womens Cozy Memory Foam Slippers",
| 559 | +       "sku": "PROD-94779",
| 560 | +       "product_id": 94779
| 561 |       },
| 562 |       {
| 563 | +       "name": "Mens Canvas Korean Fashion Casual Shoes",
| 564 | +       "sku": "PROD-60868",
| 565 | +       "product_id": 60868
| 566 |       },
| 567 |       {
| 568 | +       "name": "Unisex Diving Shoes Ultralight Anti Slip",
| 569 | +       "sku": "PROD-12364",
| 570 | +       "product_id": 12364
| 571 |       },
| 572 |       {
| 573 | +       "name": "Womens Loafers Fashion Retro Single Shoes",
| 574 | +       "sku": "PROD-63738",
| 575 | +       "product_id": 63738
| 576 |       }
| 577 |     ]
| 578 |   }
pyproject.toml
CHANGED
|
@@ -16,6 +16,8 @@ dependencies = [
     "rank-bm25>=0.2.2",
     "openai>=1.0.0",
     "numpy>=1.24.0",
+    "beautifulsoup4>=4.14.3",
+    "lxml>=6.0.2",
 ]

 [project.optional-dependencies]
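The two new dependencies back the new `server/tools/html_distiller.py`, which boils storefront HTML down to titles, text, and actionable forms. The idea can be sketched with only the stdlib parser (the real module uses BeautifulSoup with the lxml backend; class and output field names here are illustrative, not the distiller's actual API):

```python
from html.parser import HTMLParser


class FormExtractor(HTMLParser):
    """Collect <form> actions/methods and their named <input> fields."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "form":
            # Open a new form record; inputs seen until </form> belong to it
            self._current = {
                "action": a.get("action"),
                "method": (a.get("method") or "get").upper(),
                "fields": {},
            }
            self.forms.append(self._current)
        elif tag == "input" and self._current is not None and a.get("name"):
            self._current["fields"][a["name"]] = a.get("value")

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


p = FormExtractor()
p.feed('<form action="/checkout/cart/add" method="post">'
       '<input type="hidden" name="product" value="59237"></form>')
```

A distilled `{"method": "POST", "fields": {"product": ...}}` form is exactly the shape the updated graders in `server/judge.py` look for.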
scripts/inspect_har_endpoints.py
ADDED
|
@@ -0,0 +1,240 @@
#!/usr/bin/env python3
"""
inspect_har_endpoints.py

Runs extract_openapi_spec() on every HAR file in hars/ and prints a full
summary of discovered endpoints — method, path, status code, auth, and a
snippet of the request/response body where available.

Usage:
    python scripts/inspect_har_endpoints.py [--json]

Flags:
    --json    Emit machine-readable JSON instead of the human-readable table
"""

from __future__ import annotations

import json
import sys
from pathlib import Path

# ---------------------------------------------------------------------------
# Path setup — make the package importable without installing
# ---------------------------------------------------------------------------

REPO_ROOT = Path(__file__).resolve().parent.parent
sys.path.insert(0, str(REPO_ROOT))

from server.tools.browser_agent import extract_openapi_spec  # noqa: E402


# ---------------------------------------------------------------------------
# HAR files to inspect
# ---------------------------------------------------------------------------

HARS_DIR = REPO_ROOT / "hars"

HAR_FILES = {
    "shopping": HARS_DIR / "shopping.har",
    "shopping_admin": HARS_DIR / "shopping_admin.har",
    "forum": HARS_DIR / "forum.har",
    "wikipedia": HARS_DIR / "wikipedia.har",
}

# Fake base URLs — only used for pass-through in extract_openapi_spec
APP_BASE_URLS = {
    "shopping": "http://localhost:7770",
    "shopping_admin": "http://localhost:7780",
    "forum": "http://localhost:9999",
    "wikipedia": "http://localhost:8888",
}


# ---------------------------------------------------------------------------
# Pretty-print helpers
# ---------------------------------------------------------------------------

_COL_W = 80


def _hr(char: str = "─") -> None:
    print(char * _COL_W)


def _body_snippet(value) -> str | None:
    if value is None:
        return None
    if isinstance(value, str):
        snippet = value[:120]
    else:
        snippet = json.dumps(value)[:120]
    return snippet + ("…" if len(str(snippet)) >= 120 else "")


def _print_entry(idx: int, entry: dict) -> None:
    auth_flag = "🔐 AUTH" if entry["auth_observed"] else "open"
    print(f"  [{idx:>3}] {entry['method']:<7} {entry['path']}")
    print(f"        status={entry['status_code']}  ct={entry['response_content_type'] or '—'}  {auth_flag}")
    if entry.get("query_params"):
        print(f"        query: {entry['query_params'][:100]}")
    req_snippet = _body_snippet(entry.get("request_body"))
    if req_snippet:
        print(f"        req_body: {req_snippet}")
    resp_snippet = _body_snippet(entry.get("response_body_sample"))
    if resp_snippet:
        print(f"        resp_sample: {resp_snippet}")


def _method_counts(entries: list[dict]) -> dict[str, int]:
    counts: dict[str, int] = {}
    for e in entries:
        counts[e["method"]] = counts.get(e["method"], 0) + 1
    return dict(sorted(counts.items()))


def print_app_summary(app_name: str, entries: list[dict], raw_total: int | None = None) -> None:
    _hr("═")
    header = f" APP: {app_name.upper()} ({len(entries)} unique API endpoints"
    if raw_total is not None:
        header += f" extracted from {raw_total} raw HAR entries"
    header += ")"
    print(header)
    counts = _method_counts(entries)
    print(f" Methods: {counts}")
    auth_count = sum(1 for e in entries if e["auth_observed"])
    print(f" Auth-required endpoints: {auth_count}/{len(entries)}")
    _hr()
    if not entries:
        print("  (no API-like entries survived filtering)")
    for i, entry in enumerate(entries, 1):
        _print_entry(i, entry)
    print()


# ---------------------------------------------------------------------------
# JSON mode
# ---------------------------------------------------------------------------

def emit_json(results: dict) -> None:
    # Convert to a JSON-safe structure
    output = {}
    for app_name, entries in results.items():
        output[app_name] = {
            "total": len(entries),
            "method_counts": _method_counts(entries),
            "endpoints": entries,
        }
    print(json.dumps(output, indent=2))


# ---------------------------------------------------------------------------
# Verification / assertion checks
# ---------------------------------------------------------------------------


# NOTE: These HAR files are sparse — each was recorded for a narrow task
# scenario, not as a full API crawl. The vast majority of HAR entries are
# static assets (/static/ prefix) that the extractor correctly filters out.
# Thresholds below reflect the actual usable API surface in each file.
SANITY_CHECKS: dict[str, dict] = {
    "shopping": {
        "min_endpoints": 1,
        "expected_methods": {"GET"},
        "note": "Sparse HAR — only checkout success page recorded; "
                "213 total entries but 212 are /static/ assets.",
    },
    "shopping_admin": {
        "min_endpoints": 2,
        "expected_methods": {"GET", "POST"},
        "note": "Sparse HAR — product save/edit + MUI JSON endpoint; "
                "353 total entries but 350 are /static/ assets.",
    },
    "forum": {
        "min_endpoints": 2,
        "expected_methods": {"GET", "POST"},
        "note": "Sparse HAR — one POST submission + one forum thread GET; "
                "24 total entries but 22 are .js build files.",
    },
    "wikipedia": {
        "min_endpoints": 0,
        "expected_methods": set(),
        "note": "Sparse HAR — only an article HTML page + /-/mw/ style/JS assets; "
                "no XHR/REST traffic recorded.",
    },
}


def run_checks(results: dict) -> bool:
    print("\n" + "─" * _COL_W)
    print("SANITY CHECKS (thresholds calibrated to actual HAR content)")
    print("─" * _COL_W)
    all_passed = True
    for app_name, checks in SANITY_CHECKS.items():
        entries = results.get(app_name, [])
        methods_found = set(e["method"] for e in entries)
        n = len(entries)

        min_ok = n >= checks["min_endpoints"]
        exp = checks["expected_methods"]
        methods_ok = exp.issubset(methods_found) if exp else True

        status = "PASS" if (min_ok and methods_ok) else "FAIL"
        if status == "FAIL":
            all_passed = False

        print(f" {status}  {app_name}")
        print(f"    endpoints : {n} (min={checks['min_endpoints']}) {'✓' if min_ok else '✗'}")
        if exp:
            print(f"    methods   : {sorted(methods_found)} "
                  f"(expected ⊇ {sorted(exp)}) {'✓' if methods_ok else '✗'}")
        print(f"    note      : {checks['note']}")
    print("─" * _COL_W)
    print("Overall:", "ALL PASSED ✓" if all_passed else "SOME FAILED ✗")
    return all_passed


# ---------------------------------------------------------------------------
# Main
# ---------------------------------------------------------------------------

def main() -> int:
    emit_json_mode = "--json" in sys.argv

    results: dict[str, list[dict]] = {}
    raw_totals: dict[str, int] = {}
    missing: list[str] = []

    for app_name, har_path in HAR_FILES.items():
        if not har_path.exists():
            print(f"[WARN] HAR not found: {har_path}", file=sys.stderr)
            missing.append(app_name)
            results[app_name] = []
            raw_totals[app_name] = 0
            continue

        with open(har_path) as f:
            har_data = json.load(f)

        raw_totals[app_name] = len(har_data.get("log", {}).get("entries", []))
        entries = extract_openapi_spec(har_data, APP_BASE_URLS[app_name])
        results[app_name] = entries

    if emit_json_mode:
        emit_json(results)
        return 0

    # Human-readable output
    for app_name, entries in results.items():
        print_app_summary(app_name, entries, raw_totals.get(app_name))

    passed = run_checks(results)

    if missing:
        print(f"\n[WARN] Missing HAR files for: {', '.join(missing)}")

    return 0 if passed else 1


if __name__ == "__main__":
    sys.exit(main())
server/judge.py
CHANGED
@@ -140,23 +140,38 @@ def _get_curl_steps(episode):
 def grade_template_1(episode: Episode, task: Task) -> float:
     """Easy — Shopping: List products in category {category_name}"""
     category_name = task.params.get("category_name", "")
+    category_lower = category_name.lower()

     for step in _get_curl_steps(episode):
         cp = step.curl_parsed
         if cp.status_code == 200:
             body = cp.response_body
+            # REST API JSON response (ideal path: /rest/V1/products)
             if isinstance(body, dict) and "items" in body:
                 items = body["items"]
                 if len(items) > 0:
-                    # Check if any item mentions the category
                     for item in items:
                         if _item_matches_category(item, category_name):
                             return 1.0
-                    # Items returned but can't verify category — partial
                     return 0.3
-            #
+            # Raw list
             if isinstance(body, list) and len(body) > 0:
                 return 0.3
+            # Distilled HTML page (from html_distiller) — check for search results page
+            # that contains product forms. page_type/forms/text are the distiller's keys.
+            if isinstance(body, dict) and "page_type" in body:
+                forms = body.get("forms", [])
+                text = body.get("text", "") or ""
+                title = (body.get("title") or "").lower()
+                # A search/category results page has multiple POST add-to-cart forms
+                product_forms = [f for f in forms if f.get("method") == "POST"
+                                 and "product" in f.get("fields", {})]
+                if product_forms:
+                    # Check that the page is about the requested category
+                    if category_lower in title or category_lower in text.lower():
+                        return 1.0
+                    # Products listed but category name not verifiable from title — partial
+                    return 0.5

     return 0.0

@@ -220,14 +235,14 @@ def grade_template_3(episode: Episode, task: Task) -> float:
     """Medium — Shopping: Add {product_name} to a guest cart"""
     product_name = task.params.get("product_name", "")
     sku = task.params.get("sku")
+    product_id = str(task.params.get("product_id", ""))

-    # Primary: check if add-to-cart responded with item_id
+    # Primary: REST API — check if add-to-cart responded with item_id
     for step in _get_curl_steps(episode):
         cp = step.curl_parsed
         if cp.status_code == 200:
             body = cp.response_body
             if isinstance(body, dict) and "item_id" in body:
-                # Verify the sku if we have it
                 if sku and body.get("sku") == sku:
                     return 1.0
                 if _fuzzy_match(str(body.get("name", "")), product_name):

@@ -235,7 +250,29 @@ def grade_template_3(episode: Episode, task: Task) -> float:
             if body.get("item_id"):
                 return 1.0

-    #
+    # Secondary: HTML form-based add-to-cart (POST to /checkout/cart/add)
+    # A 302 redirect or 200 response from this endpoint means item was accepted
+    for step in _get_curl_steps(episode):
+        cp = step.curl_parsed
+        if cp.method == "POST" and "/checkout/cart/add" in (cp.path or ""):
+            if cp.status_code in (200, 302):
+                # Optionally verify the correct product_id was posted
+                body_str = str(cp.body or "")
+                correct_product = (not product_id) or (product_id in body_str)
+
+                # Probe cart to confirm item presence
+                probe = _judge_probe("/checkout/cart/", task.base_url)
+                if probe and probe.status_code == 200:
+                    cart_text = (probe.body if isinstance(probe.body, str) else str(probe.body)).lower()
+                    # Cart page mentions product name or has quantity indicators
+                    if product_name.lower()[:15] in cart_text:
+                        return 1.0
+                    if "qty" in cart_text or "quantity" in cart_text or "item" in cart_text:
+                        return 0.8 if correct_product else 0.6
+                # POST succeeded without cart confirmation
+                return 0.7 if correct_product else 0.5
+
+    # Try live probe via REST guest-cart
     cart_id = _extract_cart_id(episode)
     if cart_id:
         probe = _judge_probe(f"/rest/V1/guest-carts/{cart_id}", task.base_url)

@@ -247,13 +284,13 @@ def grade_template_3(episode: Episode, task: Task) -> float:
             if _fuzzy_match(str(item.get("name", "")), product_name):
                 return 1.0
             if len(items) == 0:
-                return 0.2  # cart created, item not added
+                return 0.2  # cart created, item not added yet

-    # Partial: cart was created
+    # Partial: REST cart was created
     if cart_id:
         return 0.2

-    # Partial: attempted cart creation
+    # Partial: attempted cart creation via REST
     if any("guest-carts" in (s.curl_parsed.path or "") and
            s.curl_parsed.method == "POST"
            for s in _get_curl_steps(episode)):

@@ -424,7 +461,7 @@ def grade_template_6(episode: Episode, task: Task) -> float:


 def _extract_admin_token(episode: Episode) -> str | None:
-    """Find admin bearer token from
+    """Find admin bearer token from shopping-admin trajectory (used by graders)."""
     for step in _get_curl_steps(episode):
         cp = step.curl_parsed
         if cp.status_code == 200 and "integration/admin/token" in cp.path:

@@ -434,6 +471,49 @@ def _extract_admin_token(episode: Episode) -> str | None:
     return None


+def _check_any_auth_obtained(episode: Episode) -> bool:
+    """
+    Generic check: did the agent successfully obtain ANY form of authentication?
+
+    Detects:
+      - Forum/CSRF token authentication
+      - Shopping-admin integration token
+      - Any 200 response returning a bare token string (bearer, user token, API key)
+      - Any 200 response returning a dict with a token field (access_token, id_token, etc.)
+
+    Application-agnostic — the model discovers auth endpoints via browser_agent /
+    search_endpoints; this simply rewards the intermediate step of obtaining auth.
+    """
+    # Forum/CSRF auth
+    if _check_forum_auth(episode):
+        return True
+
+    # Shopping admin token
+    if _extract_admin_token(episode):
+        return True
+
+    # Generic: any successful response that looks like it returned an auth token
+    for step in _get_curl_steps(episode):
+        cp = step.curl_parsed
+        if cp.status_code != 200:
+            continue
+        body = cp.response_body
+
+        # Plain string token (e.g. Magento user/guest tokens, API keys)
+        if isinstance(body, str):
+            stripped = body.strip().strip('"')
+            if re.fullmatch(r"[A-Za-z0-9+/=_\-\.]{20,}", stripped):
+                return True
+
+        # Dict with a recognised token field
+        if isinstance(body, dict):
+            for k in ("token", "access_token", "id_token", "auth_token", "bearer"):
+                if k in body and isinstance(body[k], str) and len(body[k]) > 10:
+                    return True
+
+    return False
+
+
 def _attempted_product_creation(episode: Episode, sku: str) -> bool:
     """Check if the model attempted to create a product with this SKU."""
     for step in _get_curl_steps(episode):

@@ -666,12 +746,15 @@ def evaluate(episode: Episode) -> EpisodeResult:

     task_score = grader(episode, task)
     param_score = verify_parameter_sourcing(episode, task)
-    auth_obtained =
+    auth_obtained = _check_any_auth_obtained(episode)

     # Compute reward
     reward = _score_to_reward(task_score, template_id)

-    #
+    # Auth bonus: if the task failed but the agent successfully obtained any form
+    # of authentication (bearer token, session cookie, CSRF token, etc.), floor
+    # the reward at AUTH_BONUS. This is application-agnostic — obtaining auth is
+    # a useful intermediate skill regardless of the specific task template.
     if task_score < 0.5 and auth_obtained:
         reward = max(reward, AUTH_BONUS)
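The bare-token branch of `_check_any_auth_obtained` can be exercised in isolation. A minimal sketch mirroring that heuristic (the token value below is a made-up example, not a real credential):

```python
import re


def looks_like_bare_token(body: str) -> bool:
    # Mirrors the judge's check: strip whitespace and JSON quoting, then
    # require a single >=20-char base64/JWT-style string and nothing else.
    stripped = body.strip().strip('"')
    return re.fullmatch(r"[A-Za-z0-9+/=_\-\.]{20,}", stripped) is not None


# Magento-style token endpoints return a JSON-quoted bare string
ok = looks_like_bare_token('"q0u66k8h42yaev7chv10oyh3jn5dvnq6"')
# An HTML login page contains spaces and angle brackets, so it never matches
not_ok = looks_like_bare_token("<html><body>login page</body></html>")
```

Because `re.fullmatch` must consume the entire stripped body, ordinary HTML or JSON-object responses cannot trigger a false positive.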
server/models.py
CHANGED
@@ -75,6 +75,8 @@ REWARD_NEW_PATH = 0.1  # curl path not seen before this episode
 REWARD_CORRECT_PARAM = 0.25  # judge: correct parameter sourcing (applied at end)
 REWARD_SESSION_VALUE = 0.1  # auth token/cookie correctly used
 PENALTY_REPEATED_CALL = -0.15  # exact duplicate curl command
 PENALTY_BROWSER_AGENT_AGAIN = -0.3  # browser_agent called after step 1
 PENALTY_MALFORMED_CURL = -0.1  # curl can't be parsed/executed
 PENALTY_4XX = -0.05  # recoverable HTTP error

@@ -103,14 +105,49 @@ TASK_NAME_TO_TEMPLATE = {
     "har_pipeline_hard": 6,
 }

-TEMPLATE_DESCRIPTIONS = {
-    1:
 }

@@ -139,7 +176,7 @@ def _sample_task(template_id: int, parameter_pools: dict) -> tuple[str, dict, st
         items = pool.get("category_name", [{"name": "Gear", "category_id": 3}])
         chosen = random.choice(items)
         params = {"category_name": chosen["name"], "category_id": chosen.get("category_id")}
-        description = TEMPLATE_DESCRIPTIONS[1].format(**params)

     elif template_id == 2:
         items = pool.get("title", [{"title": "Python (programming language)", "expected_slug": "Python_(programming_language)"}])

@@ -148,7 +185,7 @@ def _sample_task(template_id: int, parameter_pools: dict) -> tuple[str, dict, st
         chosen = random.choice(items)
         title = chosen.get("title", chosen) if isinstance(chosen, dict) else chosen
         params = {"title": title, "expected_slug": chosen.get("expected_slug", title.replace(" ", "_"))}
-        description = TEMPLATE_DESCRIPTIONS[2].format(**params)

     elif template_id == 3:
         items = pool.get("product_name", [{"name": "Radiant Tee", "sku": "MH01"}])

@@ -157,8 +194,11 @@ def _sample_task(template_id: int, parameter_pools: dict) -> tuple[str, dict, st
         chosen = random.choice(items)
         product_name = chosen.get("name", chosen) if isinstance(chosen, dict) else chosen
         sku = chosen.get("sku", "") if isinstance(chosen, dict) else ""
         params = {"product_name": product_name, "sku": sku}
-

     elif template_id == 4:
         items = pool.get("forum_category", [{"slug": "general", "name": "General"}])

@@ -167,7 +207,7 @@ def _sample_task(template_id: int, parameter_pools: dict) -> tuple[str, dict, st
         chosen = random.choice(items)
         forum_cat = chosen.get("slug", chosen.get("name", "general")) if isinstance(chosen, dict) else chosen
         params = {"forum_category": forum_cat}
-        description = TEMPLATE_DESCRIPTIONS[4].format(**params)

     elif template_id == 5:
         categories = pool.get("forum_category", [{"slug": "general"}])
categories = pool.get("forum_category", [{"slug": "general"}])
|
|
@@ -180,7 +220,7 @@ def _sample_task(template_id: int, parameter_pools: dict) -> tuple[str, dict, st
|
|
| 180 |
chosen_title = random.choice(titles) if isinstance(titles[0], str) else random.choice(titles).get("title", "Test post")
|
| 181 |
forum_cat = chosen_cat.get("slug", "general") if isinstance(chosen_cat, dict) else chosen_cat
|
| 182 |
params = {"title": chosen_title, "category": forum_cat}
|
| 183 |
-
description = TEMPLATE_DESCRIPTIONS[5].format(**params)
|
| 184 |
|
| 185 |
elif template_id == 6:
|
| 186 |
items = pool.get("product_name", [{"name": "Radiant Tee", "sku": "MH01"}])
|
|
@@ -189,8 +229,11 @@ def _sample_task(template_id: int, parameter_pools: dict) -> tuple[str, dict, st
|
|
| 189 |
chosen = random.choice(items)
|
| 190 |
product_name = chosen.get("name", chosen) if isinstance(chosen, dict) else chosen
|
| 191 |
sku = chosen.get("sku", "") if isinstance(chosen, dict) else ""
|
|
|
|
| 192 |
params = {"product_name": product_name, "sku": sku}
|
| 193 |
-
|
|
|
|
|
|
|
| 194 |
|
| 195 |
elif template_id == 7:
|
| 196 |
items = pool.get("admin_sku", [{"sku": "HAR-TEST-001", "price": "29.99"}])
|
|
@@ -200,7 +243,7 @@ def _sample_task(template_id: int, parameter_pools: dict) -> tuple[str, dict, st
|
|
| 200 |
sku = chosen.get("sku", "HAR-TEST-001") if isinstance(chosen, dict) else chosen
|
| 201 |
price = str(chosen.get("price", "29.99")) if isinstance(chosen, dict) else "29.99"
|
| 202 |
params = {"sku": sku, "price": price}
|
| 203 |
-
description = TEMPLATE_DESCRIPTIONS[7].format(**params)
|
| 204 |
|
| 205 |
else:
|
| 206 |
params = {}
|
|
@@ -211,6 +254,19 @@ def _sample_task(template_id: int, parameter_pools: dict) -> tuple[str, dict, str]:
     return description, params, base_url


 # ---------------------------------------------------------------------------
 # Environment
 # ---------------------------------------------------------------------------
|
|
@@ -235,6 +291,7 @@ class HARvestGymEnvironment(Environment):
         self._episode_store: dict = {}        # embeddings, BM25 corpus, etc.
         self._called_paths: set = set()       # for new-path reward
         self._last_curl_commands: list = []   # for duplicate detection
         self._step_rewards: list[float] = []
         self._done = False
|
| 240 |
|
|
@@ -297,6 +354,12 @@ class HARvestGymEnvironment(Environment):
         task_name = self._task_name
         if task_name in TASK_NAME_TO_TEMPLATE:
             return TASK_NAME_TO_TEMPLATE[task_name]
         # Try integer
         try:
             tid = int(task_name)
|
|
@@ -310,17 +373,30 @@ class HARvestGymEnvironment(Environment):
         """Reset environment: clear episode state, sample new task."""
         from .episode import Episode, Task

-        # (two deleted lines not captured in this diff view)

-        meta = TEMPLATE_META[template_id]
         self._current_task = Task(
             template_id=template_id,
             description=description,
             params=params,
-            app=
             base_url=base_url,
-            difficulty=
         )

         self._episode = Episode(task=self._current_task)
|
|
@@ -328,6 +404,7 @@ class HARvestGymEnvironment(Environment):
         self._episode_store = {}
         self._called_paths = set()
         self._last_curl_commands = []
         self._step_rewards = []
         self._done = False
         self._state = State(episode_id=str(uuid4()), step_count=0)
|
|
@@ -344,8 +421,8 @@ class HARvestGymEnvironment(Environment):
             reward=0.0,
             metadata={
                 "template_id": template_id,
-                "difficulty":
-                "app":
             },
         )
|
| 351 |
|
|
@@ -397,7 +474,9 @@ class HARvestGymEnvironment(Environment):
                 headers=parsed["headers"],
                 body=parsed["body"],
                 status_code=resp.get("status_code", 0),
-                # (deleted line not captured in this diff view)
                 response_headers=resp.get("headers", {}),
             )
         except Exception:
|
|
@@ -502,18 +581,33 @@ class HARvestGymEnvironment(Environment):
             reward += PENALTY_MALFORMED_CURL
         elif 200 <= status < 300:
             reward += REWARD_VALID_API_CALL
-            # New path bonus
             from urllib.parse import urlparse
             from .tools.browser_agent import _normalise_path
             try:
-                # (deleted new-path bookkeeping not captured in this diff view)
                         break
             except Exception:
                 pass
         elif 400 <= status < 500:
|
|
|
|
 REWARD_CORRECT_PARAM = 0.25   # judge: correct parameter sourcing (applied at end)
 REWARD_SESSION_VALUE = 0.1    # auth token/cookie correctly used
 PENALTY_REPEATED_CALL = -0.15 # exact duplicate curl command
+PENALTY_REPEATED_DIFF_PARAM_CALL = -0.05  # duplicate curl but with different parameters
+PENALTY_REPEATED_PATH = -0.15  # same (method, normalised path) called more than once
 PENALTY_BROWSER_AGENT_AGAIN = -0.3  # browser_agent called after step 1
 PENALTY_MALFORMED_CURL = -0.1 # curl can't be parsed/executed
 PENALTY_4XX = -0.05           # recoverable HTTP error
|
|
|
|
| 105 |
 "har_pipeline_hard": 6,
 }

+TEMPLATE_DESCRIPTIONS: dict[int, list[str]] = {
+    1: [
+        "List products in category {category_name}",
+        "Show all products under the {category_name} category",
+        "Fetch the product listing for the '{category_name}' category",
+        "What products are available in the {category_name} category?",
+    ],
+    2: [
+        "Retrieve the Wikipedia article for '{title}'",
+        "Fetch the Wikipedia page about '{title}'",
+        "Get the Wikipedia entry for '{title}'",
+        "Look up '{title}' on Wikipedia and return the article",
+    ],
+    3: [
+        "Find '{product_name}' in the store and add it to the shopping cart",
+        "Add '{product_name}' to the cart",
+        "Shop for '{product_name}' and put it in the cart",
+        "I want to buy '{product_name}' — add it to my cart",
+    ],
+    4: [
+        "Retrieve all posts in the '{forum_category}' forum (you must log in first)",
+        "Fetch the post list for the '{forum_category}' forum category",
+        "Get all threads in the '{forum_category}' forum section",
+        "List the forum posts under '{forum_category}' (authentication required)",
+    ],
+    5: [
+        "Create a post titled '{title}' in the '{category}' forum. Note: authentication is required.",
+        "Post a new thread called '{title}' in the '{category}' forum",
+        "Submit a forum post with the title '{title}' to the '{category}' section",
+        "Publish '{title}' as a new post in the '{category}' forum",
+    ],
+    6: [
+        "Complete a full guest checkout for '{product_name}'. The checkout involves multiple dependent steps — each step produces a value needed by the next. The task is complete when a confirmed order is placed.",
+        "Place a guest order for '{product_name}'. The process spans several API calls that build on each other; you are done when an order confirmation is received.",
+        "Buy '{product_name}' as a guest user and complete the checkout. Each stage of the checkout requires information returned by the previous stage.",
+        "Finish a guest checkout for '{product_name}'. Work through each step in sequence — the output of every step feeds into the next — until the order is confirmed.",
+    ],
+    7: [
+        "Create a new product in the admin panel with SKU '{sku}' and price {price}. Admin access is required.",
+        "Add a product to the catalog via the admin interface: SKU '{sku}', price {price}",
+        "As an admin, create a new product listing with SKU '{sku}' priced at {price}",
+        "Use admin credentials to create a product with SKU '{sku}' and a price of {price}",
+    ],
 }
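Each template now maps to a list of phrasings, which `_sample_task` consumes with `random.choice(...)` before `.format(**params)`. A self-contained sketch of that sampling step, trimmed to one template:

```python
import random

# Trimmed copy of the template-1 phrasings from the diff
TEMPLATE_DESCRIPTIONS = {
    1: [
        "List products in category {category_name}",
        "Show all products under the {category_name} category",
    ],
}

def sample_description(template_id: int, params: dict) -> str:
    # Pick one phrasing at random, then substitute the sampled parameters
    return random.choice(TEMPLATE_DESCRIPTIONS[template_id]).format(**params)

desc = sample_description(1, {"category_name": "Gear"})
```

Randomising the surface wording while keeping the parameters fixed discourages the policy from keying on an exact prompt string.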
|
| 152 |
|
| 153 |
|
|
|
|
| 176 |
         items = pool.get("category_name", [{"name": "Gear", "category_id": 3}])
         chosen = random.choice(items)
         params = {"category_name": chosen["name"], "category_id": chosen.get("category_id")}
+        description = random.choice(TEMPLATE_DESCRIPTIONS[1]).format(**params)

     elif template_id == 2:
         items = pool.get("title", [{"title": "Python (programming language)", "expected_slug": "Python_(programming_language)"}])

         chosen = random.choice(items)
         title = chosen.get("title", chosen) if isinstance(chosen, dict) else chosen
         params = {"title": title, "expected_slug": chosen.get("expected_slug", title.replace(" ", "_"))}
+        description = random.choice(TEMPLATE_DESCRIPTIONS[2]).format(**params)

     elif template_id == 3:
         items = pool.get("product_name", [{"name": "Radiant Tee", "sku": "MH01"}])

         chosen = random.choice(items)
         product_name = chosen.get("name", chosen) if isinstance(chosen, dict) else chosen
         sku = chosen.get("sku", "") if isinstance(chosen, dict) else ""
+        product_id = chosen.get("product_id") if isinstance(chosen, dict) else None
         params = {"product_name": product_name, "sku": sku}
+        if product_id:
+            params["product_id"] = product_id
+        description = random.choice(TEMPLATE_DESCRIPTIONS[3]).format(**params)

     elif template_id == 4:
         items = pool.get("forum_category", [{"slug": "general", "name": "General"}])

         chosen = random.choice(items)
         forum_cat = chosen.get("slug", chosen.get("name", "general")) if isinstance(chosen, dict) else chosen
         params = {"forum_category": forum_cat}
+        description = random.choice(TEMPLATE_DESCRIPTIONS[4]).format(**params)

     elif template_id == 5:
         categories = pool.get("forum_category", [{"slug": "general"}])

         chosen_title = random.choice(titles) if isinstance(titles[0], str) else random.choice(titles).get("title", "Test post")
         forum_cat = chosen_cat.get("slug", "general") if isinstance(chosen_cat, dict) else chosen_cat
         params = {"title": chosen_title, "category": forum_cat}
+        description = random.choice(TEMPLATE_DESCRIPTIONS[5]).format(**params)

     elif template_id == 6:
         items = pool.get("product_name", [{"name": "Radiant Tee", "sku": "MH01"}])

         chosen = random.choice(items)
         product_name = chosen.get("name", chosen) if isinstance(chosen, dict) else chosen
         sku = chosen.get("sku", "") if isinstance(chosen, dict) else ""
+        product_id = chosen.get("product_id") if isinstance(chosen, dict) else None
         params = {"product_name": product_name, "sku": sku}
+        if product_id:
+            params["product_id"] = product_id
+        description = random.choice(TEMPLATE_DESCRIPTIONS[6]).format(**params)

     elif template_id == 7:
         items = pool.get("admin_sku", [{"sku": "HAR-TEST-001", "price": "29.99"}])

         sku = chosen.get("sku", "HAR-TEST-001") if isinstance(chosen, dict) else chosen
         price = str(chosen.get("price", "29.99")) if isinstance(chosen, dict) else "29.99"
         params = {"sku": sku, "price": price}
+        description = random.choice(TEMPLATE_DESCRIPTIONS[7]).format(**params)

     else:
         params = {}

     return description, params, base_url
|
| 255 |
|
| 256 |
|
| 257 |
+
+def _load_fixed_task_from_env() -> dict | None:
+    """Load an exact task specification when the caller wants deterministic reset()."""
+    raw = os.environ.get("HARVGYM_TASK_SPEC_JSON", "").strip()
+    if not raw:
+        return None
+    try:
+        parsed = json.loads(raw)
+    except json.JSONDecodeError:
+        print("[HARvestGym] Ignoring invalid HARVGYM_TASK_SPEC_JSON", flush=True)
+        return None
+    return parsed if isinstance(parsed, dict) else None
|
| 268 |
+
|
| 269 |
+
|
| 270 |
# ---------------------------------------------------------------------------
|
| 271 |
# Environment
|
| 272 |
# ---------------------------------------------------------------------------
|
|
|
|
| 291 |
         self._episode_store: dict = {}        # embeddings, BM25 corpus, etc.
         self._called_paths: set = set()       # for new-path reward
         self._last_curl_commands: list = []   # for duplicate detection
+        self._called_methods_paths: list[tuple[str, str]] = []  # for same-path penalty
         self._step_rewards: list[float] = []
         self._done = False
|
| 297 |
|
|
|
|
| 354 |
         task_name = self._task_name
         if task_name in TASK_NAME_TO_TEMPLATE:
             return TASK_NAME_TO_TEMPLATE[task_name]
+        if task_name.startswith("easy_"):
+            return 1
+        if task_name.startswith("medium_"):
+            return 3
+        if task_name.startswith("hard_"):
+            return 6
         # Try integer
         try:
             tid = int(task_name)
|
|
|
|
| 373 |
         """Reset environment: clear episode state, sample new task."""
         from .episode import Episode, Task

+        fixed_task = _load_fixed_task_from_env()
+
+        if fixed_task:
+            template_id = int(fixed_task.get("template_id", self._get_template_id()))
+            meta = TEMPLATE_META.get(template_id, TEMPLATE_META[self._get_template_id()])
+            params = dict(fixed_task.get("params") or {})
+            # TEMPLATE_DESCRIPTIONS values are lists of phrasings, so a fallback
+            # description must sample one phrasing before formatting
+            description = fixed_task.get("description") or random.choice(TEMPLATE_DESCRIPTIONS[template_id]).format(**params)
+            base_url = fixed_task.get("base_url") or f"http://{EC2_HOST}:{meta['base_url_port']}/"
+            difficulty = fixed_task.get("difficulty") or meta["tier"]
+            app = fixed_task.get("app") or meta["app"]
+        else:
+            template_id = self._get_template_id()
+            description, params, base_url = _sample_task(template_id, self._parameter_pools)
+            meta = TEMPLATE_META[template_id]
+            difficulty = meta["tier"]
+            app = meta["app"]

         self._current_task = Task(
             template_id=template_id,
             description=description,
             params=params,
+            app=app,
             base_url=base_url,
+            difficulty=difficulty,
         )

         self._episode = Episode(task=self._current_task)
|
|
|
|
| 404 |
         self._episode_store = {}
         self._called_paths = set()
         self._last_curl_commands = []
+        self._called_methods_paths = []
         self._step_rewards = []
         self._done = False
         self._state = State(episode_id=str(uuid4()), step_count=0)
|
|
|
|
| 421 |
             reward=0.0,
             metadata={
                 "template_id": template_id,
+                "difficulty": difficulty,
+                "app": app,
             },
         )
|
| 428 |
|
|
|
|
| 474 |
                 headers=parsed["headers"],
                 body=parsed["body"],
                 status_code=resp.get("status_code", 0),
+                # Use _judge_body (full structured body) for judge grading;
+                # falls back to body (truncated) if not present
+                response_body=resp.get("_judge_body", resp.get("body")),
                 response_headers=resp.get("headers", {}),
             )
         except Exception:
|
|
|
|
| 581 |
             reward += PENALTY_MALFORMED_CURL
         elif 200 <= status < 300:
             reward += REWARD_VALID_API_CALL
+            # New path bonus + same-path penalty
             from urllib.parse import urlparse
             from .tools.browser_agent import _normalise_path
             try:
+                import shlex as _shlex
+                # Extract HTTP method (-X flag or infer from data flags)
+                _tokens = _shlex.split(command)
+                _method = "GET"
+                for _i, _tok in enumerate(_tokens):
+                    if _tok in ("-X", "--request") and _i + 1 < len(_tokens):
+                        _method = _tokens[_i + 1].upper()
+                        break
+                if _method == "GET" and any(t in command for t in ("-d ", "--data", "-F ")):
+                    _method = "POST"
+                _norm_path = None
+                for _t in _tokens:
+                    if _t.startswith("http"):
+                        _norm_path = _normalise_path(urlparse(_t.strip("'\"")).path)
                         break
+                if _norm_path:
+                    _mp = (_method, _norm_path)
+                    if _mp in self._called_methods_paths:
+                        reward += PENALTY_REPEATED_PATH
+                    self._called_methods_paths.append(_mp)
+                    if _norm_path not in self._called_paths:
+                        self._called_paths.add(_norm_path)
+                        reward += REWARD_NEW_PATH
|
| 611 |
             except Exception:
                 pass
         elif 400 <= status < 500:
|
server/tools/browser_agent.py
CHANGED
|
@@ -7,24 +7,17 @@ calls, REST endpoints, form submissions), and builds embeddings via the
 HuggingFace Inference API for semantic search_endpoints().

 Architecture:
 - Embeddings are cached on disk via embed_cache.py (max 2000 entries).
-  (old wrapped continuation lines partially captured) ... zero API cost.

-- Source priority:
-  1. HAR file (primary) — endpoints observed from browser traffic.
-     If HAR has < HAR_MIN_ENTRIES meaningful endpoints, it is a partial
-     recording and we augment with the API catalog (see below).
-  2. API catalog (fallback) — full structured spec extracted from source
-     code. Used ONLY when the HAR is sparse. This is equivalent to
-     the "live browser session" described in BROWSER_AGENT.md §Stage 2.

-- The catalog is ALSO used by the judge for parameter-sourcing grading.
-  It serves double duty, but the two uses are completely independent:
-  the judge compares tool call parameters against catalog ground truth,
-  while the agent uses catalog entries as a search corpus when HAR alone
-  is insufficient.
 """
|
| 29 |
|
| 30 |
from __future__ import annotations
|
|
@@ -43,11 +36,6 @@ import numpy as np
 # ---------------------------------------------------------------------------

 HARS_DIR = Path(__file__).parent.parent.parent / "hars"
-CATALOGS_DIR = Path(__file__).parent.parent.parent / "catalogs"
-
-# If a HAR yields fewer than this many unique endpoints it is considered a
-# partial recording and the API catalog is used to fill in the rest.
-HAR_MIN_ENTRIES = 10

 HAR_MAP: dict[str, str] = {
     ":7770": "shopping.har",
|
|
@@ -140,6 +128,11 @@ def _is_api_like(path: str, method: str, resp_ct: str, req_ct: str) -> bool:
     return False


 def _normalise_path(path: str) -> str:
     for pattern, replacement in _ID_PATTERNS:
         path = pattern.sub(replacement, path)
|
|
@@ -195,9 +188,14 @@ def extract_openapi_spec(har_data: dict, app_base_url: str) -> list[dict]:
     """
     Extract an OpenAPI-like spec from HAR data.

-    Includes:
-    (old bullet list not captured in this diff view)
     Excludes: static assets (JS/CSS/images/fonts), analytics, CDN.
     """
     entries = har_data.get("log", {}).get("entries", [])
     seen: set[str] = set()
     spec_entries = []

@@ -219,7 +217,10 @@ def extract_openapi_spec(har_data: dict, app_base_url: str) -> list[dict]:
         parsed_url = urlparse(raw_url)
         path = parsed_url.path

-        if not _is_api_like(path, method, resp_ct, req_ct):
             continue

         path_norm = _normalise_path(path)
|
|
@@ -233,16 +234,56 @@ def extract_openapi_spec(har_data: dict, app_base_url: str) -> list[dict]:
             for h in req.get("headers", [])
         )

-        spec_entries.append({
-            "
-            (remaining fields of the old single append block not captured in this diff view)
-        })

     return spec_entries
|
| 248 |
|
|
@@ -255,16 +296,26 @@ def spec_entry_to_text(entry: dict, app_name: str) -> str:
         f"status: {entry['status_code']}",
         f"auth: {'required' if entry['auth_observed'] else 'none'}",
     ]
-    if entry.get("
-        parts.append(
-        (remaining old body not captured in this diff view)
     return " | ".join(parts)
|
| 269 |
|
| 270 |
|
|
@@ -385,57 +436,9 @@ def embed_query_via_api(query: str) -> np.ndarray | None:
     return _embed_with_cache([query])


-def catalog_to_spec_entries(app_name: str) -> list[dict]:
-    """
-    Load the API catalog as spec entries.
-
-    Used ONLY when the HAR yields fewer than HAR_MIN_ENTRIES endpoints
-    (i.e. it is a partial/stub recording). This is equivalent to the
-    live-browser-session fallback described in BROWSER_AGENT.md §Stage 2.
-
-    The judge uses the same catalog for parameter-sourcing grading, but
-    the two uses are independent — the agent's search corpus and the
-    judge's ground-truth are different concepts that happen to share the
-    same underlying data file.
-    """
-    catalog_path = CATALOGS_DIR / f"{app_name}.json"
-    if not catalog_path.exists():
-        return []
-    try:
-        with open(catalog_path) as f:
-            data = json.load(f)
-        endpoints = data if isinstance(data, list) else data.get("endpoints", [])
-        spec_entries = []
-        for ep in endpoints:
-            endpoint_str = ep.get("endpoint", "")
-            if endpoint_str and " " in endpoint_str:
-                method, path = endpoint_str.split(" ", 1)
-                method = method.upper()
-            else:
-                path = ep.get("path", endpoint_str)
-                method = ep.get("method", "GET").upper()
-            if not path:
-                continue
-            auth = ep.get("auth", ep.get("authentication", "none"))
-            spec_entries.append({
-                "method": method,
-                "path": path,
-                "query_params": None,
-                "request_body": ep.get("body_params") or ep.get("body"),
-                "status_code": 200,
-                "response_content_type": "application/json",
-                "response_body_sample": ep.get("response_fields") or ep.get("response_sample"),
-                "auth_observed": auth not in ("none", "None", None, ""),
-            })
-        return spec_entries
-    except Exception as e:
-        print(f"[browser_agent] Could not load catalog '{app_name}': {e}", flush=True)
-        return []
-
-
 def build_endpoint_embeddings(spec_entries: list[dict], app_name: str):
     """
-    Build embeddings for
     Returns (embeddings_array, text_chunks).
     Embeddings are retrieved from or saved to the persistent cache.
     """
|
|
@@ -512,22 +515,6 @@ def run_browser_agent(task: str, url: str, episode_store=None) -> dict:
         flush=True,
     )

-    # Augment with catalog when HAR is a partial recording
-    # (The catalog = source-code-extracted API spec; serves the same role as a
-    # live browser session when no full HAR is available.)
-    if len(spec_entries) < HAR_MIN_ENTRIES:
-        catalog_entries = catalog_to_spec_entries(app_name)
-        if catalog_entries:
-            print(
-                f"[browser_agent] HAR sparse ({len(spec_entries)} entries < {HAR_MIN_ENTRIES}), "
-                f"augmenting from catalog ({len(catalog_entries)} entries)",
-                flush=True,
-            )
-            har_paths = {e["path"] for e in spec_entries}
-            for ce in catalog_entries:
-                if ce["path"] not in har_paths:
-                    spec_entries.append(ce)

     # Build / retrieve embeddings via cache
     if spec_entries and episode_store is not None:
         try:

@@ -547,14 +534,19 @@ def run_browser_agent(task: str, url: str, episode_store=None) -> dict:
             _store_empty(episode_store, app_name)

     summary = [{"method": e["method"], "path": e["path"]} for e in spec_entries]
     return {
         "app": app_name,
         "endpoints": summary,
         "total_endpoints": len(summary),
         "note": (
-            f"Discovered {
-            (old note text not fully captured in this diff view)
         ),
     }
|
| 560 |
|
|
|
|
| 7 |
 HuggingFace Inference API for semantic search_endpoints().

 Architecture:
+- The HAR file is the sole source of the agent's API knowledge.
+  The agent discovers endpoints only from what was recorded in the HAR.
+  If the HAR is sparse, the browser agent recording needs to be improved —
+  the product does not patch this by injecting other data sources.
+
+- The API catalog (catalogs/*.json) is used exclusively by the judge
+  for parameter-sourcing grading. It plays no role in the training loop.
+
 - Embeddings are cached on disk via embed_cache.py (max 2000 entries).
+  First run: calls HF Inference API. All subsequent episodes in the same
+  training run are pure cache hits — zero API cost.
 """
|
| 22 |
|
| 23 |
from __future__ import annotations
|
|
|
|
| 36 |
 # ---------------------------------------------------------------------------

 HARS_DIR = Path(__file__).parent.parent.parent / "hars"

 HAR_MAP: dict[str, str] = {
     ":7770": "shopping.har",
|
|
|
|
| 128 |
     return False


+def _is_html_page(method: str, resp_ct: str) -> bool:
+    """Return True for HTML GET responses that may contain SSR data."""
+    return method == "GET" and "text/html" in resp_ct
+
+
 def _normalise_path(path: str) -> str:
     for pattern, replacement in _ID_PATTERNS:
         path = pattern.sub(replacement, path)
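`_normalise_path` relies on `_ID_PATTERNS`, which sits outside this hunk. A hypothetical sketch of what such ID-collapsing patterns can look like; the two regexes below are assumptions for illustration, not the repo's actual pattern list:

```python
import re

# Hypothetical ID-collapsing patterns in the spirit of _ID_PATTERNS:
# UUID-like segments and numeric segments become placeholders, so
# /orders/42 and /orders/43 normalise to the same endpoint key.
ID_PATTERNS = [
    (re.compile(r"/[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}(?=/|$)"), "/{uuid}"),
    (re.compile(r"/\d+(?=/|$)"), "/{id}"),
]

def normalise_path(path: str) -> str:
    for pattern, replacement in ID_PATTERNS:
        path = pattern.sub(replacement, path)
    return path
```

Collapsing IDs is what makes the new-path reward and the repeated-path penalty operate on endpoint shapes rather than on individual resources.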
|
|
|
|
| 188 |
     """
     Extract an OpenAPI-like spec from HAR data.

+    Includes:
+    - REST calls, XHR/fetch, form POSTs, any JSON-responding GET
+    - HTML GET pages that have a non-empty response body (distilled via html_distiller)
+
     Excludes: static assets (JS/CSS/images/fonts), analytics, CDN.
     """
+    from .html_distiller import distill_html
+
     entries = har_data.get("log", {}).get("entries", [])
     seen: set[str] = set()
     spec_entries = []
|
|
|
|
| 217 |
         parsed_url = urlparse(raw_url)
         path = parsed_url.path

+        is_html = _is_html_page(method, resp_ct)
+        is_api = _is_api_like(path, method, resp_ct, req_ct)
+
+        if not is_api and not is_html:
             continue

         path_norm = _normalise_path(path)
|
|
|
|
| 234 |
for h in req.get("headers", [])
|
| 235 |
)
|
| 236 |
|
| 237 |
+
+        if is_html:
+            # Attempt to distil the HTML body captured in the HAR
+            html_body = entry.get("response", {}).get("content", {}).get("text", "") or ""
+            if not html_body:
+                # HAR was recorded without "Save response body" — still include the
+                # page as a stub so the agent knows the route exists
+                distilled = None
+                distilled_summary = None
+            else:
+                distilled = distill_html(html_body, base_url=raw_url)
+                # Build a short summary for the spec text (used for embeddings)
+                blob_count = len(distilled.get("data_blobs", []))
+                form_count = len(distilled.get("forms", []))
+                blob_keys = []
+                for b in distilled.get("data_blobs", [])[:3]:
+                    blob_keys.extend(b.get("keys", [])[:5])
+                distilled_summary = {
+                    "page_type": distilled.get("page_type"),
+                    "title": distilled.get("title"),
+                    "data_blobs": blob_count,
+                    "forms": form_count,
+                    "blob_top_keys": blob_keys[:20],
+                    "text_preview": (distilled.get("text") or "")[:200],
+                }
+
+            spec_entries.append({
+                "method": method,
+                "path": path_norm,
+                "query_params": parsed_url.query or None,
+                "request_body": None,
+                "status_code": resp.get("status", 0),
+                "response_content_type": resp_ct,
+                "response_body_sample": distilled_summary,
+                "auth_observed": has_auth,
+                "is_html_page": True,
+                # Store full distilled dict so the agent can retrieve it via search_endpoints
+                "_distilled": distilled,
+            })
+        else:
+            spec_entries.append({
+                "method": method,
+                "path": path_norm,
+                "query_params": parsed_url.query or None,
+                "request_body": _extract_body(req),
+                "status_code": resp.get("status", 0),
+                "response_content_type": resp_ct,
+                "response_body_sample": _truncate_response_sample(resp),
+                "auth_observed": has_auth,
+                "is_html_page": False,
+            })

     return spec_entries
|
| 289 |
|
|
|
|
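The `distilled_summary` construction above can be exercised on its own. This sketch copies the summary logic into a standalone helper; the helper name and the sample `distilled` data are invented for illustration, but the output shape matches the spec-entry field built by the diff:

```python
def build_distilled_summary(distilled: dict) -> dict:
    """Condense a distilled HTML page into the small dict embedded in a spec entry."""
    blob_keys = []
    # Only the first 3 blobs and first 5 keys of each contribute to the preview
    for b in distilled.get("data_blobs", [])[:3]:
        blob_keys.extend(b.get("keys", [])[:5])
    return {
        "page_type": distilled.get("page_type"),
        "title": distilled.get("title"),
        "data_blobs": len(distilled.get("data_blobs", [])),
        "forms": len(distilled.get("forms", [])),
        "blob_top_keys": blob_keys[:20],
        "text_preview": (distilled.get("text") or "")[:200],
    }

# Hypothetical distilled page, shaped like html_distiller's output
distilled = {
    "page_type": "data_page",
    "title": "Checkout",
    "data_blobs": [{"keys": ["cart", "totals", "items"]}],
    "forms": [{"action": "/checkout", "method": "POST", "fields": {}}],
    "text": "Order summary " * 50,
}
summary = build_distilled_summary(distilled)
```

This keeps the embedding corpus small: only counts, top keys, and a 200-char text preview survive into the spec entry, while the full distilled dict travels separately under `_distilled`.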
             f"status: {entry['status_code']}",
             f"auth: {'required' if entry['auth_observed'] else 'none'}",
         ]
+        if entry.get("is_html_page"):
+            parts.append("type: html_page")
+            sample = entry.get("response_body_sample") or {}
+            if sample.get("title"):
+                parts.append(f"title: {sample['title']}")
+            if sample.get("blob_top_keys"):
+                parts.append(f"data_keys: {' '.join(sample['blob_top_keys'][:15])}")
+            if sample.get("text_preview"):
+                parts.append(f"text: {sample['text_preview'][:200]}")
+        else:
+            if entry.get("query_params"):
+                parts.append(f"query: {entry['query_params']}")
+            if entry.get("request_body"):
+                body = entry["request_body"]
+                body_str = json.dumps(body)[:_BODY_SAMPLE_CHARS] if not isinstance(body, str) else body[:_BODY_SAMPLE_CHARS]
+                parts.append(f"body: {body_str}")
+            if entry.get("response_body_sample") is not None:
+                rsp = entry["response_body_sample"]
+                rsp_str = json.dumps(rsp)[:_BODY_SAMPLE_CHARS] if not isinstance(rsp, str) else str(rsp)[:_BODY_SAMPLE_CHARS]
+                parts.append(f"response_sample: {rsp_str}")
         return " | ".join(parts)
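The builder above flattens a spec entry into one `key: value | key: value` string for embedding. A standalone sketch of the non-HTML branch — `_BODY_SAMPLE_CHARS` is assumed to be a module-level cap in the real file; 300 here is an arbitrary stand-in, and the entry data is invented:

```python
import json

_BODY_SAMPLE_CHARS = 300  # assumed cap; the real module defines its own value

def spec_entry_to_text(entry: dict) -> str:
    # One flat string per endpoint — this is what gets embedded,
    # so only high-signal fields are included.
    parts = [
        f"{entry['method']} {entry['path']}",
        f"status: {entry['status_code']}",
        f"auth: {'required' if entry['auth_observed'] else 'none'}",
    ]
    if entry.get("query_params"):
        parts.append(f"query: {entry['query_params']}")
    if entry.get("request_body"):
        body = entry["request_body"]
        body_str = body[:_BODY_SAMPLE_CHARS] if isinstance(body, str) else json.dumps(body)[:_BODY_SAMPLE_CHARS]
        parts.append(f"body: {body_str}")
    return " | ".join(parts)

text = spec_entry_to_text({
    "method": "POST", "path": "/rest/V1/carts/mine/items",
    "status_code": 200, "auth_observed": True,
    "query_params": None, "request_body": {"qty": 1},
})
```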
     return _embed_with_cache([query])


 def build_endpoint_embeddings(spec_entries: list[dict], app_name: str):
     """
+    Build embeddings for HAR-extracted spec entries.
     Returns (embeddings_array, text_chunks).
     Embeddings are retrieved from or saved to the persistent cache.
     """
 …
         flush=True,
     )
     # Build / retrieve embeddings via cache
     if spec_entries and episode_store is not None:
         try:
 …
             _store_empty(episode_store, app_name)

     summary = [{"method": e["method"], "path": e["path"]} for e in spec_entries]
+    api_count = sum(1 for e in spec_entries if not e.get("is_html_page"))
+    html_count = sum(1 for e in spec_entries if e.get("is_html_page"))
     return {
         "app": app_name,
         "endpoints": summary,
         "total_endpoints": len(summary),
+        "api_endpoints": api_count,
+        "html_pages": html_count,
         "note": (
+            f"Discovered {api_count} API endpoints and {html_count} HTML page(s) "
+            f"from recorded traffic. "
+            "Use search_endpoints(query) to get full schema, parameters, auth details, "
+            "and page content (for HTML pages: embedded data blobs, forms, CSRF tokens)."
         ),
     }
server/tools/curl_exec.py
CHANGED

@@ -375,31 +375,49 @@ def curl_exec(command: str, session_state: dict, episode_store: dict,
     except (json.JSONDecodeError, ValueError):
         parsed_body = body_text

-    #
+    # Distil HTML responses into structured compact form
+    is_html_response = "text/html" in resp_ct
+    if is_html_response and isinstance(parsed_body, str) and parsed_body:
+        from .html_distiller import distill_html, distill_html_compact
+        distilled = distill_html(parsed_body, base_url=parsed["url"])
+        # Auto-extract form_key from HTML forms into session_state for reuse
+        for form in distilled.get("forms", []):
+            fk = form.get("fields", {}).get("form_key")
+            if fk and fk != "hidden":
+                session_state["form_key"] = fk
+                break
+        # Store the full distilled dict (not raw HTML) for search_episode_data
+        raw_body_for_store = distilled
+        # What we return to the agent is the compact text summary
+        truncated_body: Any = distill_html_compact(parsed_body, base_url=parsed["url"])
+    else:
+        raw_body_for_store = parsed_body
+        # Extract tokens from body (only for non-HTML responses)
+        _extract_tokens_from_body(parsed_body, session_state)
+        # Apply smart truncation
+        if status_code >= 400:
+            # Never truncate errors
+            truncated_body = parsed_body
+        else:
+            body_for_truncation = body_text if isinstance(parsed_body, str) else json.dumps(parsed_body)
+            truncated_body = smart_truncate(body_for_truncation, resp_ct)

     # Index into episode BM25 store
     _index_into_episode_store(
         episode_store=episode_store,
         request_body=parsed["body"],
-        response_body=
+        response_body=raw_body_for_store,
         url=parsed["url"],
         method=parsed["method"],
         status_code=status_code,
     )

-    # Apply smart truncation
-    if status_code >= 400:
-        # Never truncate errors
-        truncated_body = parsed_body
-    else:
-        body_for_truncation = body_text if isinstance(parsed_body, str) else json.dumps(parsed_body)
-        truncated_body = smart_truncate(body_for_truncation, resp_ct)
-
     return {
         "status_code": status_code,
         "headers": resp_headers,
         "body": truncated_body,
+        # _judge_body: full structured body for the judge (not shown to the model)
+        # For HTML: the distilled dict; for JSON/text: same as body
+        "_judge_body": raw_body_for_store,
     }

@@ -410,10 +428,18 @@ def curl_exec(command: str, session_state: dict, episode_store: dict,
 def _index_into_episode_store(episode_store: dict, request_body: Any,
                               response_body: Any, url: str, method: str,
                               status_code: int) -> None:
-    """
+    """
+    Index request/response into the episode store for search_episode_data().
+
+    Three parallel structures are maintained:
+        bm25_corpus        — truncated text strings for BM25 / embedding (lean, fast)
+        bm25_metadata      — url/method/status_code per entry (no body, saves memory)
+        episode_raw_bodies — {index: full_untruncated_response_body} for retrieval
+    """
     if "bm25_corpus" not in episode_store:
         episode_store["bm25_corpus"] = []
         episode_store["bm25_metadata"] = []
+        episode_store["episode_raw_bodies"] = {}

     def _to_text(obj: Any) -> str:
         if obj is None:

@@ -422,13 +448,24 @@ def _index_into_episode_store(episode_store: dict, request_body: Any,
             return obj
         return json.dumps(obj)

+    # Lean text for BM25 / embedding — cap at 2000 chars so embeddings stay within
+    # the model's token limit without losing the key signal (url + first part of body).
+    # For distilled HTML (stored as a dict), serialize the distilled form — it's already
+    # compact (text content, blob keys, form actions) rather than raw HTML.
+    resp_text = _to_text(response_body)
+    lean_resp = resp_text[:2000] if len(resp_text) > 2000 else resp_text
+
+    entry_text = (
+        f"url: {url} method: {method} status: {status_code} "
+        f"request: {_to_text(request_body)} response: {lean_resp}"
+    )

+    idx = len(episode_store["bm25_corpus"])
     episode_store["bm25_corpus"].append(entry_text)
     episode_store["bm25_metadata"].append({
         "url": url,
         "method": method,
         "status_code": status_code,
-        "response_body": response_body,
     })
+    # Store full untruncated body keyed by index — never truncated
+    episode_store["episode_raw_bodies"][idx] = response_body
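The three parallel structures in `_index_into_episode_store` can be sketched end to end with plain dicts; the helper name is invented and no BM25 dependency is needed to see the bookkeeping:

```python
import json

def index_call(store: dict, url: str, method: str, status: int,
               request_body, response_body) -> int:
    """Append one call to the episode store; returns the entry index."""
    store.setdefault("bm25_corpus", [])
    store.setdefault("bm25_metadata", [])
    store.setdefault("episode_raw_bodies", {})

    resp_text = response_body if isinstance(response_body, str) else json.dumps(response_body)
    entry_text = (f"url: {url} method: {method} status: {status} "
                  f"request: {json.dumps(request_body)} response: {resp_text[:2000]}")

    idx = len(store["bm25_corpus"])
    store["bm25_corpus"].append(entry_text)                   # lean, searchable text
    store["bm25_metadata"].append({"url": url, "method": method,
                                   "status_code": status})    # no body — saves memory
    store["episode_raw_bodies"][idx] = response_body          # full, untruncated
    return idx

store: dict = {}
big_body = {"items": list(range(5000))}
i = index_call(store, "http://shop.local/rest/V1/products", "GET", 200, None, big_body)
```

Searching hits the lean corpus; retrieval by index then returns the untruncated body, so large responses never bloat the search text.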
server/tools/html_distiller.py
ADDED

@@ -0,0 +1,485 @@
"""
html_distiller — technology-agnostic HTML distillation for the RL agent.

Converts an HTML response body into a compact, structured dict that the agent
and the embedding index can work with. Raw HTML is never returned as-is —
it is expensive (200 KB+) and mostly noise (CSS classes, JS bundles, nav chrome).

What is extracted (in priority order):
  1. Embedded JSON data blobs — server-injected structured data that is the
     *actual payload* for SSR pages:
       • <script type="application/json">               (Next.js, generic)
       • <script type="text/x-magento-init">            (Magento 2)
       • window.__INITIAL_STATE__ = {...}               (Redux-style SSR)
       • window.__NEXT_DATA__ = {...}                   (Next.js legacy)
       • window.__nuxt__ = {...} / window.__NUXT__ = {} (Nuxt.js)
       • <script id="__NEXT_DATA__">                    (Next.js)
       • Any <script> tag containing only valid JSON
     These are technology-specific patterns, but the extraction logic is written
     generically — it looks for the common conventions rather than hardcoding
     Magento. A React/Next.js app will be handled by the same code path.

  2. HTML forms — discovers new POST endpoints (form.action) and captures
     auth-critical fields (CSRF tokens, hidden inputs).

  3. Visible text content — the human-readable body after stripping all
     scripts, styles, and nav/header/footer chrome. Capped at MAX_TEXT_CHARS.

Output schema (always a dict with the same keys — absent items are None/[]):
    {
      "page_type": str,            # "data_page" | "form_page" | "text_page"
      "title": str | None,         # <title> text
      "description": str | None,   # <meta name="description">
      "data_blobs": [              # extracted JSON payloads
          {"source": str, "data": any, "keys": [str]}  # keys = top-level keys
      ],
      "forms": [
          {
            "action": str,         # endpoint URL (relative or absolute)
            "method": str,         # GET | POST
            "fields": {            # name → value (includes hidden inputs)
                "field_name": "field_value_or_type"
            }
          }
      ],
      "text": str | None,          # stripped visible text (capped)
      "raw_truncated": str,        # first RAW_PREVIEW_CHARS of raw HTML (fallback)
    }

Usage:
    from server.tools.html_distiller import distill_html

    result = distill_html(html_string, base_url="http://example.com/page")
    # result["data_blobs"] — structured data, e.g. product listings
    # result["forms"]      — form actions + CSRF tokens
    # result["text"]       — stripped readable text
"""

from __future__ import annotations

import json
import re
from typing import Any
from urllib.parse import urljoin

try:
    from bs4 import BeautifulSoup
    _BS4_AVAILABLE = True
except ImportError:
    _BS4_AVAILABLE = False


# ---------------------------------------------------------------------------
# Constants
# ---------------------------------------------------------------------------

MAX_TEXT_CHARS = 20000        # max chars of stripped visible text to keep
MAX_BLOB_KEYS = 40            # max top-level keys to surface from a JSON blob
MAX_BLOB_DEPTH_PREVIEW = 2    # how many levels of nesting to summarise
RAW_PREVIEW_CHARS = 1000      # fallback raw HTML preview if BS4 unavailable
MAX_BLOBS = 10                # max embedded JSON blobs to extract
MAX_FORMS = 5                 # max forms to extract
MAX_ITEMS_IN_ARRAY = 3        # preview items for large arrays in blobs


# ---------------------------------------------------------------------------
# Public entry point
# ---------------------------------------------------------------------------

def distill_html(html: str, base_url: str = "") -> dict:
    """
    Distil an HTML page into a structured, compact representation.

    Args:
        html: Raw HTML string (may be very large).
        base_url: The URL this page was fetched from, used to resolve
            relative URLs in form actions.

    Returns:
        Distilled dict (see module docstring for schema).
    """
    if not html:
        return _empty_result()

    if not _BS4_AVAILABLE:
        return {
            **_empty_result(),
            "raw_truncated": html[:RAW_PREVIEW_CHARS],
            "_note": "beautifulsoup4 not installed; only raw preview returned.",
        }

    try:
        # lxml is faster and more forgiving than html.parser for large pages
        soup = BeautifulSoup(html, "lxml")
    except Exception:
        soup = BeautifulSoup(html, "html.parser")

    title = _extract_title(soup)
    description = _extract_meta_description(soup)
    data_blobs = _extract_data_blobs(soup)
    forms = _extract_forms(soup, base_url)
    text = _extract_visible_text(soup)

    # Determine page_type based on what we found
    if data_blobs:
        page_type = "data_page"
    elif forms:
        page_type = "form_page"
    else:
        page_type = "text_page"

    return {
        "page_type": page_type,
        "title": title,
        "description": description,
        "data_blobs": data_blobs,
        "forms": forms,
        "text": text,
        "raw_truncated": html[:RAW_PREVIEW_CHARS],
    }


def distill_html_compact(html: str, base_url: str = "") -> str:
    """
    Return a compact text representation of the distilled HTML,
    suitable for returning to the agent in curl_exec responses.

    Aims for < 3000 chars while preserving all actionable information.
    """
    d = distill_html(html, base_url)

    parts: list[str] = []

    if d["title"]:
        parts.append(f"[Page: {d['title']}]")

    if d["description"]:
        parts.append(f"[Description: {d['description']}]")

    if d["data_blobs"]:
        parts.append(f"[Embedded data — {len(d['data_blobs'])} block(s)]")
        for i, blob in enumerate(d["data_blobs"]):
            src = blob.get("source", "?")
            data = blob.get("data")
            preview = _compact_blob_preview(data)
            parts.append(f"  blob[{i}] from <{src}>: {preview}")

    if d["forms"]:
        parts.append(f"[Forms — {len(d['forms'])} found]")
        for form in d["forms"]:
            action = form["action"] or "(current page)"
            method = form["method"]
            fields = form["fields"]
            # Strip noisy base64-encoded redirect fields; keep actionable fields only
            _SKIP_FIELDS = {"uenc"}
            clean_fields = {k: v for k, v in fields.items() if k not in _SKIP_FIELDS}
            csrf = {k: v for k, v in clean_fields.items()
                    if "csrf" in k.lower() or "token" in k.lower()
                    or k.startswith("_") or clean_fields.get(k, "") == "hidden"}
            field_summary = ", ".join(f"{k}={repr(v)}" for k, v in list(clean_fields.items())[:6])
            parts.append(f"  {method} {action}")
            parts.append(f"    fields: {field_summary}")
            if csrf:
                parts.append(f"    csrf/hidden: {csrf}")

    if d["text"]:
        parts.append(f"[Text content]\n{d['text'][:800]}")

    result = "\n".join(parts)
    if not result:
        # Absolute fallback: raw preview
        return html[:RAW_PREVIEW_CHARS]
    return result


# ---------------------------------------------------------------------------
# Extraction helpers
# ---------------------------------------------------------------------------

def _extract_title(soup) -> str | None:
    tag = soup.find("title")
    if tag:
        return tag.get_text(strip=True) or None
    return None


def _extract_meta_description(soup) -> str | None:
    tag = soup.find("meta", attrs={"name": "description"})
    if tag and tag.get("content"):
        return tag["content"].strip() or None
    return None


# Patterns for window.X = {...} assignments in inline scripts
_WINDOW_ASSIGN_RE = re.compile(
    r'window\.__?([A-Za-z0-9_]+)__?\s*=\s*(\{.*?\}|\[.*?\])',
    re.DOTALL,
)

# Known SSR data script types
_DATA_SCRIPT_TYPES = {
    "application/json",
    "text/x-magento-init",
    "application/ld+json",  # structured data / schema.org
}

# Known SSR script IDs
_DATA_SCRIPT_IDS = {
    "__next_data__",
    "__nuxt__",
    "initial-state",
    "redux-state",
    "app-state",
    "page-data",
    "server-data",
    "bootstrap-data",
}


def _try_parse_json(text: str) -> tuple[bool, Any]:
    """Returns (success, parsed_value)."""
    text = text.strip()
    if not text:
        return False, None
    try:
        return True, json.loads(text)
    except (json.JSONDecodeError, ValueError):
        return False, None
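The `window.X = {...}` scan can be tried in isolation with the standard library only — the regex is copied from `_WINDOW_ASSIGN_RE` above, the parser mirrors `_try_parse_json`, and the script content is invented:

```python
import json
import re

# Same convention the distiller scans for: window.__NAME__ = {...};
WINDOW_ASSIGN_RE = re.compile(
    r'window\.__?([A-Za-z0-9_]+)__?\s*=\s*(\{.*?\}|\[.*?\])',
    re.DOTALL,
)

def try_parse_json(text):
    """Mirror of _try_parse_json: returns (success, parsed_value)."""
    text = text.strip()
    if not text:
        return False, None
    try:
        return True, json.loads(text)
    except (json.JSONDecodeError, ValueError):
        return False, None

script = 'window.__APP_CONFIG__ = {"currency": "USD", "locale": "en_US"};'
m = WINDOW_ASSIGN_RE.search(script)
# Note: the lazy body match (\{.*?\}) stops at the first '}', so assignments
# with nested objects may capture a partial body — try_parse_json then fails
# to parse it and the candidate is discarded quietly.
ok, payload = try_parse_json(m.group(2))
```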
def _summarise_json_keys(obj: Any, depth: int = 0) -> list[str]:
    """Return top-level keys (and one level of nested keys) for a JSON object."""
    if not isinstance(obj, dict):
        if isinstance(obj, list) and obj:
            return _summarise_json_keys(obj[0], depth)
        return []
    keys = list(obj.keys())
    if depth < 1:
        nested = []
        for k in keys[:5]:
            v = obj[k]
            if isinstance(v, dict):
                sub = list(v.keys())[:5]
                nested.append(f"{k}.{{{','.join(sub)}}}")
            elif isinstance(v, list) and v and isinstance(v[0], dict):
                sub = list(v[0].keys())[:4]
                nested.append(f"{k}[].{{{','.join(sub)}}}")
        return keys + nested
    return keys
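Run on a toy blob, `_summarise_json_keys` (copied here as a standalone function; the sample data is invented) produces the flat key previews that end up in the embedding text:

```python
def summarise_json_keys(obj, depth=0):
    """Top-level keys plus a one-level preview of nested dict / list-of-dict keys."""
    if not isinstance(obj, dict):
        if isinstance(obj, list) and obj:
            return summarise_json_keys(obj[0], depth)
        return []
    keys = list(obj.keys())
    if depth < 1:
        nested = []
        for k in keys[:5]:
            v = obj[k]
            if isinstance(v, dict):
                sub = list(v.keys())[:5]
                nested.append(f"{k}.{{{','.join(sub)}}}")
            elif isinstance(v, list) and v and isinstance(v[0], dict):
                sub = list(v[0].keys())[:4]
                nested.append(f"{k}[].{{{','.join(sub)}}}")
        return keys + nested
    return keys

blob = {"cart": {"items": [], "total": 9.5}, "products": [{"sku": "A1", "price": 3}]}
keys = summarise_json_keys(blob)
```

The `cart.{items,total}` / `products[].{sku,price}` notation is what lets the agent guess field names before it ever retrieves the full blob.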
def _extract_data_blobs(soup) -> list[dict]:
    """
    Extract all embedded JSON data blobs from <script> tags and window.X = {...} patterns.
    """
    blobs: list[dict] = []
    seen_sources: set[str] = set()

    # 1. <script type="..."> tags with known data types
    for script in soup.find_all("script"):
        if len(blobs) >= MAX_BLOBS:
            break

        script_type = (script.get("type") or "").lower().strip()
        script_id = (script.get("id") or "").lower().strip()
        text = script.string or ""

        source = None
        if script_type in _DATA_SCRIPT_TYPES:
            source = script_type
        elif script_id in _DATA_SCRIPT_IDS:
            source = f"id={script.get('id')}"
        elif script_type in ("", "text/javascript", "module"):
            # Check for window.X = {...} patterns
            for m in _WINDOW_ASSIGN_RE.finditer(text):
                var_name = f"window.__{m.group(1)}__"
                ok, data = _try_parse_json(m.group(2))
                if ok and isinstance(data, (dict, list)):
                    source_key = var_name
                    if source_key not in seen_sources:
                        seen_sources.add(source_key)
                        blobs.append({
                            "source": var_name,
                            "data": _preview_blob(data),
                            "keys": _summarise_json_keys(data)[:MAX_BLOB_KEYS],
                        })
            continue  # already handled window patterns above
        else:
            continue

        if not text.strip():
            continue

        ok, data = _try_parse_json(text)
        if not ok:
            continue

        # Skip tiny or trivially small blobs (no useful data)
        if isinstance(data, dict) and len(data) <= 1 and not any(
            isinstance(v, (dict, list)) for v in data.values()
        ):
            continue

        source_key = f"{source}:{script_id or 'anon'}"
        if source_key in seen_sources:
            continue
        seen_sources.add(source_key)

        blobs.append({
            "source": source,
            "data": _preview_blob(data),
            "keys": _summarise_json_keys(data)[:MAX_BLOB_KEYS],
        })

    return blobs


def _preview_blob(data: Any) -> Any:
    """
    Return a compact preview of a JSON blob — large arrays are trimmed,
    deeply nested objects are summarised.
    """
    if isinstance(data, list):
        if len(data) > MAX_ITEMS_IN_ARRAY:
            return {
                "sample": [_preview_blob(item) for item in data[:MAX_ITEMS_IN_ARRAY]],
                "total": len(data),
                "_note": f"{len(data)} items total. Use search_episode_data() for specifics.",
            }
        return [_preview_blob(item) for item in data]

    if isinstance(data, dict):
        result = {}
        for k, v in list(data.items())[:MAX_BLOB_KEYS]:
            if isinstance(v, list) and len(v) > MAX_ITEMS_IN_ARRAY:
                result[k] = {
                    "sample": [_preview_blob(i) for i in v[:MAX_ITEMS_IN_ARRAY]],
                    "total": len(v),
                    "_note": f"{len(v)} items. Use search_episode_data() for specifics.",
                }
            elif isinstance(v, dict) and len(v) > 30:
                # Only collapse very large dicts — preserve small-to-medium ones fully
                # since they often contain critical IDs (e.g. product option configs)
                result[k] = {
                    "_keys": list(v.keys())[:20],
                    "_note": "large nested object — call search_episode_data() for full content",
                }
            else:
                result[k] = v
        return result

    return data
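A simplified standalone sketch of `_preview_blob`'s array trimming — the real function additionally collapses nested dicts with more than 30 keys, a branch omitted here; the sample data is invented:

```python
MAX_ITEMS_IN_ARRAY = 3

def preview_blob(data):
    """Trim large arrays to a 3-item sample plus a count; recurse elsewhere."""
    if isinstance(data, list) and len(data) > MAX_ITEMS_IN_ARRAY:
        return {
            "sample": [preview_blob(i) for i in data[:MAX_ITEMS_IN_ARRAY]],
            "total": len(data),
            "_note": f"{len(data)} items total. Use search_episode_data() for specifics.",
        }
    if isinstance(data, list):
        return [preview_blob(i) for i in data]
    if isinstance(data, dict):
        return {k: preview_blob(v) for k, v in data.items()}
    return data

big = {"products": [{"sku": i} for i in range(50)]}
p = preview_blob(big)
```

The `_note` string doubles as a breadcrumb for the agent: the full array is still retrievable through `search_episode_data()`, only the inline preview is trimmed.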
def _extract_forms(soup, base_url: str) -> list[dict]:
    """
    Extract all forms: action URL, method, and all named fields with their values.
    Hidden inputs (CSRF tokens, form_key, etc.) are included.
    """
    forms = []
    for form in soup.find_all("form")[:MAX_FORMS]:
        action = form.get("action", "") or ""
        if base_url and action and not action.startswith("http"):
            action = urljoin(base_url, action)
        method = (form.get("method") or "GET").upper()

        fields: dict[str, str] = {}
        for inp in form.find_all(["input", "select", "textarea"]):
            name = inp.get("name")
            if not name:
                continue
            inp_type = (inp.get("type") or "text").lower()
            value = inp.get("value", "")
            if inp_type == "hidden":
                # Hidden inputs: store actual value (CSRF tokens etc.)
                fields[name] = value
            elif inp_type in ("submit", "button", "reset"):
                continue
            elif inp_type == "checkbox":
                fields[name] = "checkbox"
            elif inp_type == "radio":
                if name not in fields:
                    fields[name] = "radio"
            else:
                # text, email, password, number, etc.
                fields[name] = inp_type if not value else value

        forms.append({
            "action": action,
            "method": method,
            "fields": fields,
        })

    return forms
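Relative form actions are resolved against the page URL with `urllib.parse.urljoin`, exactly as `_extract_forms` does; the URLs below are hypothetical:

```python
from urllib.parse import urljoin

base_url = "http://shop.local/customer/account/login/"  # hypothetical page URL

# A relative action resolves against the page's directory
resolved = urljoin(base_url, "../loginPost/")

# An absolute action passes through unchanged (the distiller skips
# urljoin for actions that already start with "http")
absolute = urljoin(base_url, "http://shop.local/newsletter/subscriber/new/")
```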
# Tags whose text content is irrelevant noise
|
| 417 |
+
_NOISE_TAGS = {
|
| 418 |
+
"script", "style", "noscript", "head", "meta", "link",
|
| 419 |
+
"header", "footer", "nav", "aside",
|
| 420 |
+
"svg", "path", "symbol",
|
| 421 |
+
"[document]",
|
| 422 |
+
}
|
| 423 |
+
|
| 424 |
+
|
| 425 |
+
def _extract_visible_text(soup) -> str | None:
|
| 426 |
+
"""
|
| 427 |
+
Extract visible text content from the page.
|
| 428 |
+
|
| 429 |
+
Strips scripts, styles, navigation, and other noise.
|
| 430 |
+
    Returns plain text, capped at MAX_TEXT_CHARS.
    """
    # Remove noise tags in-place
    for tag in soup.find_all(_NOISE_TAGS):
        tag.decompose()

    # Get text from what's left — use separator so words don't jam together
    text = soup.get_text(separator=" ", strip=True)

    # Collapse whitespace
    text = re.sub(r"\s{2,}", " ", text).strip()

    if not text:
        return None

    return text[:MAX_TEXT_CHARS]


def _compact_blob_preview(data: Any) -> str:
    """One-line preview of a JSON blob for the compact text representation."""
    if data is None:
        return "null"
    if isinstance(data, bool):
        return str(data).lower()
    if isinstance(data, (int, float)):
        return str(data)
    if isinstance(data, str):
        return repr(data[:80])
    if isinstance(data, list):
        if data:
            first_keys = list(data[0].keys())[:4] if isinstance(data[0], dict) else []
            return f"array({len(data)} items), first keys: {first_keys}"
        return f"array({len(data)} items)"
    if isinstance(data, dict):
        # If it has a "_note" and "total", it's our preview wrapper
        if "_note" in data and "total" in data:
            sample = data.get("sample", [])
            keys = list(sample[0].keys())[:4] if sample and isinstance(sample[0], dict) else []
            return f"array({data['total']} items), first item keys: {keys}"
        keys = list(data.keys())[:8]
        return f"object({len(data)} keys): {keys}"
    return str(data)[:100]


def _empty_result() -> dict:
    return {
        "page_type": "text_page",
        "title": None,
        "description": None,
        "data_blobs": [],
        "forms": [],
        "text": None,
        "raw_truncated": "",
    }
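The dispatch in `_compact_blob_preview` can be exercised standalone. This is a sketch that re-declares the same logic (minus the preview-wrapper special case) under a hypothetical name; the field names and values in the sample calls are illustrative, not taken from a real catalog or HAR.

```python
from typing import Any


def compact_blob_preview(data: Any) -> str:
    """Standalone copy of the one-line preview dispatch, for illustration."""
    if data is None:
        return "null"
    if isinstance(data, bool):
        return str(data).lower()
    if isinstance(data, (int, float)):
        return str(data)
    if isinstance(data, str):
        return repr(data[:80])
    if isinstance(data, list):
        if data:
            # Peek at the first element's keys when it is a dict
            first_keys = list(data[0].keys())[:4] if isinstance(data[0], dict) else []
            return f"array({len(data)} items), first keys: {first_keys}"
        return f"array({len(data)} items)"
    if isinstance(data, dict):
        return f"object({len(data)} keys): {list(data.keys())[:8]}"
    return str(data)[:100]


print(compact_blob_preview([{"id": 7, "sku": "WS12", "name": "Radiant Tee", "price": 22.0, "qty": 1}]))
print(compact_blob_preview({"cart_id": "abc123", "items": []}))
print(compact_blob_preview(True))
```

Note the `bool` check comes before the numeric one: `isinstance(True, int)` is true in Python, so the order of the branches matters.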
server/tools/search_episode_data.py
CHANGED
|
@@ -1,87 +1,320 @@
 """
-search_episode_data
 """

 from __future__ import annotations

 import json
 import re
 from typing import Any

-    if not corpus:
-        return [{"note": "No episode data yet. Make API calls with curl_exec() first."}]
     try:
-        tokenized_query = _tokenize(query)
-        bm25 = BM25Okapi(tokenized_corpus)
-        scores = bm25.get_scores(tokenized_query)
         import numpy as np
     except ImportError:
         pass
-    except Exception as e:
-        print(f"[search_episode_data] BM25 error: {e}", flush=True)
-        if any(term in doc.lower() for term in query_terms):
-            meta = metadata[idx]
-            results.append({
-                "step": idx + 1,
-                "url": meta.get("url", ""),
-                "method": meta.get("method", ""),
-                "status_code": meta.get("status_code", 0),
-                "data": meta.get("response_body"),
-            })
-    return results[:5] if results else [{"note": f"No results found for: {query}"}]


 def _tokenize(text: str) -> list[str]:
-    """Simple whitespace + punctuation tokenizer for BM25."""
     text = text.lower()
     tokens = re.findall(r"[a-z0-9_\-\.]+", text)
     return tokens if tokens else [""]
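The `_tokenize` helper, shared by the old and new implementations, is a plain lowercase regex tokenizer that keeps `_`, `-`, and `.` inside tokens. A quick standalone check of its behavior (re-declared here without the leading underscore):

```python
import re


def tokenize(text: str) -> list[str]:
    # Same regex as _tokenize: lowercase runs of letters, digits, '_', '-', '.'
    text = text.lower()
    tokens = re.findall(r"[a-z0-9_\-\.]+", text)
    return tokens if tokens else [""]


print(tokenize("GET /rest/V1/categories?id=20"))
print(tokenize("SKU_WS12-blue v2.0"))
print(tokenize("!!!"))
```

The `[""]` fallback guarantees a non-empty token list, which keeps BM25Okapi from choking on documents that contain no word characters.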
"""
search_episode_data — semantic + BM25 search over accumulated episode API responses.

Each curl_exec call stores its full, untruncated response body in episode_store under
``episode_raw_bodies``. This tool embeds those bodies (via the same HF API used by
browser_agent) and performs cosine-similarity search against the model's query, falling
back to BM25 keyword search when embeddings are unavailable.

Results are returned as compact previews so they fit in the LLM context window:
- Nested trees (e.g. category trees with children_data) are flattened to id+name pairs.
- Large item arrays are shown as a short sample with a total-count note.
- The model can issue more specific queries to drill into any result.
"""

from __future__ import annotations

import json
import os
import re
from typing import Any


# ---------------------------------------------------------------------------
# Compact preview helpers
# ---------------------------------------------------------------------------

def _flatten_tree(obj: Any, id_key: str = "id", name_key: str = "name") -> list[dict]:
    """Recursively flatten any nested tree structure into [{id, name}] pairs."""
    results: list[dict] = []
    if isinstance(obj, dict):
        if id_key in obj and name_key in obj:
            results.append({id_key: obj[id_key], name_key: obj[name_key]})
        for v in obj.values():
            results.extend(_flatten_tree(v, id_key, name_key))
    elif isinstance(obj, list):
        for item in obj:
            results.extend(_flatten_tree(item, id_key, name_key))
    return results


def _compact_preview(response_body: Any, max_items: int = 3) -> dict:
    """
    Return a compact, context-friendly preview of a response body.

    - Distilled HTML (has page_type key) → structured summary with forms/products.
    - Nested trees with children_data → flat {id, name} list.
    - Lists / items arrays → short sample + total count.
    - Scalars / errors → returned as-is.
    - The preview always includes a note showing how many objects exist in total.
    """
    if not isinstance(response_body, (dict, list)):
        return {"value": response_body}

    # --- distilled HTML page (from html_distiller) ---
    if isinstance(response_body, dict) and "page_type" in response_body and "forms" in response_body:
        result: dict = {}
        if response_body.get("title"):
            result["page_title"] = response_body["title"]
        # Forms — most actionable: show action URL, method, and fields (strip base64 uenc)
        forms = response_body.get("forms", [])
        if forms:
            clean_forms = []
            for form in forms[:8]:
                fields = {k: v for k, v in form.get("fields", {}).items()
                          if k not in ("uenc",) and len(str(v)) < 100}
                clean_forms.append({
                    "action": form.get("action", ""),
                    "method": form.get("method", "GET"),
                    "fields": fields,
                })
            result["forms"] = clean_forms
        # Data blobs — show top-level keys and compact preview of small blobs
        blobs = response_body.get("data_blobs", [])
        if blobs:
            blob_summary = []
            for blob in blobs[:3]:
                data = blob.get("data")
                if isinstance(data, (dict, list)):
                    s = json.dumps(data)
                    blob_summary.append({"source": blob.get("source"), "preview": s[:300]})
                else:
                    blob_summary.append({"source": blob.get("source"), "keys": blob.get("keys", [])})
            result["data_blobs"] = blob_summary
        # Visible text — first 600 chars
        text = response_body.get("text")
        if text:
            result["page_text"] = text[:600]
        return result

    # --- nested tree (e.g. category tree) ---
    if isinstance(response_body, dict) and "children_data" in response_body:
        flat = _flatten_tree(response_body)
        sample = flat[:max_items]
        note = (
            f"Flattened tree — {len(flat)} total entries. "
            f"Showing first {len(sample)}. "
            "Use search_episode_data with a more specific name/id query to find a particular entry."
        )
        return {"entries_sample": sample, "total": len(flat), "note": note}

    # --- top-level list ---
    if isinstance(response_body, list):
        total = len(response_body)
        sample = [_pick_key_fields(i) for i in response_body[:max_items]]
        note = (
            f"{total} item(s) total. Showing first {len(sample)}. "
            "Refine your search_episode_data query to find a specific item."
        ) if total > max_items else f"{total} item(s)."
        return {"items_sample": sample, "total": total, "note": note}

    # --- dict with an "items" array (common paginated response) ---
    if isinstance(response_body, dict) and "items" in response_body:
        items = response_body.get("items", [])
        total = response_body.get("total_count", len(items))
        sample = [_pick_key_fields(i) for i in items[:max_items]]
        note = (
            f"{total} item(s) total. Showing first {len(sample)}. "
            "Refine your search_episode_data query to find a specific item."
        ) if len(items) > max_items else f"{len(items)} item(s)."
        result = dict(response_body)
        result["items"] = sample
        result["_preview_note"] = note
        result["total_count"] = total
        return result

    # --- plain dict — return as-is (usually already small) ---
    return response_body


def _pick_key_fields(item: Any) -> Any:
    """For list items, keep only the most useful fields to reduce context size."""
    if not isinstance(item, dict):
        return item
    KEEP = {"id", "sku", "name", "price", "category_id", "title", "slug",
            "item_id", "quote_id", "qty", "status", "order_id", "email",
            "username", "token", "cartId", "cart_id"}
    kept = {k: v for k, v in item.items() if k in KEEP}
    return kept if kept else item  # fallback: return full item if no key fields match


# ---------------------------------------------------------------------------
# Text representation for embedding / BM25
# ---------------------------------------------------------------------------

def _body_to_search_text(url: str, method: str, status_code: int,
                         response_body: Any) -> str:
    """
    Produce a searchable text string that represents a stored API response.
    We embed this text so the model can find responses by semantic query.
    The full body is stored separately (in episode_raw_bodies) for retrieval.
    """
    try:
        body_str = json.dumps(response_body) if not isinstance(response_body, str) else response_body
    except Exception:
        body_str = str(response_body)

    # Truncate for embedding (model has 512-token limit; 2000 chars is ~400 tokens)
    if len(body_str) > 2000:
        body_str = body_str[:2000]

    return f"url: {url} method: {method} status: {status_code} response: {body_str}"


# ---------------------------------------------------------------------------
# Semantic embedding search
# ---------------------------------------------------------------------------

def _get_episode_embeddings(episode_store: dict) -> tuple[Any, list[str]] | None:
    """
    Build or retrieve embeddings for all stored episode responses.

    Returns (embeddings_array, text_list) or None if embeddings unavailable.
    Embeddings are cached in episode_store["response_embeddings"] after first build.
    New responses added since last build are embedded incrementally.
    """
    try:
        import numpy as np
        from .browser_agent import _embed_with_cache
    except ImportError:
        return None

    texts: list[str] = episode_store.get("bm25_corpus", [])
    if not texts:
        return None

    cached_embs = episode_store.get("response_embeddings")
    cached_count = len(cached_embs) if cached_embs is not None else 0

    if cached_count == len(texts):
        # All texts already embedded
        return cached_embs, texts

    # Embed any new texts added since last call
    new_texts = texts[cached_count:]
    new_embs = _embed_with_cache(new_texts)
    if new_embs is None:
        return None

    if cached_embs is not None and len(cached_embs) > 0:
        combined = np.vstack([cached_embs, new_embs])
    else:
        combined = new_embs

    episode_store["response_embeddings"] = combined
    return combined, texts


def _semantic_search(query: str, episode_store: dict,
                     top_k: int = 5) -> list[int] | None:
    """
    Return top_k indices ranked by cosine similarity to the query.
    Returns None if embeddings are unavailable (fall back to BM25).
    """
    try:
        import numpy as np
        from .browser_agent import _embed_with_cache
    except ImportError:
        return None

    result = _get_episode_embeddings(episode_store)
    if result is None:
        return None

    embs, _ = result
    query_emb = _embed_with_cache([query])
    if query_emb is None:
        return None

    scores = embs @ query_emb[0]  # dot product = cosine sim (both L2-normalised)
    top_k = min(top_k, len(scores))
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]


# ---------------------------------------------------------------------------
# BM25 fallback
# ---------------------------------------------------------------------------

def _bm25_search(query: str, corpus: list[str], top_k: int = 5) -> list[int]:
    """Return top_k indices by BM25 score, or keyword-match fallback."""
    try:
        from rank_bm25 import BM25Okapi
        import numpy as np

        tokenized = [_tokenize(doc) for doc in corpus]
        bm25 = BM25Okapi(tokenized)
        scores = bm25.get_scores(_tokenize(query))
        top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        return [i for i in top[:top_k] if scores[i] > 0]
    except Exception:
        pass

    # Keyword fallback
    q_lower = query.lower()
    terms = q_lower.split()
    hits = [i for i, doc in enumerate(corpus) if any(t in doc.lower() for t in terms)]
    return hits[:top_k]


def _tokenize(text: str) -> list[str]:
    text = text.lower()
    tokens = re.findall(r"[a-z0-9_\-\.]+", text)
    return tokens if tokens else [""]


# ---------------------------------------------------------------------------
# Public API
# ---------------------------------------------------------------------------

def search_episode_data(query: str, episode_store: dict) -> list[dict]:
    """
    Semantic + BM25 search over all API responses collected during this episode.

    Each response is stored in full (untruncated) in the episode store.
    Results are returned as compact previews so they fit the LLM context window:
    - Nested trees are flattened to {id, name} pairs with a total-count note.
    - Large arrays show a short sample with a note like "47 items total".
    - Use more specific queries to drill into a particular response.

    Args:
        query: Natural language or keyword query (e.g. "category id for Pants",
            "cart id", "SKU for Radiant Tee", "_csrf_token").
        episode_store: Per-episode mutable store populated by curl_exec.

    Returns:
        List of up to 5 matching results, each with:
        step, url, method, status_code, data (compact preview).
    """
    corpus: list[str] = episode_store.get("bm25_corpus", [])
    metadata: list[dict] = episode_store.get("bm25_metadata", [])

    if not corpus:
        return [{"note": "No episode data yet. Make API calls with curl_exec() first."}]

    # Try semantic search first
    indices = _semantic_search(query, episode_store, top_k=5)

    # Fall back to BM25 if semantic unavailable
    if indices is None:
        indices = _bm25_search(query, corpus, top_k=5)

    if not indices:
        return [{"note": f"No results found for: {query!r}. "
                         "Try a different query or check your curl_exec call history."}]

    results = []
    for idx in indices:
        if idx >= len(metadata):
            continue
        meta = metadata[idx]
        # Full untruncated body is in episode_raw_bodies; metadata holds it too
        raw_body = episode_store.get("episode_raw_bodies", {}).get(idx, meta.get("response_body"))
        results.append({
            "step": idx + 1,
            "url": meta.get("url", ""),
            "method": meta.get("method", ""),
            "status_code": meta.get("status_code", 0),
            "data": _compact_preview(raw_body),
        })

    return results
uv.lock
CHANGED

@@ -171,6 +171,19 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/71/cc/18245721fa7747065ab478316c7fea7c74777d07f37ae60db2e84f8172e8/beartype-0.22.9-py3-none-any.whl", hash = "sha256:d16c9bbc61ea14637596c5f6fbff2ee99cbe3573e46a716401734ef50c3060c2", size = 1333658, upload-time = "2025-12-13T06:50:28.266Z" },
 ]

+[[package]]
+name = "beautifulsoup4"
+version = "4.14.3"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "soupsieve" },
+    { name = "typing-extensions" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/c3/b0/1c6a16426d389813b48d95e26898aff79abbde42ad353958ad95cc8c9b21/beautifulsoup4-4.14.3.tar.gz", hash = "sha256:6292b1c5186d356bba669ef9f7f051757099565ad9ada5dd630bd9de5fa7fb86", size = 627737, upload-time = "2025-11-30T15:08:26.084Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/1a/39/47f9197bdd44df24d67ac8893641e16f386c984a0619ef2ee4c51fbbc019/beautifulsoup4-4.14.3-py3-none-any.whl", hash = "sha256:0918bfe44902e6ad8d57732ba310582e98da931428d231a5ecb9e7c703a735bb", size = 107721, upload-time = "2025-11-30T15:08:24.087Z" },
+]
+
 [[package]]
 name = "brotli"
 version = "1.2.0"

@@ -1328,6 +1341,130 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/81/db/e655086b7f3a705df045bf0933bdd9c2f79bb3c97bfef1384598bb79a217/keyring-25.7.0-py3-none-any.whl", hash = "sha256:be4a0b195f149690c166e850609a477c532ddbfbaed96a404d4e43f8d5e2689f", size = 39160, upload-time = "2025-11-16T16:26:08.402Z" },
 ]

+[[package]]
+name = "lxml"
+version = "6.0.2"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/aa/88/262177de60548e5a2bfc46ad28232c9e9cbde697bd94132aeb80364675cb/lxml-6.0.2.tar.gz", hash = "sha256:cd79f3367bd74b317dda655dc8fcfa304d9eb6e4fb06b7168c5cf27f96e0cd62", size = 4073426, upload-time = "2025-09-22T04:04:59.287Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/db/8a/f8192a08237ef2fb1b19733f709db88a4c43bc8ab8357f01cb41a27e7f6a/lxml-6.0.2-cp310-cp310-macosx_10_9_universal2.whl", hash = "sha256:e77dd455b9a16bbd2a5036a63ddbd479c19572af81b624e79ef422f929eef388", size = 8590589, upload-time = "2025-09-22T04:00:10.51Z" },
+    { url = "https://files.pythonhosted.org/packages/12/64/27bcd07ae17ff5e5536e8d88f4c7d581b48963817a13de11f3ac3329bfa2/lxml-6.0.2-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:5d444858b9f07cefff6455b983aea9a67f7462ba1f6cbe4a21e8bf6791bf2153", size = 4629671, upload-time = "2025-09-22T04:00:15.411Z" },
+    { url = "https://files.pythonhosted.org/packages/02/5a/a7d53b3291c324e0b6e48f3c797be63836cc52156ddf8f33cd72aac78866/lxml-6.0.2-cp310-cp310-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:f952dacaa552f3bb8834908dddd500ba7d508e6ea6eb8c52eb2d28f48ca06a31", size = 4999961, upload-time = "2025-09-22T04:00:17.619Z" },
+    { url = "https://files.pythonhosted.org/packages/f5/55/d465e9b89df1761674d8672bb3e4ae2c47033b01ec243964b6e334c6743f/lxml-6.0.2-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:71695772df6acea9f3c0e59e44ba8ac50c4f125217e84aab21074a1a55e7e5c9", size = 5157087, upload-time = "2025-09-22T04:00:19.868Z" },
+    { url = "https://files.pythonhosted.org/packages/62/38/3073cd7e3e8dfc3ba3c3a139e33bee3a82de2bfb0925714351ad3d255c13/lxml-6.0.2-cp310-cp310-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:17f68764f35fd78d7c4cc4ef209a184c38b65440378013d24b8aecd327c3e0c8", size = 5067620, upload-time = "2025-09-22T04:00:21.877Z" },
+    { url = "https://files.pythonhosted.org/packages/4a/d3/1e001588c5e2205637b08985597827d3827dbaaece16348c8822bfe61c29/lxml-6.0.2-cp310-cp310-manylinux_2_26_i686.manylinux_2_28_i686.whl", hash = "sha256:058027e261afed589eddcfe530fcc6f3402d7fd7e89bfd0532df82ebc1563dba", size = 5406664, upload-time = "2025-09-22T04:00:23.714Z" },
+    { url = "https://files.pythonhosted.org/packages/20/cf/cab09478699b003857ed6ebfe95e9fb9fa3d3c25f1353b905c9b73cfb624/lxml-6.0.2-cp310-cp310-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:a8ffaeec5dfea5881d4c9d8913a32d10cfe3923495386106e4a24d45300ef79c", size = 5289397, upload-time = "2025-09-22T04:00:25.544Z" },
+    { url = "https://files.pythonhosted.org/packages/a3/84/02a2d0c38ac9a8b9f9e5e1bbd3f24b3f426044ad618b552e9549ee91bd63/lxml-6.0.2-cp310-cp310-manylinux_2_31_armv7l.whl", hash = "sha256:f2e3b1a6bb38de0bc713edd4d612969dd250ca8b724be8d460001a387507021c", size = 4772178, upload-time = "2025-09-22T04:00:27.602Z" },
+    { url = "https://files.pythonhosted.org/packages/56/87/e1ceadcc031ec4aa605fe95476892d0b0ba3b7f8c7dcdf88fdeff59a9c86/lxml-6.0.2-cp310-cp310-manylinux_2_38_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:d6690ec5ec1cce0385cb20896b16be35247ac8c2046e493d03232f1c2414d321", size = 5358148, upload-time = "2025-09-22T04:00:29.323Z" },
+    { url = "https://files.pythonhosted.org/packages/fe/13/5bb6cf42bb228353fd4ac5f162c6a84fd68a4d6f67c1031c8cf97e131fc6/lxml-6.0.2-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:f2a50c3c1d11cad0ebebbac357a97b26aa79d2bcaf46f256551152aa85d3a4d1", size = 5112035, upload-time = "2025-09-22T04:00:31.061Z" },
+    { url = "https://files.pythonhosted.org/packages/e4/e2/ea0498552102e59834e297c5c6dff8d8ded3db72ed5e8aad77871476f073/lxml-6.0.2-cp310-cp310-musllinux_1_2_armv7l.whl", hash = "sha256:3efe1b21c7801ffa29a1112fab3b0f643628c30472d507f39544fd48e9549e34", size = 4799111, upload-time = "2025-09-22T04:00:33.11Z" },
+    { url = "https://files.pythonhosted.org/packages/6a/9e/8de42b52a73abb8af86c66c969b3b4c2a96567b6ac74637c037d2e3baa60/lxml-6.0.2-cp310-cp310-musllinux_1_2_riscv64.whl", hash = "sha256:59c45e125140b2c4b33920d21d83681940ca29f0b83f8629ea1a2196dc8cfe6a", size = 5351662, upload-time = "2025-09-22T04:00:35.237Z" },
+    { url = "https://files.pythonhosted.org/packages/28/a2/de776a573dfb15114509a37351937c367530865edb10a90189d0b4b9b70a/lxml-6.0.2-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:452b899faa64f1805943ec1c0c9ebeaece01a1af83e130b69cdefeda180bb42c", size = 5314973, upload-time = "2025-09-22T04:00:37.086Z" },
+    { url = "https://files.pythonhosted.org/packages/50/a0/3ae1b1f8964c271b5eec91db2043cf8c6c0bce101ebb2a633b51b044db6c/lxml-6.0.2-cp310-cp310-win32.whl", hash = "sha256:1e786a464c191ca43b133906c6903a7e4d56bef376b75d97ccbb8ec5cf1f0a4b", size = 3611953, upload-time = "2025-09-22T04:00:39.224Z" },
+    { url = "https://files.pythonhosted.org/packages/d1/70/bd42491f0634aad41bdfc1e46f5cff98825fb6185688dc82baa35d509f1a/lxml-6.0.2-cp310-cp310-win_amd64.whl", hash = "sha256:dacf3c64ef3f7440e3167aa4b49aa9e0fb99e0aa4f9ff03795640bf94531bcb0", size = 4032695, upload-time = "2025-09-22T04:00:41.402Z" },
+    { url = "https://files.pythonhosted.org/packages/d2/d0/05c6a72299f54c2c561a6c6cbb2f512e047fca20ea97a05e57931f194ac4/lxml-6.0.2-cp310-cp310-win_arm64.whl", hash = "sha256:45f93e6f75123f88d7f0cfd90f2d05f441b808562bf0bc01070a00f53f5028b5", size = 3680051, upload-time = "2025-09-22T04:00:43.525Z" },
+    { url = "https://files.pythonhosted.org/packages/77/d5/becbe1e2569b474a23f0c672ead8a29ac50b2dc1d5b9de184831bda8d14c/lxml-6.0.2-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:13e35cbc684aadf05d8711a5d1b5857c92e5e580efa9a0d2be197199c8def607", size = 8634365, upload-time = "2025-09-22T04:00:45.672Z" },
+    { url = "https://files.pythonhosted.org/packages/28/66/1ced58f12e804644426b85d0bb8a4478ca77bc1761455da310505f1a3526/lxml-6.0.2-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:3b1675e096e17c6fe9c0e8c81434f5736c0739ff9ac6123c87c2d452f48fc938", size = 4650793, upload-time = "2025-09-22T04:00:47.783Z" },
+    { url = "https://files.pythonhosted.org/packages/11/84/549098ffea39dfd167e3f174b4ce983d0eed61f9d8d25b7bf2a57c3247fc/lxml-6.0.2-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:8ac6e5811ae2870953390452e3476694196f98d447573234592d30488147404d", size = 4944362, upload-time = "2025-09-22T04:00:49.845Z" },
+    { url = "https://files.pythonhosted.org/packages/ac/bd/f207f16abf9749d2037453d56b643a7471d8fde855a231a12d1e095c4f01/lxml-6.0.2-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:5aa0fc67ae19d7a64c3fe725dc9a1bb11f80e01f78289d05c6f62545affec438", size = 5083152, upload-time = "2025-09-22T04:00:51.709Z" },
+    { url = "https://files.pythonhosted.org/packages/15/ae/bd813e87d8941d52ad5b65071b1affb48da01c4ed3c9c99e40abb266fbff/lxml-6.0.2-cp311-cp311-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:de496365750cc472b4e7902a485d3f152ecf57bd3ba03ddd5578ed8ceb4c5964", size = 5023539, upload-time = "2025-09-22T04:00:53.593Z" },
+    { url = "https://files.pythonhosted.org/packages/02/cd/9bfef16bd1d874fbe0cb51afb00329540f30a3283beb9f0780adbb7eec03/lxml-6.0.2-cp311-cp311-manylinux_2_26_i686.manylinux_2_28_i686.whl", hash = "sha256:200069a593c5e40b8f6fc0d84d86d970ba43138c3e68619ffa234bc9bb806a4d", size = 5344853, upload-time = "2025-09-22T04:00:55.524Z" },
+    { url = "https://files.pythonhosted.org/packages/b8/89/ea8f91594bc5dbb879734d35a6f2b0ad50605d7fb419de2b63d4211765cc/lxml-6.0.2-cp311-cp311-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:7d2de809c2ee3b888b59f995625385f74629707c9355e0ff856445cdcae682b7", size = 5225133, upload-time = "2025-09-22T04:00:57.269Z" },
+    { url = "https://files.pythonhosted.org/packages/b9/37/9c735274f5dbec726b2db99b98a43950395ba3d4a1043083dba2ad814170/lxml-6.0.2-cp311-cp311-manylinux_2_31_armv7l.whl", hash = "sha256:b2c3da8d93cf5db60e8858c17684c47d01fee6405e554fb55018dd85fc23b178", size = 4677944, upload-time = "2025-09-22T04:00:59.052Z" },
+    { url = "https://files.pythonhosted.org/packages/20/28/7dfe1ba3475d8bfca3878365075abe002e05d40dfaaeb7ec01b4c587d533/lxml-6.0.2-cp311-cp311-manylinux_2_38_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:442de7530296ef5e188373a1ea5789a46ce90c4847e597856570439621d9c553", size = 5284535, upload-time = "2025-09-22T04:01:01.335Z" },
+    { url = "https://files.pythonhosted.org/packages/e7/cf/5f14bc0de763498fc29510e3532bf2b4b3a1c1d5d0dff2e900c16ba021ef/lxml-6.0.2-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:2593c77efde7bfea7f6389f1ab249b15ed4aa5bc5cb5131faa3b843c429fbedb", size = 5067343, upload-time = "2025-09-22T04:01:03.13Z" },
+    { url = "https://files.pythonhosted.org/packages/1c/b0/bb8275ab5472f32b28cfbbcc6db7c9d092482d3439ca279d8d6fa02f7025/lxml-6.0.2-cp311-cp311-musllinux_1_2_armv7l.whl", hash = "sha256:3e3cb08855967a20f553ff32d147e14329b3ae70ced6edc2f282b94afbc74b2a", size = 4725419, upload-time = "2025-09-22T04:01:05.013Z" },
+    { url = "https://files.pythonhosted.org/packages/25/4c/7c222753bc72edca3b99dbadba1b064209bc8ed4ad448af990e60dcce462/lxml-6.0.2-cp311-cp311-musllinux_1_2_riscv64.whl", hash = "sha256:2ed6c667fcbb8c19c6791bbf40b7268ef8ddf5a96940ba9404b9f9a304832f6c", size = 5275008, upload-time = "2025-09-22T04:01:07.327Z" },
+    { url = "https://files.pythonhosted.org/packages/6c/8c/478a0dc6b6ed661451379447cdbec77c05741a75736d97e5b2b729687828/lxml-6.0.2-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:b8f18914faec94132e5b91e69d76a5c1d7b0c73e2489ea8929c4aaa10b76bbf7", size = 5248906, upload-time = "2025-09-22T04:01:09.452Z" },
+    { url = "https://files.pythonhosted.org/packages/2d/d9/5be3a6ab2784cdf9accb0703b65e1b64fcdd9311c9f007630c7db0cfcce1/lxml-6.0.2-cp311-cp311-win32.whl", hash = "sha256:6605c604e6daa9e0d7f0a2137bdc47a2e93b59c60a65466353e37f8272f47c46", size = 3610357, upload-time = "2025-09-22T04:01:11.102Z" },
+    { url = "https://files.pythonhosted.org/packages/e2/7d/ca6fb13349b473d5732fb0ee3eec8f6c80fc0688e76b7d79c1008481bf1f/lxml-6.0.2-cp311-cp311-win_amd64.whl", hash = "sha256:e5867f2651016a3afd8dd2c8238baa66f1e2802f44bc17e236f547ace6647078", size = 4036583, upload-time = "2025-09-22T04:01:12.766Z" },
+    { url = "https://files.pythonhosted.org/packages/ab/a2/51363b5ecd3eab46563645f3a2c3836a2fc67d01a1b87c5017040f39f567/lxml-6.0.2-cp311-cp311-win_arm64.whl", hash = "sha256:4197fb2534ee05fd3e7afaab5d8bfd6c2e186f65ea7f9cd6a82809c887bd1285", size = 3680591, upload-time = "2025-09-22T04:01:14.874Z" },
+    { url = "https://files.pythonhosted.org/packages/f3/c8/8ff2bc6b920c84355146cd1ab7d181bc543b89241cfb1ebee824a7c81457/lxml-6.0.2-cp312-cp312-macosx_10_13_universal2.whl", hash = "sha256:a59f5448ba2ceccd06995c95ea59a7674a10de0810f2ce90c9006f3cbc044456", size = 8661887, upload-time = "2025-09-22T04:01:17.265Z" },

@@ -1884,7 +2021,9 @@ name = "openenv-harvestgym"
 version = "0.1.0"
 source = { editable = "." }
 dependencies = [
     { name = "fastapi" },
     { name = "numpy", version = "2.2.6", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.11'" },
     { name = "numpy", version = "2.4.4", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.11'" },
     { name = "openai" },

@@ -1907,7 +2046,9 @@ embeddings = [

 [package.metadata]
 requires-dist = [
     { name = "fastapi", specifier = ">=0.100.0" },
     { name = "numpy", specifier = ">=1.24.0" },
     { name = "openai", specifier = ">=1.0.0" },
     { name = "openenv-core", extras = ["core"], specifier = ">=0.2.2" },

@@ -3373,6 +3514,15 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/e9/44/75a9c9421471a6c4805dbf2356f7c181a29c1879239abab1ea2cc8f38b40/sniffio-1.3.1-py3-none-any.whl", hash = "sha256:2f6da418d1f1e0fddd844478f41680e794e6051915791a034ff65e5f100525a2", size = 10235, upload-time = "2024-02-25T23:20:01.196Z" },
 ]

 [[package]]
 name = "sse-starlette"
 version = "3.3.4"
|
| 1368 |
+
{ url = "https://files.pythonhosted.org/packages/11/84/549098ffea39dfd167e3f174b4ce983d0eed61f9d8d25b7bf2a57c3247fc/lxml-6.0.2-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:8ac6e5811ae2870953390452e3476694196f98d447573234592d30488147404d", size = 4944362, upload-time = "2025-09-22T04:00:49.845Z" },
|
| 1369 |
+
{ url = "https://files.pythonhosted.org/packages/ac/bd/f207f16abf9749d2037453d56b643a7471d8fde855a231a12d1e095c4f01/lxml-6.0.2-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:5aa0fc67ae19d7a64c3fe725dc9a1bb11f80e01f78289d05c6f62545affec438", size = 5083152, upload-time = "2025-09-22T04:00:51.709Z" },
|
| 1370 |
+
{ url = "https://files.pythonhosted.org/packages/15/ae/bd813e87d8941d52ad5b65071b1affb48da01c4ed3c9c99e40abb266fbff/lxml-6.0.2-cp311-cp311-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:de496365750cc472b4e7902a485d3f152ecf57bd3ba03ddd5578ed8ceb4c5964", size = 5023539, upload-time = "2025-09-22T04:00:53.593Z" },
|
| 1371 |
+
{ url = "https://files.pythonhosted.org/packages/02/cd/9bfef16bd1d874fbe0cb51afb00329540f30a3283beb9f0780adbb7eec03/lxml-6.0.2-cp311-cp311-manylinux_2_26_i686.manylinux_2_28_i686.whl", hash = "sha256:200069a593c5e40b8f6fc0d84d86d970ba43138c3e68619ffa234bc9bb806a4d", size = 5344853, upload-time = "2025-09-22T04:00:55.524Z" },
|
| 1372 |
+
{ url = "https://files.pythonhosted.org/packages/b8/89/ea8f91594bc5dbb879734d35a6f2b0ad50605d7fb419de2b63d4211765cc/lxml-6.0.2-cp311-cp311-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:7d2de809c2ee3b888b59f995625385f74629707c9355e0ff856445cdcae682b7", size = 5225133, upload-time = "2025-09-22T04:00:57.269Z" },
|
| 1373 |
+
{ url = "https://files.pythonhosted.org/packages/b9/37/9c735274f5dbec726b2db99b98a43950395ba3d4a1043083dba2ad814170/lxml-6.0.2-cp311-cp311-manylinux_2_31_armv7l.whl", hash = "sha256:b2c3da8d93cf5db60e8858c17684c47d01fee6405e554fb55018dd85fc23b178", size = 4677944, upload-time = "2025-09-22T04:00:59.052Z" },
|
| 1374 |
+
{ url = "https://files.pythonhosted.org/packages/20/28/7dfe1ba3475d8bfca3878365075abe002e05d40dfaaeb7ec01b4c587d533/lxml-6.0.2-cp311-cp311-manylinux_2_38_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:442de7530296ef5e188373a1ea5789a46ce90c4847e597856570439621d9c553", size = 5284535, upload-time = "2025-09-22T04:01:01.335Z" },
|
| 1375 |
+
{ url = "https://files.pythonhosted.org/packages/e7/cf/5f14bc0de763498fc29510e3532bf2b4b3a1c1d5d0dff2e900c16ba021ef/lxml-6.0.2-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:2593c77efde7bfea7f6389f1ab249b15ed4aa5bc5cb5131faa3b843c429fbedb", size = 5067343, upload-time = "2025-09-22T04:01:03.13Z" },
{ url = "https://files.pythonhosted.org/packages/1c/b0/bb8275ab5472f32b28cfbbcc6db7c9d092482d3439ca279d8d6fa02f7025/lxml-6.0.2-cp311-cp311-musllinux_1_2_armv7l.whl", hash = "sha256:3e3cb08855967a20f553ff32d147e14329b3ae70ced6edc2f282b94afbc74b2a", size = 4725419, upload-time = "2025-09-22T04:01:05.013Z" },
{ url = "https://files.pythonhosted.org/packages/25/4c/7c222753bc72edca3b99dbadba1b064209bc8ed4ad448af990e60dcce462/lxml-6.0.2-cp311-cp311-musllinux_1_2_riscv64.whl", hash = "sha256:2ed6c667fcbb8c19c6791bbf40b7268ef8ddf5a96940ba9404b9f9a304832f6c", size = 5275008, upload-time = "2025-09-22T04:01:07.327Z" },
{ url = "https://files.pythonhosted.org/packages/6c/8c/478a0dc6b6ed661451379447cdbec77c05741a75736d97e5b2b729687828/lxml-6.0.2-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:b8f18914faec94132e5b91e69d76a5c1d7b0c73e2489ea8929c4aaa10b76bbf7", size = 5248906, upload-time = "2025-09-22T04:01:09.452Z" },
{ url = "https://files.pythonhosted.org/packages/2d/d9/5be3a6ab2784cdf9accb0703b65e1b64fcdd9311c9f007630c7db0cfcce1/lxml-6.0.2-cp311-cp311-win32.whl", hash = "sha256:6605c604e6daa9e0d7f0a2137bdc47a2e93b59c60a65466353e37f8272f47c46", size = 3610357, upload-time = "2025-09-22T04:01:11.102Z" },
{ url = "https://files.pythonhosted.org/packages/e2/7d/ca6fb13349b473d5732fb0ee3eec8f6c80fc0688e76b7d79c1008481bf1f/lxml-6.0.2-cp311-cp311-win_amd64.whl", hash = "sha256:e5867f2651016a3afd8dd2c8238baa66f1e2802f44bc17e236f547ace6647078", size = 4036583, upload-time = "2025-09-22T04:01:12.766Z" },
{ url = "https://files.pythonhosted.org/packages/ab/a2/51363b5ecd3eab46563645f3a2c3836a2fc67d01a1b87c5017040f39f567/lxml-6.0.2-cp311-cp311-win_arm64.whl", hash = "sha256:4197fb2534ee05fd3e7afaab5d8bfd6c2e186f65ea7f9cd6a82809c887bd1285", size = 3680591, upload-time = "2025-09-22T04:01:14.874Z" },
{ url = "https://files.pythonhosted.org/packages/f3/c8/8ff2bc6b920c84355146cd1ab7d181bc543b89241cfb1ebee824a7c81457/lxml-6.0.2-cp312-cp312-macosx_10_13_universal2.whl", hash = "sha256:a59f5448ba2ceccd06995c95ea59a7674a10de0810f2ce90c9006f3cbc044456", size = 8661887, upload-time = "2025-09-22T04:01:17.265Z" },
{ url = "https://files.pythonhosted.org/packages/37/6f/9aae1008083bb501ef63284220ce81638332f9ccbfa53765b2b7502203cf/lxml-6.0.2-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:e8113639f3296706fbac34a30813929e29247718e88173ad849f57ca59754924", size = 4667818, upload-time = "2025-09-22T04:01:19.688Z" },
{ url = "https://files.pythonhosted.org/packages/f1/ca/31fb37f99f37f1536c133476674c10b577e409c0a624384147653e38baf2/lxml-6.0.2-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:a8bef9b9825fa8bc816a6e641bb67219489229ebc648be422af695f6e7a4fa7f", size = 4950807, upload-time = "2025-09-22T04:01:21.487Z" },
{ url = "https://files.pythonhosted.org/packages/da/87/f6cb9442e4bada8aab5ae7e1046264f62fdbeaa6e3f6211b93f4c0dd97f1/lxml-6.0.2-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:65ea18d710fd14e0186c2f973dc60bb52039a275f82d3c44a0e42b43440ea534", size = 5109179, upload-time = "2025-09-22T04:01:23.32Z" },
{ url = "https://files.pythonhosted.org/packages/c8/20/a7760713e65888db79bbae4f6146a6ae5c04e4a204a3c48896c408cd6ed2/lxml-6.0.2-cp312-cp312-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:c371aa98126a0d4c739ca93ceffa0fd7a5d732e3ac66a46e74339acd4d334564", size = 5023044, upload-time = "2025-09-22T04:01:25.118Z" },
{ url = "https://files.pythonhosted.org/packages/a2/b0/7e64e0460fcb36471899f75831509098f3fd7cd02a3833ac517433cb4f8f/lxml-6.0.2-cp312-cp312-manylinux_2_26_i686.manylinux_2_28_i686.whl", hash = "sha256:700efd30c0fa1a3581d80a748157397559396090a51d306ea59a70020223d16f", size = 5359685, upload-time = "2025-09-22T04:01:27.398Z" },
{ url = "https://files.pythonhosted.org/packages/b9/e1/e5df362e9ca4e2f48ed6411bd4b3a0ae737cc842e96877f5bf9428055ab4/lxml-6.0.2-cp312-cp312-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:c33e66d44fe60e72397b487ee92e01da0d09ba2d66df8eae42d77b6d06e5eba0", size = 5654127, upload-time = "2025-09-22T04:01:29.629Z" },
{ url = "https://files.pythonhosted.org/packages/c6/d1/232b3309a02d60f11e71857778bfcd4acbdb86c07db8260caf7d008b08f8/lxml-6.0.2-cp312-cp312-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:90a345bbeaf9d0587a3aaffb7006aa39ccb6ff0e96a57286c0cb2fd1520ea192", size = 5253958, upload-time = "2025-09-22T04:01:31.535Z" },
{ url = "https://files.pythonhosted.org/packages/35/35/d955a070994725c4f7d80583a96cab9c107c57a125b20bb5f708fe941011/lxml-6.0.2-cp312-cp312-manylinux_2_31_armv7l.whl", hash = "sha256:064fdadaf7a21af3ed1dcaa106b854077fbeada827c18f72aec9346847cd65d0", size = 4711541, upload-time = "2025-09-22T04:01:33.801Z" },
{ url = "https://files.pythonhosted.org/packages/1e/be/667d17363b38a78c4bd63cfd4b4632029fd68d2c2dc81f25ce9eb5224dd5/lxml-6.0.2-cp312-cp312-manylinux_2_38_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:fbc74f42c3525ac4ffa4b89cbdd00057b6196bcefe8bce794abd42d33a018092", size = 5267426, upload-time = "2025-09-22T04:01:35.639Z" },
{ url = "https://files.pythonhosted.org/packages/ea/47/62c70aa4a1c26569bc958c9ca86af2bb4e1f614e8c04fb2989833874f7ae/lxml-6.0.2-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:6ddff43f702905a4e32bc24f3f2e2edfe0f8fde3277d481bffb709a4cced7a1f", size = 5064917, upload-time = "2025-09-22T04:01:37.448Z" },
{ url = "https://files.pythonhosted.org/packages/bd/55/6ceddaca353ebd0f1908ef712c597f8570cc9c58130dbb89903198e441fd/lxml-6.0.2-cp312-cp312-musllinux_1_2_armv7l.whl", hash = "sha256:6da5185951d72e6f5352166e3da7b0dc27aa70bd1090b0eb3f7f7212b53f1bb8", size = 4788795, upload-time = "2025-09-22T04:01:39.165Z" },
{ url = "https://files.pythonhosted.org/packages/cf/e8/fd63e15da5e3fd4c2146f8bbb3c14e94ab850589beab88e547b2dbce22e1/lxml-6.0.2-cp312-cp312-musllinux_1_2_ppc64le.whl", hash = "sha256:57a86e1ebb4020a38d295c04fc79603c7899e0df71588043eb218722dabc087f", size = 5676759, upload-time = "2025-09-22T04:01:41.506Z" },
{ url = "https://files.pythonhosted.org/packages/76/47/b3ec58dc5c374697f5ba37412cd2728f427d056315d124dd4b61da381877/lxml-6.0.2-cp312-cp312-musllinux_1_2_riscv64.whl", hash = "sha256:2047d8234fe735ab77802ce5f2297e410ff40f5238aec569ad7c8e163d7b19a6", size = 5255666, upload-time = "2025-09-22T04:01:43.363Z" },
{ url = "https://files.pythonhosted.org/packages/19/93/03ba725df4c3d72afd9596eef4a37a837ce8e4806010569bedfcd2cb68fd/lxml-6.0.2-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:6f91fd2b2ea15a6800c8e24418c0775a1694eefc011392da73bc6cef2623b322", size = 5277989, upload-time = "2025-09-22T04:01:45.215Z" },
{ url = "https://files.pythonhosted.org/packages/c6/80/c06de80bfce881d0ad738576f243911fccf992687ae09fd80b734712b39c/lxml-6.0.2-cp312-cp312-win32.whl", hash = "sha256:3ae2ce7d6fedfb3414a2b6c5e20b249c4c607f72cb8d2bb7cc9c6ec7c6f4e849", size = 3611456, upload-time = "2025-09-22T04:01:48.243Z" },
{ url = "https://files.pythonhosted.org/packages/f7/d7/0cdfb6c3e30893463fb3d1e52bc5f5f99684a03c29a0b6b605cfae879cd5/lxml-6.0.2-cp312-cp312-win_amd64.whl", hash = "sha256:72c87e5ee4e58a8354fb9c7c84cbf95a1c8236c127a5d1b7683f04bed8361e1f", size = 4011793, upload-time = "2025-09-22T04:01:50.042Z" },
{ url = "https://files.pythonhosted.org/packages/ea/7b/93c73c67db235931527301ed3785f849c78991e2e34f3fd9a6663ffda4c5/lxml-6.0.2-cp312-cp312-win_arm64.whl", hash = "sha256:61cb10eeb95570153e0c0e554f58df92ecf5109f75eacad4a95baa709e26c3d6", size = 3672836, upload-time = "2025-09-22T04:01:52.145Z" },
{ url = "https://files.pythonhosted.org/packages/53/fd/4e8f0540608977aea078bf6d79f128e0e2c2bba8af1acf775c30baa70460/lxml-6.0.2-cp313-cp313-macosx_10_13_universal2.whl", hash = "sha256:9b33d21594afab46f37ae58dfadd06636f154923c4e8a4d754b0127554eb2e77", size = 8648494, upload-time = "2025-09-22T04:01:54.242Z" },
{ url = "https://files.pythonhosted.org/packages/5d/f4/2a94a3d3dfd6c6b433501b8d470a1960a20ecce93245cf2db1706adf6c19/lxml-6.0.2-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:6c8963287d7a4c5c9a432ff487c52e9c5618667179c18a204bdedb27310f022f", size = 4661146, upload-time = "2025-09-22T04:01:56.282Z" },
{ url = "https://files.pythonhosted.org/packages/25/2e/4efa677fa6b322013035d38016f6ae859d06cac67437ca7dc708a6af7028/lxml-6.0.2-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:1941354d92699fb5ffe6ed7b32f9649e43c2feb4b97205f75866f7d21aa91452", size = 4946932, upload-time = "2025-09-22T04:01:58.989Z" },
{ url = "https://files.pythonhosted.org/packages/ce/0f/526e78a6d38d109fdbaa5049c62e1d32fdd70c75fb61c4eadf3045d3d124/lxml-6.0.2-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:bb2f6ca0ae2d983ded09357b84af659c954722bbf04dea98030064996d156048", size = 5100060, upload-time = "2025-09-22T04:02:00.812Z" },
{ url = "https://files.pythonhosted.org/packages/81/76/99de58d81fa702cc0ea7edae4f4640416c2062813a00ff24bd70ac1d9c9b/lxml-6.0.2-cp313-cp313-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:eb2a12d704f180a902d7fa778c6d71f36ceb7b0d317f34cdc76a5d05aa1dd1df", size = 5019000, upload-time = "2025-09-22T04:02:02.671Z" },
{ url = "https://files.pythonhosted.org/packages/b5/35/9e57d25482bc9a9882cb0037fdb9cc18f4b79d85df94fa9d2a89562f1d25/lxml-6.0.2-cp313-cp313-manylinux_2_26_i686.manylinux_2_28_i686.whl", hash = "sha256:6ec0e3f745021bfed19c456647f0298d60a24c9ff86d9d051f52b509663feeb1", size = 5348496, upload-time = "2025-09-22T04:02:04.904Z" },
{ url = "https://files.pythonhosted.org/packages/a6/8e/cb99bd0b83ccc3e8f0f528e9aa1f7a9965dfec08c617070c5db8d63a87ce/lxml-6.0.2-cp313-cp313-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:846ae9a12d54e368933b9759052d6206a9e8b250291109c48e350c1f1f49d916", size = 5643779, upload-time = "2025-09-22T04:02:06.689Z" },
{ url = "https://files.pythonhosted.org/packages/d0/34/9e591954939276bb679b73773836c6684c22e56d05980e31d52a9a8deb18/lxml-6.0.2-cp313-cp313-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:ef9266d2aa545d7374938fb5c484531ef5a2ec7f2d573e62f8ce722c735685fd", size = 5244072, upload-time = "2025-09-22T04:02:08.587Z" },
{ url = "https://files.pythonhosted.org/packages/8d/27/b29ff065f9aaca443ee377aff699714fcbffb371b4fce5ac4ca759e436d5/lxml-6.0.2-cp313-cp313-manylinux_2_31_armv7l.whl", hash = "sha256:4077b7c79f31755df33b795dc12119cb557a0106bfdab0d2c2d97bd3cf3dffa6", size = 4718675, upload-time = "2025-09-22T04:02:10.783Z" },
{ url = "https://files.pythonhosted.org/packages/2b/9f/f756f9c2cd27caa1a6ef8c32ae47aadea697f5c2c6d07b0dae133c244fbe/lxml-6.0.2-cp313-cp313-manylinux_2_38_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:a7c5d5e5f1081955358533be077166ee97ed2571d6a66bdba6ec2f609a715d1a", size = 5255171, upload-time = "2025-09-22T04:02:12.631Z" },
{ url = "https://files.pythonhosted.org/packages/61/46/bb85ea42d2cb1bd8395484fd72f38e3389611aa496ac7772da9205bbda0e/lxml-6.0.2-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:8f8d0cbd0674ee89863a523e6994ac25fd5be9c8486acfc3e5ccea679bad2679", size = 5057175, upload-time = "2025-09-22T04:02:14.718Z" },
{ url = "https://files.pythonhosted.org/packages/95/0c/443fc476dcc8e41577f0af70458c50fe299a97bb6b7505bb1ae09aa7f9ac/lxml-6.0.2-cp313-cp313-musllinux_1_2_armv7l.whl", hash = "sha256:2cbcbf6d6e924c28f04a43f3b6f6e272312a090f269eff68a2982e13e5d57659", size = 4785688, upload-time = "2025-09-22T04:02:16.957Z" },
{ url = "https://files.pythonhosted.org/packages/48/78/6ef0b359d45bb9697bc5a626e1992fa5d27aa3f8004b137b2314793b50a0/lxml-6.0.2-cp313-cp313-musllinux_1_2_ppc64le.whl", hash = "sha256:dfb874cfa53340009af6bdd7e54ebc0d21012a60a4e65d927c2e477112e63484", size = 5660655, upload-time = "2025-09-22T04:02:18.815Z" },
{ url = "https://files.pythonhosted.org/packages/ff/ea/e1d33808f386bc1339d08c0dcada6e4712d4ed8e93fcad5f057070b7988a/lxml-6.0.2-cp313-cp313-musllinux_1_2_riscv64.whl", hash = "sha256:fb8dae0b6b8b7f9e96c26fdd8121522ce5de9bb5538010870bd538683d30e9a2", size = 5247695, upload-time = "2025-09-22T04:02:20.593Z" },
{ url = "https://files.pythonhosted.org/packages/4f/47/eba75dfd8183673725255247a603b4ad606f4ae657b60c6c145b381697da/lxml-6.0.2-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:358d9adae670b63e95bc59747c72f4dc97c9ec58881d4627fe0120da0f90d314", size = 5269841, upload-time = "2025-09-22T04:02:22.489Z" },
{ url = "https://files.pythonhosted.org/packages/76/04/5c5e2b8577bc936e219becb2e98cdb1aca14a4921a12995b9d0c523502ae/lxml-6.0.2-cp313-cp313-win32.whl", hash = "sha256:e8cd2415f372e7e5a789d743d133ae474290a90b9023197fd78f32e2dc6873e2", size = 3610700, upload-time = "2025-09-22T04:02:24.465Z" },
{ url = "https://files.pythonhosted.org/packages/fe/0a/4643ccc6bb8b143e9f9640aa54e38255f9d3b45feb2cbe7ae2ca47e8782e/lxml-6.0.2-cp313-cp313-win_amd64.whl", hash = "sha256:b30d46379644fbfc3ab81f8f82ae4de55179414651f110a1514f0b1f8f6cb2d7", size = 4010347, upload-time = "2025-09-22T04:02:26.286Z" },
{ url = "https://files.pythonhosted.org/packages/31/ef/dcf1d29c3f530577f61e5fe2f1bd72929acf779953668a8a47a479ae6f26/lxml-6.0.2-cp313-cp313-win_arm64.whl", hash = "sha256:13dcecc9946dca97b11b7c40d29fba63b55ab4170d3c0cf8c0c164343b9bfdcf", size = 3671248, upload-time = "2025-09-22T04:02:27.918Z" },
{ url = "https://files.pythonhosted.org/packages/03/15/d4a377b385ab693ce97b472fe0c77c2b16ec79590e688b3ccc71fba19884/lxml-6.0.2-cp314-cp314-macosx_10_13_universal2.whl", hash = "sha256:b0c732aa23de8f8aec23f4b580d1e52905ef468afb4abeafd3fec77042abb6fe", size = 8659801, upload-time = "2025-09-22T04:02:30.113Z" },
{ url = "https://files.pythonhosted.org/packages/c8/e8/c128e37589463668794d503afaeb003987373c5f94d667124ffd8078bbd9/lxml-6.0.2-cp314-cp314-macosx_10_13_x86_64.whl", hash = "sha256:4468e3b83e10e0317a89a33d28f7aeba1caa4d1a6fd457d115dd4ffe90c5931d", size = 4659403, upload-time = "2025-09-22T04:02:32.119Z" },
{ url = "https://files.pythonhosted.org/packages/00/ce/74903904339decdf7da7847bb5741fc98a5451b42fc419a86c0c13d26fe2/lxml-6.0.2-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:abd44571493973bad4598a3be7e1d807ed45aa2adaf7ab92ab7c62609569b17d", size = 4966974, upload-time = "2025-09-22T04:02:34.155Z" },
{ url = "https://files.pythonhosted.org/packages/1f/d3/131dec79ce61c5567fecf82515bd9bc36395df42501b50f7f7f3bd065df0/lxml-6.0.2-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:370cd78d5855cfbffd57c422851f7d3864e6ae72d0da615fca4dad8c45d375a5", size = 5102953, upload-time = "2025-09-22T04:02:36.054Z" },
{ url = "https://files.pythonhosted.org/packages/3a/ea/a43ba9bb750d4ffdd885f2cd333572f5bb900cd2408b67fdda07e85978a0/lxml-6.0.2-cp314-cp314-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:901e3b4219fa04ef766885fb40fa516a71662a4c61b80c94d25336b4934b71c0", size = 5055054, upload-time = "2025-09-22T04:02:38.154Z" },
{ url = "https://files.pythonhosted.org/packages/60/23/6885b451636ae286c34628f70a7ed1fcc759f8d9ad382d132e1c8d3d9bfd/lxml-6.0.2-cp314-cp314-manylinux_2_26_i686.manylinux_2_28_i686.whl", hash = "sha256:a4bf42d2e4cf52c28cc1812d62426b9503cdb0c87a6de81442626aa7d69707ba", size = 5352421, upload-time = "2025-09-22T04:02:40.413Z" },
{ url = "https://files.pythonhosted.org/packages/48/5b/fc2ddfc94ddbe3eebb8e9af6e3fd65e2feba4967f6a4e9683875c394c2d8/lxml-6.0.2-cp314-cp314-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:b2c7fdaa4d7c3d886a42534adec7cfac73860b89b4e5298752f60aa5984641a0", size = 5673684, upload-time = "2025-09-22T04:02:42.288Z" },
{ url = "https://files.pythonhosted.org/packages/29/9c/47293c58cc91769130fbf85531280e8cc7868f7fbb6d92f4670071b9cb3e/lxml-6.0.2-cp314-cp314-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:98a5e1660dc7de2200b00d53fa00bcd3c35a3608c305d45a7bbcaf29fa16e83d", size = 5252463, upload-time = "2025-09-22T04:02:44.165Z" },
{ url = "https://files.pythonhosted.org/packages/9b/da/ba6eceb830c762b48e711ded880d7e3e89fc6c7323e587c36540b6b23c6b/lxml-6.0.2-cp314-cp314-manylinux_2_31_armv7l.whl", hash = "sha256:dc051506c30b609238d79eda75ee9cab3e520570ec8219844a72a46020901e37", size = 4698437, upload-time = "2025-09-22T04:02:46.524Z" },
{ url = "https://files.pythonhosted.org/packages/a5/24/7be3f82cb7990b89118d944b619e53c656c97dc89c28cfb143fdb7cd6f4d/lxml-6.0.2-cp314-cp314-manylinux_2_38_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:8799481bbdd212470d17513a54d568f44416db01250f49449647b5ab5b5dccb9", size = 5269890, upload-time = "2025-09-22T04:02:48.812Z" },
{ url = "https://files.pythonhosted.org/packages/1b/bd/dcfb9ea1e16c665efd7538fc5d5c34071276ce9220e234217682e7d2c4a5/lxml-6.0.2-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:9261bb77c2dab42f3ecd9103951aeca2c40277701eb7e912c545c1b16e0e4917", size = 5097185, upload-time = "2025-09-22T04:02:50.746Z" },
{ url = "https://files.pythonhosted.org/packages/21/04/a60b0ff9314736316f28316b694bccbbabe100f8483ad83852d77fc7468e/lxml-6.0.2-cp314-cp314-musllinux_1_2_armv7l.whl", hash = "sha256:65ac4a01aba353cfa6d5725b95d7aed6356ddc0a3cd734de00124d285b04b64f", size = 4745895, upload-time = "2025-09-22T04:02:52.968Z" },
{ url = "https://files.pythonhosted.org/packages/d6/bd/7d54bd1846e5a310d9c715921c5faa71cf5c0853372adf78aee70c8d7aa2/lxml-6.0.2-cp314-cp314-musllinux_1_2_ppc64le.whl", hash = "sha256:b22a07cbb82fea98f8a2fd814f3d1811ff9ed76d0fc6abc84eb21527596e7cc8", size = 5695246, upload-time = "2025-09-22T04:02:54.798Z" },
{ url = "https://files.pythonhosted.org/packages/fd/32/5643d6ab947bc371da21323acb2a6e603cedbe71cb4c99c8254289ab6f4e/lxml-6.0.2-cp314-cp314-musllinux_1_2_riscv64.whl", hash = "sha256:d759cdd7f3e055d6bc8d9bec3ad905227b2e4c785dc16c372eb5b5e83123f48a", size = 5260797, upload-time = "2025-09-22T04:02:57.058Z" },
{ url = "https://files.pythonhosted.org/packages/33/da/34c1ec4cff1eea7d0b4cd44af8411806ed943141804ac9c5d565302afb78/lxml-6.0.2-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:945da35a48d193d27c188037a05fec5492937f66fb1958c24fc761fb9d40d43c", size = 5277404, upload-time = "2025-09-22T04:02:58.966Z" },
{ url = "https://files.pythonhosted.org/packages/82/57/4eca3e31e54dc89e2c3507e1cd411074a17565fa5ffc437c4ae0a00d439e/lxml-6.0.2-cp314-cp314-win32.whl", hash = "sha256:be3aaa60da67e6153eb15715cc2e19091af5dc75faef8b8a585aea372507384b", size = 3670072, upload-time = "2025-09-22T04:03:38.05Z" },
{ url = "https://files.pythonhosted.org/packages/e3/e0/c96cf13eccd20c9421ba910304dae0f619724dcf1702864fd59dd386404d/lxml-6.0.2-cp314-cp314-win_amd64.whl", hash = "sha256:fa25afbadead523f7001caf0c2382afd272c315a033a7b06336da2637d92d6ed", size = 4080617, upload-time = "2025-09-22T04:03:39.835Z" },
{ url = "https://files.pythonhosted.org/packages/d5/5d/b3f03e22b3d38d6f188ef044900a9b29b2fe0aebb94625ce9fe244011d34/lxml-6.0.2-cp314-cp314-win_arm64.whl", hash = "sha256:063eccf89df5b24e361b123e257e437f9e9878f425ee9aae3144c77faf6da6d8", size = 3754930, upload-time = "2025-09-22T04:03:41.565Z" },
{ url = "https://files.pythonhosted.org/packages/5e/5c/42c2c4c03554580708fc738d13414801f340c04c3eff90d8d2d227145275/lxml-6.0.2-cp314-cp314t-macosx_10_13_universal2.whl", hash = "sha256:6162a86d86893d63084faaf4ff937b3daea233e3682fb4474db07395794fa80d", size = 8910380, upload-time = "2025-09-22T04:03:01.645Z" },
{ url = "https://files.pythonhosted.org/packages/bf/4f/12df843e3e10d18d468a7557058f8d3733e8b6e12401f30b1ef29360740f/lxml-6.0.2-cp314-cp314t-macosx_10_13_x86_64.whl", hash = "sha256:414aaa94e974e23a3e92e7ca5b97d10c0cf37b6481f50911032c69eeb3991bba", size = 4775632, upload-time = "2025-09-22T04:03:03.814Z" },
{ url = "https://files.pythonhosted.org/packages/e4/0c/9dc31e6c2d0d418483cbcb469d1f5a582a1cd00a1f4081953d44051f3c50/lxml-6.0.2-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:48461bd21625458dd01e14e2c38dd0aea69addc3c4f960c30d9f59d7f93be601", size = 4975171, upload-time = "2025-09-22T04:03:05.651Z" },
{ url = "https://files.pythonhosted.org/packages/e7/2b/9b870c6ca24c841bdd887504808f0417aa9d8d564114689266f19ddf29c8/lxml-6.0.2-cp314-cp314t-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:25fcc59afc57d527cfc78a58f40ab4c9b8fd096a9a3f964d2781ffb6eb33f4ed", size = 5110109, upload-time = "2025-09-22T04:03:07.452Z" },
{ url = "https://files.pythonhosted.org/packages/bf/0c/4f5f2a4dd319a178912751564471355d9019e220c20d7db3fb8307ed8582/lxml-6.0.2-cp314-cp314t-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:5179c60288204e6ddde3f774a93350177e08876eaf3ab78aa3a3649d43eb7d37", size = 5041061, upload-time = "2025-09-22T04:03:09.297Z" },
{ url = "https://files.pythonhosted.org/packages/12/64/554eed290365267671fe001a20d72d14f468ae4e6acef1e179b039436967/lxml-6.0.2-cp314-cp314t-manylinux_2_26_i686.manylinux_2_28_i686.whl", hash = "sha256:967aab75434de148ec80597b75062d8123cadf2943fb4281f385141e18b21338", size = 5306233, upload-time = "2025-09-22T04:03:11.651Z" },
{ url = "https://files.pythonhosted.org/packages/7a/31/1d748aa275e71802ad9722df32a7a35034246b42c0ecdd8235412c3396ef/lxml-6.0.2-cp314-cp314t-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:d100fcc8930d697c6561156c6810ab4a508fb264c8b6779e6e61e2ed5e7558f9", size = 5604739, upload-time = "2025-09-22T04:03:13.592Z" },
{ url = "https://files.pythonhosted.org/packages/8f/41/2c11916bcac09ed561adccacceaedd2bf0e0b25b297ea92aab99fd03d0fa/lxml-6.0.2-cp314-cp314t-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:2ca59e7e13e5981175b8b3e4ab84d7da57993eeff53c07764dcebda0d0e64ecd", size = 5225119, upload-time = "2025-09-22T04:03:15.408Z" },
{ url = "https://files.pythonhosted.org/packages/99/05/4e5c2873d8f17aa018e6afde417c80cc5d0c33be4854cce3ef5670c49367/lxml-6.0.2-cp314-cp314t-manylinux_2_31_armv7l.whl", hash = "sha256:957448ac63a42e2e49531b9d6c0fa449a1970dbc32467aaad46f11545be9af1d", size = 4633665, upload-time = "2025-09-22T04:03:17.262Z" },
{ url = "https://files.pythonhosted.org/packages/0f/c9/dcc2da1bebd6275cdc723b515f93edf548b82f36a5458cca3578bc899332/lxml-6.0.2-cp314-cp314t-manylinux_2_38_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:b7fc49c37f1786284b12af63152fe1d0990722497e2d5817acfe7a877522f9a9", size = 5234997, upload-time = "2025-09-22T04:03:19.14Z" },
{ url = "https://files.pythonhosted.org/packages/9c/e2/5172e4e7468afca64a37b81dba152fc5d90e30f9c83c7c3213d6a02a5ce4/lxml-6.0.2-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:e19e0643cc936a22e837f79d01a550678da8377d7d801a14487c10c34ee49c7e", size = 5090957, upload-time = "2025-09-22T04:03:21.436Z" },
{ url = "https://files.pythonhosted.org/packages/a5/b3/15461fd3e5cd4ddcb7938b87fc20b14ab113b92312fc97afe65cd7c85de1/lxml-6.0.2-cp314-cp314t-musllinux_1_2_armv7l.whl", hash = "sha256:1db01e5cf14345628e0cbe71067204db658e2fb8e51e7f33631f5f4735fefd8d", size = 4764372, upload-time = "2025-09-22T04:03:23.27Z" },
{ url = "https://files.pythonhosted.org/packages/05/33/f310b987c8bf9e61c4dd8e8035c416bd3230098f5e3cfa69fc4232de7059/lxml-6.0.2-cp314-cp314t-musllinux_1_2_ppc64le.whl", hash = "sha256:875c6b5ab39ad5291588aed6925fac99d0097af0dd62f33c7b43736043d4a2ec", size = 5634653, upload-time = "2025-09-22T04:03:25.767Z" },
{ url = "https://files.pythonhosted.org/packages/70/ff/51c80e75e0bc9382158133bdcf4e339b5886c6ee2418b5199b3f1a61ed6d/lxml-6.0.2-cp314-cp314t-musllinux_1_2_riscv64.whl", hash = "sha256:cdcbed9ad19da81c480dfd6dd161886db6096083c9938ead313d94b30aadf272", size = 5233795, upload-time = "2025-09-22T04:03:27.62Z" },
{ url = "https://files.pythonhosted.org/packages/56/4d/4856e897df0d588789dd844dbed9d91782c4ef0b327f96ce53c807e13128/lxml-6.0.2-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:80dadc234ebc532e09be1975ff538d154a7fa61ea5031c03d25178855544728f", size = 5257023, upload-time = "2025-09-22T04:03:30.056Z" },
{ url = "https://files.pythonhosted.org/packages/0f/85/86766dfebfa87bea0ab78e9ff7a4b4b45225df4b4d3b8cc3c03c5cd68464/lxml-6.0.2-cp314-cp314t-win32.whl", hash = "sha256:da08e7bb297b04e893d91087df19638dc7a6bb858a954b0cc2b9f5053c922312", size = 3911420, upload-time = "2025-09-22T04:03:32.198Z" },
{ url = "https://files.pythonhosted.org/packages/fe/1a/b248b355834c8e32614650b8008c69ffeb0ceb149c793961dd8c0b991bb3/lxml-6.0.2-cp314-cp314t-win_amd64.whl", hash = "sha256:252a22982dca42f6155125ac76d3432e548a7625d56f5a273ee78a5057216eca", size = 4406837, upload-time = "2025-09-22T04:03:34.027Z" },
{ url = "https://files.pythonhosted.org/packages/92/aa/df863bcc39c5e0946263454aba394de8a9084dbaff8ad143846b0d844739/lxml-6.0.2-cp314-cp314t-win_arm64.whl", hash = "sha256:bb4c1847b303835d89d785a18801a883436cdfd5dc3d62947f9c49e24f0f5a2c", size = 3822205, upload-time = "2025-09-22T04:03:36.249Z" },
{ url = "https://files.pythonhosted.org/packages/e7/9c/780c9a8fce3f04690b374f72f41306866b0400b9d0fdf3e17aaa37887eed/lxml-6.0.2-pp310-pypy310_pp73-macosx_10_15_x86_64.whl", hash = "sha256:e748d4cf8fef2526bb2a589a417eba0c8674e29ffcb570ce2ceca44f1e567bf6", size = 3939264, upload-time = "2025-09-22T04:04:32.892Z" },
{ url = "https://files.pythonhosted.org/packages/f5/5a/1ab260c00adf645d8bf7dec7f920f744b032f69130c681302821d5debea6/lxml-6.0.2-pp310-pypy310_pp73-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:4ddb1049fa0579d0cbd00503ad8c58b9ab34d1254c77bc6a5576d96ec7853dba", size = 4216435, upload-time = "2025-09-22T04:04:34.907Z" },
{ url = "https://files.pythonhosted.org/packages/f2/37/565f3b3d7ffede22874b6d86be1a1763d00f4ea9fc5b9b6ccb11e4ec8612/lxml-6.0.2-pp310-pypy310_pp73-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:cb233f9c95f83707dae461b12b720c1af9c28c2d19208e1be03387222151daf5", size = 4325913, upload-time = "2025-09-22T04:04:37.205Z" },
{ url = "https://files.pythonhosted.org/packages/22/ec/f3a1b169b2fb9d03467e2e3c0c752ea30e993be440a068b125fc7dd248b0/lxml-6.0.2-pp310-pypy310_pp73-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:bc456d04db0515ce3320d714a1eac7a97774ff0849e7718b492d957da4631dd4", size = 4269357, upload-time = "2025-09-22T04:04:39.322Z" },
{ url = "https://files.pythonhosted.org/packages/77/a2/585a28fe3e67daa1cf2f06f34490d556d121c25d500b10082a7db96e3bcd/lxml-6.0.2-pp310-pypy310_pp73-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:2613e67de13d619fd283d58bda40bff0ee07739f624ffee8b13b631abf33083d", size = 4412295, upload-time = "2025-09-22T04:04:41.647Z" },
{ url = "https://files.pythonhosted.org/packages/7b/d9/a57dd8bcebd7c69386c20263830d4fa72d27e6b72a229ef7a48e88952d9a/lxml-6.0.2-pp310-pypy310_pp73-win_amd64.whl", hash = "sha256:24a8e756c982c001ca8d59e87c80c4d9dcd4d9b44a4cbeb8d9be4482c514d41d", size = 3516913, upload-time = "2025-09-22T04:04:43.602Z" },
{ url = "https://files.pythonhosted.org/packages/0b/11/29d08bc103a62c0eba8016e7ed5aeebbf1e4312e83b0b1648dd203b0e87d/lxml-6.0.2-pp311-pypy311_pp73-macosx_10_15_x86_64.whl", hash = "sha256:1c06035eafa8404b5cf475bb37a9f6088b0aca288d4ccc9d69389750d5543700", size = 3949829, upload-time = "2025-09-22T04:04:45.608Z" },
{ url = "https://files.pythonhosted.org/packages/12/b3/52ab9a3b31e5ab8238da241baa19eec44d2ab426532441ee607165aebb52/lxml-6.0.2-pp311-pypy311_pp73-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:c7d13103045de1bdd6fe5d61802565f1a3537d70cd3abf596aa0af62761921ee", size = 4226277, upload-time = "2025-09-22T04:04:47.754Z" },
{ url = "https://files.pythonhosted.org/packages/a0/33/1eaf780c1baad88224611df13b1c2a9dfa460b526cacfe769103ff50d845/lxml-6.0.2-pp311-pypy311_pp73-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:0a3c150a95fbe5ac91de323aa756219ef9cf7fde5a3f00e2281e30f33fa5fa4f", size = 4330433, upload-time = "2025-09-22T04:04:49.907Z" },
{ url = "https://files.pythonhosted.org/packages/7a/c1/27428a2ff348e994ab4f8777d3a0ad510b6b92d37718e5887d2da99952a2/lxml-6.0.2-pp311-pypy311_pp73-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:60fa43be34f78bebb27812ed90f1925ec99560b0fa1decdb7d12b84d857d31e9", size = 4272119, upload-time = "2025-09-22T04:04:51.801Z" },
{ url = "https://files.pythonhosted.org/packages/f0/d0/3020fa12bcec4ab62f97aab026d57c2f0cfd480a558758d9ca233bb6a79d/lxml-6.0.2-pp311-pypy311_pp73-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:21c73b476d3cfe836be731225ec3421fa2f048d84f6df6a8e70433dff1376d5a", size = 4417314, upload-time = "2025-09-22T04:04:55.024Z" },
{ url = "https://files.pythonhosted.org/packages/6c/77/d7f491cbc05303ac6801651aabeb262d43f319288c1ea96c66b1d2692ff3/lxml-6.0.2-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:27220da5be049e936c3aca06f174e8827ca6445a4353a1995584311487fc4e3e", size = 3518768, upload-time = "2025-09-22T04:04:57.097Z" },
]
[[package]]
name = "markdown-it-py"
version = "4.0.0"
version = "0.1.0"
source = { editable = "." }
dependencies = [
{ name = "beautifulsoup4" },
{ name = "fastapi" },
{ name = "lxml" },
{ name = "numpy", version = "2.2.6", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.11'" },
{ name = "numpy", version = "2.4.4", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.11'" },
{ name = "openai" },
[package.metadata]
requires-dist = [
{ name = "beautifulsoup4", specifier = ">=4.14.3" },
{ name = "fastapi", specifier = ">=0.100.0" },
{ name = "lxml", specifier = ">=6.0.2" },
{ name = "numpy", specifier = ">=1.24.0" },
{ name = "openai", specifier = ">=1.0.0" },
{ name = "openenv-core", extras = ["core"], specifier = ">=0.2.2" },
{ url = "https://files.pythonhosted.org/packages/e9/44/75a9c9421471a6c4805dbf2356f7c181a29c1879239abab1ea2cc8f38b40/sniffio-1.3.1-py3-none-any.whl", hash = "sha256:2f6da418d1f1e0fddd844478f41680e794e6051915791a034ff65e5f100525a2", size = 10235, upload-time = "2024-02-25T23:20:01.196Z" },
]
[[package]]
name = "soupsieve"
version = "2.8.3"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/7b/ae/2d9c981590ed9999a0d91755b47fc74f74de286b0f5cee14c9269041e6c4/soupsieve-2.8.3.tar.gz", hash = "sha256:3267f1eeea4251fb42728b6dfb746edc9acaffc4a45b27e19450b676586e8349", size = 118627, upload-time = "2026-01-20T04:27:02.457Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/46/2c/1462b1d0a634697ae9e55b3cecdcb64788e8b7d63f54d923fcd0bb140aed/soupsieve-2.8.3-py3-none-any.whl", hash = "sha256:ed64f2ba4eebeab06cc4962affce381647455978ffc1e36bb79a545b91f45a95", size = 37016, upload-time = "2026-01-20T04:27:01.012Z" },
]
[[package]]
name = "sse-starlette"
version = "3.3.4"