Ikshitha Janarthanan committed on
Commit 3e3afc7 · 1 Parent(s): b1f8065

feat: task 3 enhanced
Files changed (5)
  1. README.md +155 -5
  2. app.py +4 -2
  3. inference.py +56 -9
  4. models.py +54 -13
  5. requirements.txt +15 -5
README.md CHANGED
@@ -28,6 +28,8 @@ with a fully open, dataset-calibrated simulation grounded in:
  | [MIND](https://msnews.github.io/) (Microsoft News) | CTR calibration + headline catalog |
  | [iPinYou RTB](https://contest.ipinyou.com/) | Competitor bid distributions (Lognormal/hour) |
  | [Vogue Dialogue](https://github.com/aimagelab/Vogue-Dialogue) | User persona bank |

  All datasets are **optional** — the environment falls back to published
  statistics so it runs out-of-the-box with zero downloads.
@@ -41,7 +43,8 @@ class Action(BaseModel):
      bid_price: float    # USD bid for the RTB auction (≥ 0)
      headline_id: int    # Index into the 6-slot headlines catalog (0–5)
      creative_id: int    # Index into the 6-slot creatives catalog (0–5)
-     generated_caption: str | None  # Free-text caption (hard_assembly only)
  ```

  ## Observation Space
@@ -60,6 +63,11 @@ class Observation(BaseModel):
      carryover_boost: float     # Brand-recall CTR boost [0, 0.30]
      last_ctr: float            # Previous step CTR
      cumulative_revenue: float  # Total revenue earned
  ```

  ## Reward Signal
@@ -69,6 +77,7 @@ class Observation(BaseModel):
  | Auction **won** | `adjusted_ctr × $15 − clearing_price` |
  | Auction **lost** | `−$0.10` (missed opportunity) |
  | Over-pacing (medium only) | `−$1.00` penalty |

  Rewards are **per-step** (not sparse), providing a continuous gradient signal.
@@ -84,9 +93,33 @@ Rewards are **per-step** (not sparse), providing continuous gradient signal.
  **Objective:** Pace $50 across 24 hours; retain ≥ 20% for peak hours (18–22).
  **Budget:** $50 | **Grader:** `0.3×smoothness + 0.3×peak_survival + 0.4×revenue` | **Target:** 0.70

- ### Level 3 — `hard_assembly` (Hard)
- **Objective:** Generate captions aligned with the viral trend AND win auctions.
- **Budget:** $100 | **Grader:** `0.6×cosine_sim + 0.4×revenue_factor` | **Target:** 0.65

  ### Level 4 — `hard_sequencing` (Hard)
  **Objective:** Plan 24-hour ad placements with carry-over brand-recall boosts.
@@ -96,6 +129,102 @@ a 20% diversity bonus.

  ---

  ## Setup & Usage

  ### Prerequisites
@@ -147,6 +276,7 @@ The inference script emits standardised `[START]`/`[STEP]`/`[END]` logs to stdout.
  | `LOCAL_IMAGE_NAME` | Yes (inference) | Docker image name |
  | `AUCTIONEER_TASK` | No | Task to run (default: `all`) |
  | `MIND_SOURCE` | No | `local` / `huggingface` / `azure` |
  | `USE_LLM_SIMULATOR` | No | Set `1` to enable Llama-3 User Simulator |

  ---
@@ -157,7 +287,7 @@ The inference script emits standardised `[START]`/`[STEP]`/`[END]` logs to stdout.
  |------|---------------|-------|
  | `easy_headline` | 0.55 – 0.80 | Context→headline matching is learnable |
  | `medium_pacing` | 0.45 – 0.70 | Requires budget discipline |
- | `hard_assembly` | 0.40 – 0.65 | Caption quality + auction wins |
  | `hard_sequencing` | 0.35 – 0.60 | Compared against DP oracle |

  Scores depend on LLM quality and market stochasticity. Run multiple episodes
@@ -170,6 +300,19 @@ for stable estimates.
  ```
  ├── models.py       # Pydantic models: Action, Observation, Reward, Info
  ├── environment.py  # OpenEnvAuctioneer + graders + dataset layers
  ├── app.py          # FastAPI server (runs inside Docker)
  ├── inference.py    # Baseline inference script (mandatory format)
  ├── openenv.yaml    # OpenEnv metadata & task definitions
@@ -179,6 +322,13 @@ for stable estimates.
  └── Datasets/       # Optional dataset mount point
  ```

  ## License

  MIT

  | [MIND](https://msnews.github.io/) (Microsoft News) | CTR calibration + headline catalog |
  | [iPinYou RTB](https://contest.ipinyou.com/) | Competitor bid distributions (Lognormal/hour) |
  | [Vogue Dialogue](https://github.com/aimagelab/Vogue-Dialogue) | User persona bank |
+ | [MS-COCO Captions 2017](https://cocodataset.org/) | Ad + caption pool for `hard_assembly` |
+ | [Google Trends](https://github.com/GeneralMills/pytrends) / [Reddit](https://www.reddit.com/) | Live viral hashtag scraping |

  All datasets are **optional** — the environment falls back to published
  statistics so it runs out-of-the-box with zero downloads.
 
      bid_price: float    # USD bid for the RTB auction (≥ 0)
      headline_id: int    # Index into the 6-slot headlines catalog (0–5)
      creative_id: int    # Index into the 6-slot creatives catalog (0–5)
+     generated_caption: str | None         # [hard_assembly] Rewritten caption with viral hashtags
+     generated_hashtags: list[str] | None  # [hard_assembly] Chosen hashtags (e.g. ["#QuietLuxury", "#OOTD"])
  ```

  ## Observation Space
 
      carryover_boost: float     # Brand-recall CTR boost [0, 0.30]
      last_ctr: float            # Previous step CTR
      cumulative_revenue: float  # Total revenue earned
+
+     # hard_assembly only:
+     live_hashtags: list[str]   # Real-time scraped viral hashtags
+     image_description: str     # Source ad image description
+     base_caption: str          # Base caption to rewrite
  ```

  ## Reward Signal
 
  | Auction **won** | `adjusted_ctr × $15 − clearing_price` |
  | Auction **lost** | `−$0.10` (missed opportunity) |
  | Over-pacing (medium only) | `−$1.00` penalty |
+ | Assembly bonus (hard_assembly) | `+composite_score × $8.00` |

  Rewards are **per-step** (not sparse), providing a continuous gradient signal.
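The reward table maps directly onto a small per-step function. The sketch below is illustrative only; the helper name and flag arguments are ours, not the environment's actual API:

```python
def step_reward(won: bool, adjusted_ctr: float = 0.0,
                clearing_price: float = 0.0, over_paced: bool = False,
                composite_score: float = 0.0) -> float:
    """Per-step reward mirroring the table above (hypothetical helper)."""
    if not won:
        return -0.10                                   # missed-opportunity penalty
    reward = adjusted_ctr * 15.0 - clearing_price      # expected revenue minus cost
    if over_paced:
        reward -= 1.00                                 # medium_pacing over-spend penalty
    reward += composite_score * 8.00                   # hard_assembly bonus (0 elsewhere)
    return reward
```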

  **Objective:** Pace $50 across 24 hours; retain ≥ 20% for peak hours (18–22).
  **Budget:** $50 | **Grader:** `0.3×smoothness + 0.3×peak_survival + 0.4×revenue` | **Target:** 0.70

+ ### Level 3 — `hard_assembly` (Hard) 🔥
+ **Objective:** Given an ad image description + base caption + live viral hashtags,
+ **generate a new caption** that is simultaneously viral, coherent with the image,
+ and creatively novel — while also winning auctions profitably.
+
+ **Budget:** $120 | **Target:** 0.65
+
+ **The RL loop (what the LLM agent does each step):**
+ ```
+ 1. Agent receives: image_description, base_caption, live_hashtags[], viral_trend
+ 2. Agent must:
+    a. Select 2–4 relevant hashtags from live_hashtags (scraped from Google Trends / Reddit)
+    b. Rewrite the base caption to weave those hashtags into natural ad copy
+    c. Add its own creative words (target 30–50% novel vocabulary)
+    d. Keep the caption coherent with the source image
+    e. Set a profitable bid price
+ 3. Grader scores the assembled caption on 4 axes:
+    • 35% — Hashtag relevance (cosine_sim of each hashtag vs viral_trend)
+    • 35% — Caption-trend alignment (cosine_sim of caption vs viral_trend)
+    • 20% — Caption-image coherence (cosine_sim of caption vs image_description)
+    • 10% — Novelty (fraction of new words vs base_caption, target ~40%)
+ 4. Reward = auction_reward + composite_score × $8.00 bonus
+ ```
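The novelty axis in step 3 can be made concrete with a word-level sketch. This assumes plain whitespace tokenisation; the actual grader may tokenise differently:

```python
def novelty_score(base_caption: str, final_caption: str) -> float:
    """Fraction of words in the final caption that do not appear in the
    base caption, rewarded for landing near the 40% target via
    1 - |novel_fraction - 0.40| / 0.60 (clamped at 0)."""
    base_words = set(base_caption.lower().split())
    final_words = [w.lower() for w in final_caption.split()]
    if not final_words:
        return 0.0
    novel_fraction = sum(w not in base_words for w in final_words) / len(final_words)
    return max(0.0, 1.0 - abs(novel_fraction - 0.40) / 0.60)
```

A caption with exactly 40% new vocabulary scores 1.0 on this axis; a verbatim copy of the base caption still scores 1/3, so novelty alone never dominates.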
+
+ **Data sources for hard_assembly:**
+ - **Ad creatives**: MS-COCO Captions 2017 (val annotations) bucketed into Fitness/Tech/Fashion/Gaming by keyword matching. Falls back to a 30-entry built-in seed pool.
+ - **Viral hashtags**: `ViralHashtagScraper` queries Google Trends (via `pytrends`) and Reddit `/r/popular/hot.json` (public, no auth). Blends with static seed hashtags per context and trend. Cached for 1 hour.

  ### Level 4 — `hard_sequencing` (Hard)
  **Objective:** Plan 24-hour ad placements with carry-over brand-recall boosts.

  ---

+ ## Grading Details
+
+ ### `EasyHeadlineGrader`
+ ```
+ step_score  = CTR_selected / CTR_oracle
+ final_score = mean(step_scores)   // in [0.0, 1.0]
+ ```
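As a sketch (function name hypothetical, not the grader's real signature), the per-episode aggregation is just a mean of per-step CTR ratios:

```python
from statistics import mean

def easy_headline_score(selected_ctrs, oracle_ctrs) -> float:
    """Ratio of the chosen headline's CTR to the best available CTR
    for that context, averaged over all steps of the episode."""
    steps = [s / o for s, o in zip(selected_ctrs, oracle_ctrs) if o > 0]
    return mean(steps) if steps else 0.0
```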
+
+ ### `MediumPacingGrader`
+ ```
+ smoothness     = 1 − mean(|hourly_spend − ideal_spend| / ideal_spend)
+ peak_survival  = 1.0 if remaining_budget ≥ 20% at hour 18, else 0.0
+ revenue_factor = min(1.0, total_revenue / $30)
+
+ final_score = 0.30 × smoothness + 0.30 × peak_survival + 0.40 × revenue_factor
+ ```
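The three pacing components combine as a straightforward weighted sum; a minimal sketch (names and defaults are ours) assuming an even $50/24 ideal spend:

```python
from statistics import mean

def pacing_score(hourly_spend, budget=50.0, total_revenue=0.0,
                 remaining_at_18=0.0) -> float:
    """Combine smoothness, peak survival, and revenue with the
    documented 0.30 / 0.30 / 0.40 weights."""
    ideal = budget / 24.0
    smoothness = max(0.0, 1.0 - mean(abs(s - ideal) / ideal for s in hourly_spend))
    peak_survival = 1.0 if remaining_at_18 >= 0.20 * budget else 0.0
    revenue_factor = min(1.0, total_revenue / 30.0)
    return 0.30 * smoothness + 0.30 * peak_survival + 0.40 * revenue_factor
```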
+
+ ### `HardAssemblyGrader` — 4-Axis Composite
+
+ | Axis | Weight | Metric |
+ |------|--------|--------|
+ | Hashtag Relevance | 0.35 | `mean(cosine_sim(hashtag, viral_trend))` |
+ | Caption-Trend Alignment | 0.35 | `cosine_sim(caption, viral_trend)` |
+ | Caption-Image Coherence | 0.20 | `cosine_sim(caption, image_description)` |
+ | Novelty | 0.10 | `1 − |novel_fraction − 0.40| / 0.60` |
+
+ ```
+ composite = Σ (weight × axis_score)
+
+ final_score = 0.60 × mean(composite_scores)
+             + 0.40 × min(1.0, total_revenue / $55)
+ ```
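The composite and the episode-level score reduce to a few lines; this is a sketch under the assumption that each axis score has already been computed in [0, 1] (the cosine-similarity calls themselves are omitted):

```python
WEIGHTS = {"hashtag": 0.35, "trend": 0.35, "coherence": 0.20, "novelty": 0.10}

def assembly_composite(axis_scores: dict) -> float:
    """Weighted sum of the four axis scores (each already in [0, 1])."""
    return sum(WEIGHTS[axis] * axis_scores[axis] for axis in WEIGHTS)

def assembly_final(composite_scores, total_revenue: float) -> float:
    """Episode score: 60% mean composite + 40% capped revenue factor."""
    mean_composite = sum(composite_scores) / len(composite_scores)
    return 0.60 * mean_composite + 0.40 * min(1.0, total_revenue / 55.0)
```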
+
+ ### `HardSequencingGrader`
+ ```
+ agent_conversions  = Σ [CTR_t × (1 + carryover_boost_t) × $15]
+ oracle_conversions = DP-optimal bid/skip sequence with carry-over
+
+ diversity_mult = 1.20 if ≥3 distinct contexts won, else 1.0
+
+ final_score = min(1.0, agent_conv / oracle_conv × diversity_mult)
+ ```
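A toy version of the DP oracle can be written in a few lines. The carry-over dynamics here are assumptions for illustration, not the repo's exact rules: a win raises a capped boost that decays every hour, and the budget is modelled simply as a maximum number of wins:

```python
from functools import lru_cache

def oracle_conversions(ctrs, costs, max_wins, boost=0.10, decay=0.5, cap=0.30):
    """DP sketch of the bid/skip oracle under assumed carry-over dynamics:
    winning hour t earns ctrs[t] * (1 + carry) * 15 - costs[t] and raises
    the boost by `boost` (capped at `cap`); the boost halves each hour."""
    n = len(ctrs)

    @lru_cache(maxsize=None)
    def best(t, wins_left, carry):
        if t == n:
            return 0.0
        decayed = round(carry * decay, 4)
        value = best(t + 1, wins_left, decayed)            # option 1: skip
        if wins_left > 0:                                  # option 2: bid & win
            gain = ctrs[t] * (1.0 + carry) * 15.0 - costs[t]
            after = round(min(cap, carry * decay + boost), 4)
            value = max(value, gain + best(t + 1, wins_left - 1, after))
        return value

    return best(0, max_wins, 0.0)
```

Rounding the carry state keeps the memo table small; with 24 hours and a coarse boost grid the DP is exact enough to serve as a stable denominator for `final_score`.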
+
+ ---
+
+ ## Architecture
+
+ ```
+ ┌──────────────────────────────────────────────────────────────┐
+ │ OpenEnvAuctioneer (Gym-style environment)                    │
+ │                                                              │
+ │  ┌──────────────────┐   ┌─────────────────────────────┐      │
+ │  │ Market Engine    │   │ User Simulator              │      │
+ │  │ (Statistical)    │   │ (Semantic / LLM)            │      │
+ │  │                  │   │                             │      │
+ │  │ iPinYou RTB logs │   │ SentenceTransformer         │      │
+ │  │ → Lognormal per  │   │ all-MiniLM-L6-v2            │      │
+ │  │   hour bucket    │   │ + optional Llama-3-8B       │      │
+ │  └──────────────────┘   └─────────────────────────────┘      │
+ │                                                              │
+ │  ┌────────────────────────────────────────────────────────┐  │
+ │  │ MIND Dataset Layer (Microsoft News Dataset)            │  │
+ │  │   behaviours.tsv → CTRCalibrator                       │  │
+ │  │   news.tsv       → MINDCreativePool (headlines)        │  │
+ │  └────────────────────────────────────────────────────────┘  │
+ │                                                              │
+ │  ┌────────────────────────────────────────────────────────┐  │
+ │  │ Ad + Caption Dataset (MS-COCO Captions 2017)           │  │
+ │  │   → image_description + base_caption per step          │  │
+ │  │   → ViralHashtagScraper (pytrends + Reddit + seeds)    │  │
+ │  │   → agent rewrites caption with viral hashtags         │  │
+ │  └────────────────────────────────────────────────────────┘  │
+ │                                                              │
+ │  ┌────────────────────────────────────────────────────────┐  │
+ │  │ Grader (task-specific, deterministic 0.0–1.0)          │  │
+ │  │   Level 1: easy_headline    → headline CTR lookup      │  │
+ │  │   Level 2: medium_pacing    → pacing + survival        │  │
+ │  │   Level 3: hard_assembly    → 4-axis composite score   │  │
+ │  │   Level 4: hard_sequencing  → DP oracle comparison     │  │
+ │  └────────────────────────────────────────────────────────┘  │
+ └──────────────────────────────────────────────────────────────┘
+ ```
+
+ ---
+
+ ## Models
+
+ | Model | Role | Always Active? |
+ |-------|------|----------------|
+ | `all-MiniLM-L6-v2` (SentenceTransformer) | Semantic CTR scoring + grader cosine similarity | ✅ Yes |
+ | `Meta-Llama-3-8B-Instruct` (4-bit) | Richer LLM-based CTR scoring | ❌ Optional (`USE_LLM_SIMULATOR=1`) |
+
+ When the LLM simulator is active: `final_ctr = 0.60 × llm_ctr + 0.40 × semantic_ctr`
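The blend is a one-liner; a sketch with clamping (the function name is ours, and the clamp is an assumption for safety rather than documented behaviour):

```python
def blended_ctr(llm_ctr: float, semantic_ctr: float, use_llm: bool = True) -> float:
    """Blend the two user-simulator CTR estimates; with the LLM disabled,
    the semantic score is used alone. Clamped to a valid probability."""
    ctr = 0.60 * llm_ctr + 0.40 * semantic_ctr if use_llm else semantic_ctr
    return min(1.0, max(0.0, ctr))
```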
+
+ ---
+
  ## Setup & Usage

  ### Prerequisites
 
  | `LOCAL_IMAGE_NAME` | Yes (inference) | Docker image name |
  | `AUCTIONEER_TASK` | No | Task to run (default: `all`) |
  | `MIND_SOURCE` | No | `local` / `huggingface` / `azure` |
+ | `COCO_SOURCE` | No | `local` / `url` (auto-download COCO annotations) |
  | `USE_LLM_SIMULATOR` | No | Set `1` to enable Llama-3 User Simulator |

  ---
 
  |------|---------------|-------|
  | `easy_headline` | 0.55 – 0.80 | Context→headline matching is learnable |
  | `medium_pacing` | 0.45 – 0.70 | Requires budget discipline |
+ | `hard_assembly` | 0.40 – 0.65 | Caption quality + hashtag matching + auction wins |
  | `hard_sequencing` | 0.35 – 0.60 | Compared against DP oracle |

  Scores depend on LLM quality and market stochasticity. Run multiple episodes
 
  ```
  ├── models.py       # Pydantic models: Action, Observation, Reward, Info
  ├── environment.py  # OpenEnvAuctioneer + graders + dataset layers
+ │   ├── MINDLoader            # MIND dataset loader (HF / Azure / local)
+ │   ├── MarketCalibrator      # iPinYou-based auction price simulator
+ │   ├── CTRCalibrator         # MIND-based CTR lookup tables
+ │   ├── MINDCreativePool      # 6-slot headline/creative catalog from news.tsv
+ │   ├── PersonaBank           # Vogue Dialogue persona sampling
+ │   ├── ViralHashtagScraper   # Live hashtag scraping (pytrends + Reddit)
+ │   ├── AdCaptionDataset      # COCO-based ad image+caption pool
+ │   ├── UserSimulator         # Semantic + optional LLM CTR scoring
+ │   ├── EasyHeadlineGrader    # Level 1 grader
+ │   ├── MediumPacingGrader    # Level 2 grader
+ │   ├── HardAssemblyGrader    # Level 3 grader (4-axis composite)
+ │   ├── HardSequencingGrader  # Level 4 grader (DP oracle)
+ │   └── OpenEnvAuctioneer     # Main Gym-style env class
  ├── app.py          # FastAPI server (runs inside Docker)
  ├── inference.py    # Baseline inference script (mandatory format)
  ├── openenv.yaml    # OpenEnv metadata & task definitions
  └── Datasets/       # Optional dataset mount point
  ```
 
+ ## References
+
+ 1. **MIND**: Wu et al. (2020) — *"MIND: A Large-scale Dataset for News Recommendation"*, ACL 2020. [msnews.github.io](https://msnews.github.io/)
+ 2. **iPinYou RTB**: Zhang et al. (2014) — *"Real-Time Bidding Benchmarking with iPinYou Dataset"*. [contest.ipinyou.com](https://contest.ipinyou.com/)
+ 3. **MS-COCO Captions**: Lin et al. (2014) — *"Microsoft COCO: Common Objects in Context"*. [cocodataset.org](https://cocodataset.org/)
+ 4. **SentenceTransformers**: Reimers & Gurevych (2019) — *"Sentence-BERT"*. [sbert.net](https://www.sbert.net/)
+
  ## License

  MIT
app.py CHANGED
@@ -8,7 +8,7 @@ Runs inside the Docker container and exposes HTTP endpoints:
      GET /health → liveness check
  """

- from typing import Optional

  from fastapi import FastAPI, Query
  from pydantic import BaseModel
@@ -16,7 +16,7 @@ from pydantic import BaseModel
  from environment import OpenEnvAuctioneer
  from models import Action

- app = FastAPI(title="OpenEnv Creative Auctioneer", version="0.3.0")

  # ---------------------------------------------------------------------------
  # Global environment instance (one per container)
@@ -33,6 +33,7 @@ class StepRequest(BaseModel):
      headline_id: int
      creative_id: int
      generated_caption: Optional[str] = None


  class ResetResponse(BaseModel):
@@ -75,6 +76,7 @@ def step_env(action: StepRequest):
          headline_id=action.headline_id,
          creative_id=action.creative_id,
          generated_caption=action.generated_caption,
      )
      obs, reward, done, info = _env.step(act)
      return StepResponse(
      GET /health → liveness check
  """

+ from typing import List, Optional

  from fastapi import FastAPI, Query
  from pydantic import BaseModel
 
  from environment import OpenEnvAuctioneer
  from models import Action

+ app = FastAPI(title="OpenEnv Creative Auctioneer", version="0.4.0")

  # ---------------------------------------------------------------------------
  # Global environment instance (one per container)
 
      headline_id: int
      creative_id: int
      generated_caption: Optional[str] = None
+     generated_hashtags: Optional[List[str]] = None


  class ResetResponse(BaseModel):
 
          headline_id=action.headline_id,
          creative_id=action.creative_id,
          generated_caption=action.generated_caption,
+         generated_hashtags=action.generated_hashtags,
      )
      obs, reward, done, info = _env.step(act)
      return StepResponse(
inference.py CHANGED
@@ -73,9 +73,27 @@ SYSTEM_PROMPTS: Dict[str, str] = {
      If budget < $5 before hour 18, bid $0."""),

  "hard_assembly": textwrap.dedent("""\
-     You optimise for VIRAL TREND ALIGNMENT. Generate a short caption
-     (<=12 words) that aligns with the viral trend AND context.
-     Score = 60% cosine-similarity + 40% revenue. Bid $0.60-$1.50."""),

  "hard_sequencing": textwrap.dedent("""\
      You focus on CROSS-CONTEXT CAMPAIGN SEQUENCING.
@@ -184,12 +202,38 @@ def build_user_prompt(task_id: str, obs: dict) -> str:
      ]
      if task_id == "hard_sequencing":
          lines.append(f"Carryover boost: {obs.get('carryover_boost', 0):.2f}")
-     lines.append(CATALOG_CTX)
-     schema = '{"bid_price": <float>, "headline_id": <int 0-5>, "creative_id": <int 0-5>'
      if task_id == "hard_assembly":
-         schema += ', "generated_caption": "<caption>"'
-     schema += "}"
-     lines.append(f"Respond ONLY with JSON: {schema}")
      return "\n".join(lines)

@@ -239,6 +283,7 @@ async def run_task(task_id: str, image_name: str) -> float:
          headline_id=int(action_data.get("headline_id", 0)),
          creative_id=int(action_data.get("creative_id", 0)),
          generated_caption=action_data.get("generated_caption"),
      )

      result = await env.step(action)
@@ -251,6 +296,8 @@ async def run_task(task_id: str, image_name: str) -> float:
      act_str = f"bid({action.bid_price:.2f},h={action.headline_id},c={action.creative_id})"
      if action.generated_caption:
          act_str += f",cap={action.generated_caption[:25]}"

      log_step(step=step, action=act_str, reward=reward,
               done=result.done, error=None)
@@ -302,4 +349,4 @@ async def main() -> None:


  if __name__ == "__main__":
-     asyncio.run(main())
      If budget < $5 before hour 18, bid $0."""),

  "hard_assembly": textwrap.dedent("""\
+     You are an AI Account Manager and Creative Director for the hard_assembly task.
+
+     YOUR JOB each step:
+     1. You receive a SOURCE AD CREATIVE: an image description + a base caption.
+     2. You receive LIVE VIRAL HASHTAGS scraped from Google Trends / Reddit.
+     3. You receive the current VIRAL TREND token (cultural keyword).
+     4. You must ASSEMBLE a final ad by:
+        (a) Selecting 2–4 hashtags from the live list that best match the trend.
+        (b) Rewriting the base caption to weave those hashtags into natural, punchy
+            ad copy — DO NOT just append hashtags at the end. Blend them into prose.
+        (c) Adding your own creative words (target 30–50% new vocabulary).
+        (d) The final caption must stay coherent with the image description.
+
+     GRADER weights (what earns you points):
+        35% — Hashtag relevance: chosen hashtags semantically match viral_trend
+        35% — Caption-trend align: your caption text matches viral_trend vocabulary
+        20% — Image coherence: your caption stays faithful to the image
+        10% — Novelty: you added real creative words, not just copy-paste
+
+     REWARD: auction_base + (composite_score × $8 bonus per winning step)
+     BUDGET: $120 for 24 hours. Bid $0.60–$1.50 per step."""),

  "hard_sequencing": textwrap.dedent("""\
      You focus on CROSS-CONTEXT CAMPAIGN SEQUENCING.
 
      ]
      if task_id == "hard_sequencing":
          lines.append(f"Carryover boost: {obs.get('carryover_boost', 0):.2f}")
+
      if task_id == "hard_assembly":
+         # Show source creative and live hashtags
+         img_desc = obs.get("image_description", "")
+         base_cap = obs.get("base_caption", "")
+         live_tags = obs.get("live_hashtags", [])
+         hashtag_list = " ".join(live_tags) if live_tags else "(none scraped)"
+         lines.append("")
+         lines.append("━━━━━ SOURCE CREATIVE ━━━━━")
+         lines.append(f"Image description : {img_desc}")
+         lines.append(f"Base caption      : {base_cap}")
+         lines.append("")
+         lines.append("━━━━━ LIVE VIRAL HASHTAGS (scraped now) ━━━━━")
+         lines.append(f"  {hashtag_list}")
+         lines.append("")
+         lines.append("━━━━━ TASK ━━━━━")
+         lines.append(f"Select 2–4 hashtags from the list above that best match "
+                      f"the viral trend '{obs['viral_trend']}'.")
+         lines.append("Rewrite the base caption to weave them in naturally.")
+         lines.append("Stay coherent with the image. Add your own creative words.")
+         lines.append("")
+         schema = ('Respond ONLY with JSON:\n'
+                   '{"bid_price": <float>, "headline_id": <int 0-5>, "creative_id": <int 0-5>, '
+                   '"generated_caption": "<your caption>", '
+                   '"generated_hashtags": ["#Tag1", "#Tag2", ...]}')
+     else:
+         lines.append(CATALOG_CTX)
+         schema = ('Respond ONLY with JSON: '
+                   '{"bid_price": <float>, "headline_id": <int 0-5>, "creative_id": <int 0-5>}')
+
+     lines.append(schema)
      return "\n".join(lines)
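The prompt instructs the model to respond with JSON only, but LLM replies often carry stray prose around the object. A defensive parser along these lines (helper name hypothetical, not part of this diff) keeps a malformed reply from crashing the episode by falling back to a zero-bid action:

```python
import json
import re

def parse_action_json(raw: str) -> dict:
    """Extract the first {...} block from an LLM reply and parse it,
    falling back to a safe zero-bid action if parsing fails."""
    match = re.search(r"\{.*\}", raw, flags=re.DOTALL)
    if match:
        try:
            return json.loads(match.group(0))
        except json.JSONDecodeError:
            pass
    return {"bid_price": 0.0, "headline_id": 0, "creative_id": 0}
```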
          headline_id=int(action_data.get("headline_id", 0)),
          creative_id=int(action_data.get("creative_id", 0)),
          generated_caption=action_data.get("generated_caption"),
+         generated_hashtags=action_data.get("generated_hashtags"),
      )

      result = await env.step(action)
 
      act_str = f"bid({action.bid_price:.2f},h={action.headline_id},c={action.creative_id})"
      if action.generated_caption:
          act_str += f",cap={action.generated_caption[:25]}"
+     if action.generated_hashtags:
+         act_str += f",tags={len(action.generated_hashtags)}"

      log_step(step=step, action=act_str, reward=reward,
               done=result.done, error=None)
 


  if __name__ == "__main__":
+     asyncio.run(main())
models.py CHANGED
@@ -4,13 +4,15 @@ models.py — Typed data contracts for the OpenEnv Creative Auctioneer.
  All tensors / vectors are represented as plain Python types so the environment
  stays framework-agnostic (no hard dependency on PyTorch at this layer).

- Dataset provenance (v0.3):
      CTR calibration → MIND (Microsoft News Dataset) behaviours.tsv + news.tsv
      Market engine   → iPinYou Global RTB logs (Lognormal per hour)
      Persona bank    → Vogue Dialogue Dataset
  """

- from typing import Optional
  from pydantic import BaseModel, Field
@@ -42,7 +44,7 @@ class Observation(BaseModel):
      # ── Contextual Signals (Privacy-Native — no user IDs) ──────────────────
      current_context: str = Field(...,
          description="Content category derived from MIND news.tsv taxonomy "
-                     "(e.g. 'sports', 'technology', 'lifestyle', 'entertainment').")
      news_category: str = Field(default="",
          description="Fine-grained MIND subcategory (e.g. 'nfl', 'gadgets'). "
                      "Provides richer signal than coarse context alone.")
@@ -50,6 +52,20 @@ class Observation(BaseModel):
          description="Current cultural viral token surfaced from Reels "
                      "(e.g. 'Quiet Luxury', 'Eco-Friendly', 'Cyberpunk', 'Minimalism').")

      # ── Market Signals ─────────────────────────────────────────────────────
      market_pressure: float = Field(default=0.5, ge=0.0, le=1.0,
          description="Normalised indicator of how competitive the auction is "
@@ -60,10 +76,10 @@ class Observation(BaseModel):
          description="Number of ads already shown; drives the fatigue penalty.")
      fatigue_level: float = Field(default=0.0, ge=0.0, le=1.0,
          description="Accumulated user-fatigue penalty (0 = fresh, 1 = fully fatigued).")

      # ── Performance Feedback (delayed by 1 step) ───────────────────────────
-     carryover_boost: float = Field(default=0.0, ge=0.0, le=0.30,
-         description="CTR boost from brand-recall carry-over (decaying from prior ad wins).")
      last_ctr: float = Field(default=0.0, ge=0.0, le=1.0,
          description="CTR returned by the User Simulator on the previous step.")
      cumulative_revenue: float = Field(default=0.0,
@@ -85,9 +101,18 @@ class Action(BaseModel):
      creative_id: int = Field(..., ge=0, le=5,
          description="Index into the Creatives Catalog (0–5).")

-     # Optional free-text caption (retained for backwards compatibility). No effect in hard_sequencing.
      generated_caption: Optional[str] = Field(default=None,
-         description="Free-form caption generated by the agent in old hard_assembly mode.")


  # ---------------------------------------------------------------------------
@@ -111,16 +136,32 @@ class Info(BaseModel):
      task_score: float = Field(..., ge=0.0, le=1.0,
          description="Final 0.0–1.0 task-completion score.")

-     # Level-specific sub-scores (populated per task_id)
      headline_alignment_score: float = Field(default=0.0, ge=0.0, le=1.0,
          description="[easy_headline] CTR_selected / CTR_best for this context.")
      pacing_score: float = Field(default=0.0, ge=0.0, le=1.0,
          description="[medium_pacing] Budget-smoothness and peak-hour survival bonus.")
      clip_similarity_score: float = Field(default=0.0, ge=0.0, le=1.0,
-         description="[hard_assembly] Cosine similarity between caption and viral token.")
      sequencing_score: float = Field(default=0.0, ge=0.0, le=1.0,
-         description="[hard_sequencing] Agent conversions / Oracle conversions × diversity bonus.")
-     contexts_covered: int = Field(default=0, ge=0, le=4,
-         description="[hard_sequencing] Number of distinct contexts that received ≥1 ad.")
      diversity_multiplier: float = Field(default=1.0,
-         description="[hard_sequencing] 1.2 if ≥3 contexts covered, else 1.0.")
  All tensors / vectors are represented as plain Python types so the environment
  stays framework-agnostic (no hard dependency on PyTorch at this layer).

+ Dataset provenance (v0.4):
      CTR calibration → MIND (Microsoft News Dataset) behaviours.tsv + news.tsv
      Market engine   → iPinYou Global RTB logs (Lognormal per hour)
      Persona bank    → Vogue Dialogue Dataset
+     Ad+Caption pool → MS-COCO Captions OR Google Conceptual Captions CC3M
+     Viral hashtags  → Pytrends / Hashtagify / static fallback table
  """

+ from typing import List, Optional
  from pydantic import BaseModel, Field

      # ── Contextual Signals (Privacy-Native — no user IDs) ──────────────────
      current_context: str = Field(...,
          description="Content category derived from MIND news.tsv taxonomy "
+                     "(e.g. 'Fitness', 'Tech', 'Fashion', 'Gaming').")
      news_category: str = Field(default="",
          description="Fine-grained MIND subcategory (e.g. 'nfl', 'gadgets'). "
                      "Provides richer signal than coarse context alone.")
 
          description="Current cultural viral token surfaced from Reels "
                      "(e.g. 'Quiet Luxury', 'Eco-Friendly', 'Cyberpunk', 'Minimalism').")

+     # ── hard_assembly: live scraped hashtags + source creative ─────────────
+     live_hashtags: List[str] = Field(default_factory=list,
+         description="[hard_assembly] Real-time scraped viral hashtags from "
+                     "Google Trends / Reddit. The agent selects which to use "
+                     "and weaves them into generated_caption. "
+                     "Example: ['#QuietLuxury', '#OOTD', '#SlowFashion'].")
+     image_description: str = Field(default="",
+         description="[hard_assembly] Text description of the source ad image "
+                     "from AdCaptionDataset (COCO or seed pool). "
+                     "Agent caption must stay coherent with this.")
+     base_caption: str = Field(default="",
+         description="[hard_assembly] Base caption from AdCaptionDataset. "
+                     "Agent rewrites this to incorporate viral hashtags.")
+
      # ── Market Signals ─────────────────────────────────────────────────────
      market_pressure: float = Field(default=0.5, ge=0.0, le=1.0,
          description="Normalised indicator of how competitive the auction is "
  description="Normalised indicator of how competitive the auction is "
 
          description="Number of ads already shown; drives the fatigue penalty.")
      fatigue_level: float = Field(default=0.0, ge=0.0, le=1.0,
          description="Accumulated user-fatigue penalty (0 = fresh, 1 = fully fatigued).")
+     carryover_boost: float = Field(default=0.0, ge=0.0, le=1.0,
+         description="[hard_sequencing] Carry-over CTR boost from winning prior auctions.")

      # ── Performance Feedback (delayed by 1 step) ───────────────────────────
      last_ctr: float = Field(default=0.0, ge=0.0, le=1.0,
          description="CTR returned by the User Simulator on the previous step.")
      cumulative_revenue: float = Field(default=0.0,
 
      creative_id: int = Field(..., ge=0, le=5,
          description="Index into the Creatives Catalog (0–5).")

+     # ── hard_assembly fields ────────────────────────────────────────────────
      generated_caption: Optional[str] = Field(default=None,
+         description="[hard_assembly] Final assembled caption — should incorporate "
+                     "viral hashtags and remain coherent with the source image. "
+                     "Leave None for easy/medium tasks.")
+
+     generated_hashtags: Optional[List[str]] = Field(default=None,
+         description="[hard_assembly] List of hashtag strings (with #) that the agent "
+                     "chose to include, selected from the scraped live_hashtags "
+                     "and woven into generated_caption. "
+                     "Example: ['#QuietLuxury', '#OOTD', '#SlowFashion']. "
+                     "Leave None for easy/medium/sequencing tasks.")


  # ---------------------------------------------------------------------------
 
      task_score: float = Field(..., ge=0.0, le=1.0,
          description="Final 0.0–1.0 task-completion score.")

+     # Level 1 sub-score
      headline_alignment_score: float = Field(default=0.0, ge=0.0, le=1.0,
          description="[easy_headline] CTR_selected / CTR_best for this context.")
+
+     # Level 2 sub-score
      pacing_score: float = Field(default=0.0, ge=0.0, le=1.0,
          description="[medium_pacing] Budget-smoothness and peak-hour survival bonus.")
+
+     # Level 3 sub-scores (all four axes)
      clip_similarity_score: float = Field(default=0.0, ge=0.0, le=1.0,
+         description="[hard_assembly] Composite grader score "
+                     "(0.35×hashtag + 0.35×align + 0.20×coherence + 0.10×novelty).")
+     hashtag_relevance_score: float = Field(default=0.0, ge=0.0, le=1.0,
+         description="[hard_assembly] Mean cosine_sim(chosen_hashtag, viral_trend).")
+     caption_trend_alignment: float = Field(default=0.0, ge=0.0, le=1.0,
+         description="[hard_assembly] cosine_sim(final_caption, viral_trend).")
+     caption_image_coherence: float = Field(default=0.0, ge=0.0, le=1.0,
+         description="[hard_assembly] cosine_sim(final_caption, image_description).")
+     chosen_hashtags: List[str] = Field(default_factory=list,
+         description="[hard_assembly] Hashtags the agent chose this step.")
+     assembly_reward_bonus: float = Field(default=0.0,
+         description="[hard_assembly] Extra reward granted for viral alignment quality.")
+
+     # Level 4 sub-scores
      sequencing_score: float = Field(default=0.0, ge=0.0, le=1.0,
+         description="[hard_sequencing] agent_conversions / oracle_conversions × diversity.")
+     contexts_covered: int = Field(default=0,
+         description="[hard_sequencing] Number of distinct contexts won at least once.")
      diversity_multiplier: float = Field(default=1.0,
+         description="[hard_sequencing] Bonus multiplier for covering ≥3 contexts.")
requirements.txt CHANGED
@@ -9,18 +9,28 @@ openai>=1.0.0
  sentence-transformers>=2.2.2
  torch>=2.0.0

- # MIND dataset — Option A (HuggingFace streaming, no local disk)
- # Standard library urllib is used for direct TSV download (no extra deps needed).
- # Uncomment below ONLY if you want the full HuggingFace datasets library instead:
  # datasets>=2.18.0

  # Optional: LLM-based User Simulator (activate with USE_LLM_SIMULATOR=1)
- # Uncomment if running on GPU with ≥16 GB VRAM
  # transformers>=4.40.0
  # bitsandbytes>=0.43.0
  # accelerate>=0.29.0

- # Optional: true CLIP scoring for hard_assembly grader
  # open-clip-torch>=2.24.0

  # Serving / inference client
 
  sentence-transformers>=2.2.2
  torch>=2.0.0

+ # ── hard_assembly: ViralHashtagScraper ──────────────────────────────────────
+ # Source 1: Google Trends (free, no API key)
+ pytrends>=4.9.2
+ # Source 2: Reddit public REST API — uses stdlib urllib (no extra dep needed)
+
+ # ── hard_assembly: AdCaptionDataset ─────────────────────────────────────────
+ # MS-COCO Captions 2017 val annotations (~241 MB)
+ # Set COCO_SOURCE=url to auto-download on first run (uses stdlib urllib + zipfile)
+ # Set COCO_SOURCE=local (default) and place captions_val2017.json at Datasets/coco_captions/
+
+ # ── MIND dataset — Option A: HuggingFace (zero local disk) ──────────────────
+ # Uses stdlib urllib for direct TSV download (no extra deps needed)
+ # Uncomment below if you prefer the full HuggingFace datasets library:
  # datasets>=2.18.0

  # Optional: LLM-based User Simulator (activate with USE_LLM_SIMULATOR=1)
+ # Requires GPU with ≥16 GB VRAM
  # transformers>=4.40.0
  # bitsandbytes>=0.43.0
  # accelerate>=0.29.0

+ # Optional: true CLIP image+text scoring for hard_assembly grader
  # open-clip-torch>=2.24.0

  # Serving / inference client