Spaces:
Sleeping
Sleeping
Commit ·
f69e608
0
Parent(s):
feat: Complete RetailMind overhaul — hybrid retrieval, EWMA drift detection, self-healing adapter, premium UI, tests & CI
Browse files- Rewrote catalog: 200 curated products with unique descriptions, materials, ratings, tags
- Hybrid retrieval: price-parsing + category detection + semantic re-ranking
- EWMA drift detector: smoothed concept tracking with multiple anchor phrases
- Rich adaptation rules with detailed self-healing explanations
- LLM: anti-hallucination prompt engineering with structured context injection
- Premium Gradio UI: aurora header, info callouts, score badges, star ratings
- Added pytest suite (catalog, drift, retrieval, adaptation)
- Added GitHub Actions CI (lint + test on Python 3.10-3.12)
- Recruiter-grade README with architecture diagram, technical decisions, demo walkthrough
- Security: .gitignore for secrets, .env.example for onboarding
- .env.example +6 -0
- .github/workflows/ci.yml +48 -0
- .gitignore +40 -0
- README.md +197 -0
- app.py +468 -0
- modules/__init__.py +1 -0
- modules/adaptation.py +141 -0
- modules/data_simulation.py +318 -0
- modules/drift.py +153 -0
- modules/llm.py +95 -0
- modules/retrieval.py +150 -0
- requirements.txt +8 -0
- tests/__init__.py +1 -0
- tests/test_adaptation.py +51 -0
- tests/test_catalog.py +50 -0
- tests/test_drift.py +59 -0
- tests/test_retrieval.py +60 -0
.env.example
ADDED
|
@@ -0,0 +1,6 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# ── Environment Variables ──────────────────────────────────────
|
| 2 |
+
# Copy this file to `.env` and fill in your values.
|
| 3 |
+
|
| 4 |
+
# (Optional) Hugging Face API token — only needed if using gated models.
|
| 5 |
+
# The default model (Qwen2.5-0.5B-Instruct) does NOT require a token.
|
| 6 |
+
HF_TOKEN=hf_your_token_here
|
.github/workflows/ci.yml
ADDED
|
@@ -0,0 +1,48 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
name: CI
|
| 2 |
+
|
| 3 |
+
on:
|
| 4 |
+
push:
|
| 5 |
+
branches: [main]
|
| 6 |
+
pull_request:
|
| 7 |
+
branches: [main]
|
| 8 |
+
|
| 9 |
+
jobs:
|
| 10 |
+
test:
|
| 11 |
+
runs-on: ubuntu-latest
|
| 12 |
+
strategy:
|
| 13 |
+
matrix:
|
| 14 |
+
python-version: ["3.10", "3.11", "3.12"]
|
| 15 |
+
|
| 16 |
+
steps:
|
| 17 |
+
- uses: actions/checkout@v4
|
| 18 |
+
|
| 19 |
+
- name: Set up Python ${{ matrix.python-version }}
|
| 20 |
+
uses: actions/setup-python@v5
|
| 21 |
+
with:
|
| 22 |
+
python-version: ${{ matrix.python-version }}
|
| 23 |
+
cache: "pip"
|
| 24 |
+
|
| 25 |
+
- name: Install dependencies
|
| 26 |
+
run: |
|
| 27 |
+
python -m pip install --upgrade pip
|
| 28 |
+
pip install -r requirements.txt
|
| 29 |
+
pip install pytest
|
| 30 |
+
|
| 31 |
+
- name: Run tests
|
| 32 |
+
run: pytest tests/ -v --tb=short
|
| 33 |
+
|
| 34 |
+
lint:
|
| 35 |
+
runs-on: ubuntu-latest
|
| 36 |
+
steps:
|
| 37 |
+
- uses: actions/checkout@v4
|
| 38 |
+
|
| 39 |
+
- name: Set up Python
|
| 40 |
+
uses: actions/setup-python@v5
|
| 41 |
+
with:
|
| 42 |
+
python-version: "3.11"
|
| 43 |
+
|
| 44 |
+
- name: Install ruff
|
| 45 |
+
run: pip install ruff
|
| 46 |
+
|
| 47 |
+
- name: Lint
|
| 48 |
+
run: ruff check . --select E,F,W --ignore E501
|
.gitignore
ADDED
|
@@ -0,0 +1,40 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# ── Secrets & tokens ───────────────────────────────────────────
|
| 2 |
+
hf_token
|
| 3 |
+
gh_token
|
| 4 |
+
.env
|
| 5 |
+
*.key
|
| 6 |
+
|
| 7 |
+
# ── Python ─────────────────────────────────────────────────────
|
| 8 |
+
__pycache__/
|
| 9 |
+
*.py[cod]
|
| 10 |
+
*$py.class
|
| 11 |
+
*.egg-info/
|
| 12 |
+
dist/
|
| 13 |
+
build/
|
| 14 |
+
*.egg
|
| 15 |
+
|
| 16 |
+
# ── Virtual environments ──────────────────────────────────────
|
| 17 |
+
venv/
|
| 18 |
+
.venv/
|
| 19 |
+
env/
|
| 20 |
+
|
| 21 |
+
# ── IDE / Editor ──────────────────────────────────────────────
|
| 22 |
+
.vscode/
|
| 23 |
+
.idea/
|
| 24 |
+
*.swp
|
| 25 |
+
*.swo
|
| 26 |
+
*~
|
| 27 |
+
|
| 28 |
+
# ── Gradio ────────────────────────────────────────────────────
|
| 29 |
+
.gradio/
|
| 30 |
+
flagged/
|
| 31 |
+
|
| 32 |
+
# ── OS ────────────────────────────────────────────────────────
|
| 33 |
+
.DS_Store
|
| 34 |
+
Thumbs.db
|
| 35 |
+
|
| 36 |
+
# ── Models (don't push multi-GB weights) ─────────────────────
|
| 37 |
+
*.bin
|
| 38 |
+
*.safetensors
|
| 39 |
+
*.gguf
|
| 40 |
+
models/
|
README.md
ADDED
|
@@ -0,0 +1,197 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
<div align="center">
|
| 2 |
+
|
| 3 |
+
# 🧠 RetailMind
|
| 4 |
+
|
| 5 |
+
### Self-Healing LLM for Store Intelligence
|
| 6 |
+
|
| 7 |
+
[](https://github.com/hodfa840/-RetailMind-Self-Healing-LLM-for-Store-Intelligence/actions)
|
| 8 |
+
[](https://python.org)
|
| 9 |
+
[](https://gradio.app)
|
| 10 |
+
[](LICENSE)
|
| 11 |
+
|
| 12 |
+
**An autonomous e-commerce AI that detects semantic drift in user intent and self-heals its own behavior in real time — no human in the loop.**
|
| 13 |
+
|
| 14 |
+
[Live Demo](#-quick-start) · [Architecture](#-architecture) · [How It Works](#-how-the-self-healing-loop-works) · [Technical Decisions](#-technical-decisions)
|
| 15 |
+
|
| 16 |
+
</div>
|
| 17 |
+
|
| 18 |
+
---
|
| 19 |
+
|
| 20 |
+
## 🎯 What This Project Demonstrates
|
| 21 |
+
|
| 22 |
+
| Skill | Implementation |
|
| 23 |
+
|-------|---------------|
|
| 24 |
+
| **MLOps / Observability** | Real-time EWMA-based drift detection with live telemetry dashboard |
|
| 25 |
+
| **RAG / Information Retrieval** | Hybrid retrieval: metadata pre-filtering (price, category) + dense semantic re-ranking |
|
| 26 |
+
| **Prompt Engineering** | Anti-hallucination grounding, dynamic prompt injection based on detected drift |
|
| 27 |
+
| **Self-Healing Systems** | Autonomous prompt rewriting when intent distribution shifts — zero human intervention |
|
| 28 |
+
| **LLM Integration** | Local Qwen2.5-0.5B inference on CPU — no API keys, no GPU, fully offline-capable |
|
| 29 |
+
| **Software Engineering** | Type hints, docstrings, logging, pytest suite, CI/CD, modular architecture |
|
| 30 |
+
|
| 31 |
+
---
|
| 32 |
+
|
| 33 |
+
## ⚡ Architecture
|
| 34 |
+
|
| 35 |
+
```mermaid
|
| 36 |
+
graph LR
|
| 37 |
+
A["🛒 User Query"] --> B["📊 Drift Detector<br/><i>EWMA Semantic Analysis</i>"]
|
| 38 |
+
A --> C["🔍 Hybrid Retriever<br/><i>Price Filter + Dense Search</i>"]
|
| 39 |
+
B --> D["🔧 Self-Healing Adapter<br/><i>Dynamic Prompt Mutation</i>"]
|
| 40 |
+
C --> E["🤖 Local LLM<br/><i>Qwen2.5-0.5B · CPU</i>"]
|
| 41 |
+
D --> E
|
| 42 |
+
E --> F["💬 Grounded Response"]
|
| 43 |
+
B --> G["📈 Telemetry Dashboard<br/><i>Live EWMA Charts</i>"]
|
| 44 |
+
```
|
| 45 |
+
|
| 46 |
+
### Module Breakdown
|
| 47 |
+
|
| 48 |
+
```
|
| 49 |
+
RetailMind/
|
| 50 |
+
├── app.py # Gradio UI — 3-panel dashboard
|
| 51 |
+
├── modules/
|
| 52 |
+
│ ├── data_simulation.py # 200 curated products with rich metadata
|
| 53 |
+
│ ├── retrieval.py # Hybrid retriever (price-filter → semantic re-rank)
|
| 54 |
+
│ ├── drift.py # EWMA-based semantic drift detector
|
| 55 |
+
│ ├── adaptation.py # Self-healing prompt adapter
|
| 56 |
+
│ └── llm.py # Local Qwen2.5-0.5B inference engine
|
| 57 |
+
├── tests/ # pytest suite (catalog, retrieval, drift, adaptation)
|
| 58 |
+
├── .github/workflows/ci.yml # CI pipeline (lint + test on Python 3.10–3.12)
|
| 59 |
+
└── requirements.txt
|
| 60 |
+
```
|
| 61 |
+
|
| 62 |
+
---
|
| 63 |
+
|
| 64 |
+
## 🔄 How the Self-Healing Loop Works
|
| 65 |
+
|
| 66 |
+
The system continuously monitors the **semantic similarity** between incoming queries and predefined concept anchors using an **Exponentially Weighted Moving Average (EWMA)**.
|
| 67 |
+
|
| 68 |
+
```
|
| 69 |
+
Normal Mode Drift Detected!
|
| 70 |
+
┌──────────┐ ┌──────────────┐
|
| 71 |
+
User asks about │ Balanced │ EWMA crosses 0.38 → │ Auto-Inject │
|
| 72 |
+
random products → │ Prompt │ ──────────────────────── │ New Rules │
|
| 73 |
+
└──────────┘ └──────────────┘
|
| 74 |
+
│
|
| 75 |
+
┌──────────┐ ▼
|
| 76 |
+
│ LLM now │ ◄─── Prompt mutated to prioritize
|
| 77 |
+
│ focuses │ price / season / sustainability
|
| 78 |
+
│ on drift │ based on detected pattern
|
| 79 |
+
└──────────┘
|
| 80 |
+
```
|
| 81 |
+
|
| 82 |
+
### Concept Anchors
|
| 83 |
+
|
| 84 |
+
| Concept | Trigger Keywords | Adaptation |
|
| 85 |
+
|---------|-----------------|------------|
|
| 86 |
+
| 💰 **Price Sensitive** | cheap, budget, under $X, deal | Prioritize lowest-price items, highlight savings |
|
| 87 |
+
| ☀️ **Summer Shift** | beach, lightweight, UV, hot weather | Surface breathable/outdoor products, suppress winter |
|
| 88 |
+
| 🌿 **Eco Trend** | sustainable, recycled, organic, plant-based | Lead with eco-credentials, cite certifications |
|
| 89 |
+
|
| 90 |
+
**Key insight:** The system doesn't just match keywords — it uses **semantic similarity** via sentence embeddings. So even a query like *"I care about the planet"* (no eco keywords) will still trigger the eco adaptation because it's semantically close to the concept anchor.
|
| 91 |
+
|
| 92 |
+
---
|
| 93 |
+
|
| 94 |
+
## 🔍 Hybrid Retrieval Deep Dive
|
| 95 |
+
|
| 96 |
+
Traditional RAG uses pure semantic similarity, which fails on structured queries like *"bags under $25"*. RetailMind combines:
|
| 97 |
+
|
| 98 |
+
1. **Price Extraction** — Regex-based NLU parses price ceilings from natural language (`"under $50"`, `"budget of $30"`, `"cheapest"`)
|
| 99 |
+
2. **Category Detection** — Maps query terms to catalog categories (`"eco-friendly"` → eco, `"gym"` → sports)
|
| 100 |
+
3. **Pre-Filtering** — Removes products that violate hard constraints *before* embedding search
|
| 101 |
+
4. **Semantic Re-Ranking** — Cosine similarity on SentenceTransformer embeddings ranks survivors
|
| 102 |
+
|
| 103 |
+
```python
|
| 104 |
+
# Example: "eco-friendly bag under $30"
|
| 105 |
+
# Step 1: price_cap = 30.0
|
| 106 |
+
# Step 2: category = "eco-friendly"
|
| 107 |
+
# Step 3: 200 products → ~8 candidates (eco + under $30)
|
| 108 |
+
# Step 4: Rank 8 candidates by semantic similarity → top 4
|
| 109 |
+
```
|
| 110 |
+
|
| 111 |
+
---
|
| 112 |
+
|
| 113 |
+
## 🚀 Quick Start
|
| 114 |
+
|
| 115 |
+
### Prerequisites
|
| 116 |
+
- Python 3.10+
|
| 117 |
+
- ~2 GB disk space (for model weights on first run)
|
| 118 |
+
|
| 119 |
+
### Installation
|
| 120 |
+
|
| 121 |
+
```bash
|
| 122 |
+
git clone https://github.com/hodfa840/-RetailMind-Self-Healing-LLM-for-Store-Intelligence.git
|
| 123 |
+
cd -RetailMind-Self-Healing-LLM-for-Store-Intelligence
|
| 124 |
+
pip install -r requirements.txt
|
| 125 |
+
```
|
| 126 |
+
|
| 127 |
+
### Run
|
| 128 |
+
|
| 129 |
+
```bash
|
| 130 |
+
python app.py
|
| 131 |
+
```
|
| 132 |
+
|
| 133 |
+
The app launches at `http://localhost:7860` with a public share link.
|
| 134 |
+
|
| 135 |
+
### Run Tests
|
| 136 |
+
|
| 137 |
+
```bash
|
| 138 |
+
pip install pytest
|
| 139 |
+
pytest tests/ -v
|
| 140 |
+
```
|
| 141 |
+
|
| 142 |
+
---
|
| 143 |
+
|
| 144 |
+
## 🧪 Demo Walkthrough
|
| 145 |
+
|
| 146 |
+
To see the self-healing system in action:
|
| 147 |
+
|
| 148 |
+
1. **Phase 1 (Normal)** — Ask general product questions. The system responds in balanced mode.
|
| 149 |
+
2. **Phase 2 (Black Friday)** — Click budget-oriented queries. Watch the drift chart's gold line spike above the threshold. The system auto-injects price-prioritization rules.
|
| 150 |
+
3. **Phase 3 (Summer)** — Switch to summer queries. The cyan line rises, and the system pivots to warm-weather products — *without being told to*.
|
| 151 |
+
4. **Phase 4 (Eco)** — Ask about sustainability. The green line triggers, and the system starts citing certifications and materials.
|
| 152 |
+
|
| 153 |
+
> The telemetry panel on the right shows exactly what's happening under the hood — which drift was detected, what prompt rules were injected, and why.
|
| 154 |
+
|
| 155 |
+
---
|
| 156 |
+
|
| 157 |
+
## 🧭 Technical Decisions
|
| 158 |
+
|
| 159 |
+
| Decision | Rationale |
|
| 160 |
+
|----------|-----------|
|
| 161 |
+
| **Qwen2.5-0.5B on CPU** | Eliminates API dependency, runs on any machine, no token needed. Trades quality for reliability — acceptable since grounding handles accuracy. |
|
| 162 |
+
| **EWMA over raw scores** | Single-query similarity is noisy. EWMA smooths the signal so the system doesn't flip between modes on every query. α=0.35 balances reactivity with stability. |
|
| 163 |
+
| **Hybrid retrieval over pure semantic** | Semantic search alone can't handle price constraints. A $200 jacket and a $20 hat may both be semantically relevant to "winter gear under $25" — only the pre-filter catches this. |
|
| 164 |
+
| **SentenceTransformers (all-MiniLM-L6-v2)** | 80MB model, runs on CPU in <50ms per query. Good enough for 200-product catalog. Would swap to a larger model for production scale. |
|
| 165 |
+
| **200 curated products over 1,500 generated** | Quality embeddings require quality descriptions. 200 hand-authored products with unique specs outperform 1,500 template-generated items where retrieval can't distinguish between them. |
|
| 166 |
+
| **Prompt injection over fine-tuning** | Fine-tuning a 0.5B model per drift state is impractical. Dynamic prompt injection achieves the same behavioral shift with zero training cost and instant reversibility. |
|
| 167 |
+
|
| 168 |
+
---
|
| 169 |
+
|
| 170 |
+
## 🔮 Future Roadmap
|
| 171 |
+
|
| 172 |
+
- [ ] **Multi-turn memory** — Track user preferences across conversation turns
|
| 173 |
+
- [ ] **A/B testing framework** — Compare adapted vs. baseline responses
|
| 174 |
+
- [ ] **Drift alerting** — Webhook notifications when drift exceeds critical thresholds
|
| 175 |
+
- [ ] **Vector database** — Migrate from in-memory NumPy to FAISS/Qdrant for scale
|
| 176 |
+
- [ ] **User feedback loop** — Incorporate thumbs-up/down into drift calibration
|
| 177 |
+
|
| 178 |
+
---
|
| 179 |
+
|
| 180 |
+
## 🛠️ Tech Stack
|
| 181 |
+
|
| 182 |
+
| Component | Technology |
|
| 183 |
+
|-----------|-----------|
|
| 184 |
+
| UI Framework | Gradio 4.x |
|
| 185 |
+
| LLM | Qwen/Qwen2.5-0.5B-Instruct (local, CPU) |
|
| 186 |
+
| Embeddings | SentenceTransformers (all-MiniLM-L6-v2) |
|
| 187 |
+
| Retrieval | Hybrid (NumPy cosine + metadata pre-filter) |
|
| 188 |
+
| Charting | Plotly |
|
| 189 |
+
| Testing | pytest |
|
| 190 |
+
| CI/CD | GitHub Actions |
|
| 191 |
+
| Language | Python 3.10+ with type hints |
|
| 192 |
+
|
| 193 |
+
---
|
| 194 |
+
|
| 195 |
+
<div align="center">
|
| 196 |
+
<sub>Built by <a href="https://github.com/hodfa840">hodfa840</a> · Linköping University</sub>
|
| 197 |
+
</div>
|
app.py
ADDED
|
@@ -0,0 +1,468 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""
|
| 2 |
+
RetailMind — Self-Healing LLM for Store Intelligence
|
| 3 |
+
|
| 4 |
+
Gradio application showcasing real-time semantic drift detection,
|
| 5 |
+
autonomous prompt adaptation, and hybrid RAG retrieval.
|
| 6 |
+
"""
|
| 7 |
+
|
| 8 |
+
import logging
|
| 9 |
+
import gradio as gr
|
| 10 |
+
import plotly.graph_objects as go
|
| 11 |
+
from modules.data_simulation import generate_catalog, get_scenarios
|
| 12 |
+
from modules.retrieval import HybridRetriever
|
| 13 |
+
from modules.drift import DriftDetector
|
| 14 |
+
from modules.adaptation import Adapter
|
| 15 |
+
from modules.llm import generate_response
|
| 16 |
+
|
| 17 |
+
# ── Logging ────────────────────────────────────────────────────────────────
|
| 18 |
+
logging.basicConfig(
|
| 19 |
+
level=logging.INFO,
|
| 20 |
+
format="%(asctime)s │ %(name)-24s │ %(levelname)-5s │ %(message)s",
|
| 21 |
+
datefmt="%H:%M:%S",
|
| 22 |
+
)
|
| 23 |
+
logger = logging.getLogger("retailmind")
|
| 24 |
+
|
| 25 |
+
# ── Initialize components ─────────────────────────────────────────────────
|
| 26 |
+
logger.info("Bootstrapping RetailMind…")
|
| 27 |
+
catalog = generate_catalog()
|
| 28 |
+
retriever = HybridRetriever(catalog)
|
| 29 |
+
detector = DriftDetector()
|
| 30 |
+
adapter = Adapter()
|
| 31 |
+
scenarios = get_scenarios()
|
| 32 |
+
logger.info("Ready — %d products indexed.", len(catalog))
|
| 33 |
+
|
| 34 |
+
|
| 35 |
+
# ── Helper: Image mapping ─────────────────────────────────────────────────
|
| 36 |
+
IMAGE_MAP = {
|
| 37 |
+
"Parka": "https://images.unsplash.com/photo-1544923246-77307dd270b5?w=400&h=300&fit=crop",
|
| 38 |
+
"Sweater": "https://images.unsplash.com/photo-1610652492500-dea0624af6ee?w=400&h=300&fit=crop",
|
| 39 |
+
"Gloves": "https://images.unsplash.com/photo-1551538827-9c037cb4f32a?w=400&h=300&fit=crop",
|
| 40 |
+
"Boots": "https://images.unsplash.com/photo-1608256246200-53e635b5b65f?w=400&h=300&fit=crop",
|
| 41 |
+
"Beanie": "https://images.unsplash.com/photo-1576871337622-98d48d1cf531?w=400&h=300&fit=crop",
|
| 42 |
+
"Fleece": "https://images.unsplash.com/photo-1591047139829-d91aecb6caea?w=400&h=300&fit=crop",
|
| 43 |
+
"Base Layer": "https://images.unsplash.com/photo-1489987707025-afc232f7ea0f?w=400&h=300&fit=crop",
|
| 44 |
+
"Vest": "https://images.unsplash.com/photo-1591047139829-d91aecb6caea?w=400&h=300&fit=crop",
|
| 45 |
+
"Sneakers": "https://images.unsplash.com/photo-1542291026-7eec264c27ff?w=400&h=300&fit=crop",
|
| 46 |
+
"Shorts": "https://images.unsplash.com/photo-1591195853828-11db59a44f6b?w=400&h=300&fit=crop",
|
| 47 |
+
"Sunglasses": "https://images.unsplash.com/photo-1511499767150-a48a237f0083?w=400&h=300&fit=crop",
|
| 48 |
+
"Linen": "https://images.unsplash.com/photo-1596755094514-f87e34085b2c?w=400&h=300&fit=crop",
|
| 49 |
+
"Sandals": "https://images.unsplash.com/photo-1603487742131-4160ec999306?w=400&h=300&fit=crop",
|
| 50 |
+
"Tank": "https://images.unsplash.com/photo-1521572163474-6864f9cf17ab?w=400&h=300&fit=crop",
|
| 51 |
+
"Hat": "https://images.unsplash.com/photo-1521369909029-2afed882baee?w=400&h=300&fit=crop",
|
| 52 |
+
"Water Shoes": "https://images.unsplash.com/photo-1542291026-7eec264c27ff?w=400&h=300&fit=crop",
|
| 53 |
+
"Backpack": "https://images.unsplash.com/photo-1553062407-98eeb64c6a62?w=400&h=300&fit=crop",
|
| 54 |
+
"Bottle": "https://images.unsplash.com/photo-1602143407151-7111542de6e8?w=400&h=300&fit=crop",
|
| 55 |
+
"Tee": "https://images.unsplash.com/photo-1521572163474-6864f9cf17ab?w=400&h=300&fit=crop",
|
| 56 |
+
"Tote": "https://images.unsplash.com/photo-1622560480605-d83c853bc5c3?w=400&h=300&fit=crop",
|
| 57 |
+
"Shoes": "https://images.unsplash.com/photo-1542291026-7eec264c27ff?w=400&h=300&fit=crop",
|
| 58 |
+
"Jacket": "https://images.unsplash.com/photo-1551028719-00167b16eac5?w=400&h=300&fit=crop",
|
| 59 |
+
"Watch": "https://images.unsplash.com/photo-1523275335684-37898b6baf30?w=400&h=300&fit=crop",
|
| 60 |
+
"Mat": "https://images.unsplash.com/photo-1544367567-0f2fcb009e0b?w=400&h=300&fit=crop",
|
| 61 |
+
"Headphones": "https://images.unsplash.com/photo-1505740420928-5e560c06d30e?w=400&h=300&fit=crop",
|
| 62 |
+
"Tracker": "https://images.unsplash.com/photo-1557438159-51eec7a6c9e8?w=400&h=300&fit=crop",
|
| 63 |
+
"Earbuds": "https://images.unsplash.com/photo-1590658268037-6bf12f032f55?w=400&h=300&fit=crop",
|
| 64 |
+
"Charger": "https://images.unsplash.com/photo-1609091839311-d5365f9ff1c5?w=400&h=300&fit=crop",
|
| 65 |
+
"Speaker": "https://images.unsplash.com/photo-1608043152269-423dbba4e7e1?w=400&h=300&fit=crop",
|
| 66 |
+
"Lamp": "https://images.unsplash.com/photo-1507473885765-e6ed057ab6fe?w=400&h=300&fit=crop",
|
| 67 |
+
"Power Bank": "https://images.unsplash.com/photo-1609091839311-d5365f9ff1c5?w=400&h=300&fit=crop",
|
| 68 |
+
"Mug": "https://images.unsplash.com/photo-1514228742587-6b1558fcca3d?w=400&h=300&fit=crop",
|
| 69 |
+
"Weekender": "https://images.unsplash.com/photo-1590874103328-eac38a683ce7?w=400&h=300&fit=crop",
|
| 70 |
+
"Overcoat": "https://images.unsplash.com/photo-1544923246-77307dd270b5?w=400&h=300&fit=crop",
|
| 71 |
+
"Wallet": "https://images.unsplash.com/photo-1627123424574-724758594e93?w=400&h=300&fit=crop",
|
| 72 |
+
"Belt": "https://images.unsplash.com/photo-1553062407-98eeb64c6a62?w=400&h=300&fit=crop",
|
| 73 |
+
"Candle": "https://images.unsplash.com/photo-1602607616777-b8fb tried?w=400&h=300&fit=crop",
|
| 74 |
+
"Blanket": "https://images.unsplash.com/photo-1555041469-a586c61ea9bc?w=400&h=300&fit=crop",
|
| 75 |
+
"Clock": "https://images.unsplash.com/photo-1563861826100-9cb868fdbe1c?w=400&h=300&fit=crop",
|
| 76 |
+
"Towel": "https://images.unsplash.com/photo-1583845112203-29329902332e?w=400&h=300&fit=crop",
|
| 77 |
+
"Hoodie": "https://images.unsplash.com/photo-1556821840-3a63f95609a7?w=400&h=300&fit=crop",
|
| 78 |
+
"Chino": "https://images.unsplash.com/photo-1473966968600-fa801b869a1a?w=400&h=300&fit=crop",
|
| 79 |
+
"Crossbody": "https://images.unsplash.com/photo-1590874103328-eac38a683ce7?w=400&h=300&fit=crop",
|
| 80 |
+
"Socks": "https://images.unsplash.com/photo-1586350977771-b3b0abd50c82?w=400&h=300&fit=crop",
|
| 81 |
+
"Basketball": "https://images.unsplash.com/photo-1546519638-68e109498ffc?w=400&h=300&fit=crop",
|
| 82 |
+
"Jersey": "https://images.unsplash.com/photo-1565299624946-b28f40a0ae38?w=400&h=300&fit=crop",
|
| 83 |
+
"Cushion": "https://images.unsplash.com/photo-1555041469-a586c61ea9bc?w=400&h=300&fit=crop",
|
| 84 |
+
"Planter": "https://images.unsplash.com/photo-1459411552884-841db9b3cc2a?w=400&h=300&fit=crop",
|
| 85 |
+
"Organizer": "https://images.unsplash.com/photo-1507473885765-e6ed057ab6fe?w=400&h=300&fit=crop",
|
| 86 |
+
"Pour-Over": "https://images.unsplash.com/photo-1495474472287-4d71bcdd2085?w=400&h=300&fit=crop",
|
| 87 |
+
}
|
| 88 |
+
|
| 89 |
+
DEFAULT_IMG = "https://images.unsplash.com/photo-1472851294608-062f124dcb02?w=400&h=300&fit=crop"
|
| 90 |
+
|
| 91 |
+
|
| 92 |
+
def _get_product_image(title: str) -> str:
|
| 93 |
+
"""Map product title → curated Unsplash photo."""
|
| 94 |
+
for key, url in IMAGE_MAP.items():
|
| 95 |
+
if key.lower() in title.lower():
|
| 96 |
+
return url
|
| 97 |
+
return DEFAULT_IMG
|
| 98 |
+
|
| 99 |
+
|
| 100 |
+
# ── Plotly drift chart ────────────────────────────────────────────────────
|
| 101 |
+
|
| 102 |
+
def _plot_drift() -> go.Figure:
|
| 103 |
+
series = detector.get_history_series()
|
| 104 |
+
ewma = detector.get_ewma_scores()
|
| 105 |
+
fig = go.Figure()
|
| 106 |
+
|
| 107 |
+
colors = {"price_sensitive": "#f59e0b", "summer_shift": "#06b6d4", "eco_trend": "#10b981"}
|
| 108 |
+
labels = {"price_sensitive": "Price Sensitivity", "summer_shift": "Summer Shift", "eco_trend": "Eco Trend"}
|
| 109 |
+
|
| 110 |
+
for concept in series:
|
| 111 |
+
data = series[concept][-30:] # last 30 data points
|
| 112 |
+
fig.add_trace(go.Scatter(
|
| 113 |
+
y=data,
|
| 114 |
+
mode="lines",
|
| 115 |
+
name=labels.get(concept, concept),
|
| 116 |
+
line=dict(color=colors.get(concept, "#fff"), width=2.5, shape="spline"),
|
| 117 |
+
fill="tozeroy",
|
| 118 |
+
fillcolor=colors.get(concept, "#fff").replace(")", ", 0.08)").replace("rgb", "rgba") if "rgb" in colors.get(concept, "") else f"rgba(255,255,255,0.05)",
|
| 119 |
+
))
|
| 120 |
+
|
| 121 |
+
# Threshold line
|
| 122 |
+
fig.add_hline(y=0.38, line_dash="dot", line_color="rgba(255,255,255,0.3)",
|
| 123 |
+
annotation_text="Threshold", annotation_font_color="rgba(255,255,255,0.4)")
|
| 124 |
+
|
| 125 |
+
fig.update_layout(
|
| 126 |
+
height=240,
|
| 127 |
+
margin=dict(l=0, r=0, t=10, b=0),
|
| 128 |
+
plot_bgcolor="rgba(0,0,0,0)",
|
| 129 |
+
paper_bgcolor="rgba(0,0,0,0)",
|
| 130 |
+
font=dict(color="#94a3b8", size=11),
|
| 131 |
+
legend=dict(orientation="h", yanchor="bottom", y=1.02, xanchor="center", x=0.5,
|
| 132 |
+
font=dict(size=10)),
|
| 133 |
+
xaxis=dict(showgrid=False, showticklabels=False),
|
| 134 |
+
yaxis=dict(showgrid=True, gridwidth=1, gridcolor="rgba(255,255,255,0.06)",
|
| 135 |
+
range=[0, 0.8]),
|
| 136 |
+
)
|
| 137 |
+
return fig
|
| 138 |
+
|
| 139 |
+
|
| 140 |
+
# ── Product cards HTML ────────────────────────────────────────────────────
|
| 141 |
+
|
| 142 |
+
def _build_product_html(retrieved: list[dict]) -> str:
|
| 143 |
+
if not retrieved:
|
| 144 |
+
return _empty_catalog_html()
|
| 145 |
+
|
| 146 |
+
cards = []
|
| 147 |
+
for r in retrieved:
|
| 148 |
+
p = r["product"]
|
| 149 |
+
score = r["score"]
|
| 150 |
+
img = _get_product_image(p["title"])
|
| 151 |
+
stars_full = int(p.get("rating", 4))
|
| 152 |
+
stars_html = "★" * stars_full + "☆" * (5 - stars_full)
|
| 153 |
+
reviews = p.get("reviews", 0)
|
| 154 |
+
score_pct = int(score * 100)
|
| 155 |
+
tags_html = "".join(
|
| 156 |
+
f"<span style='background:rgba(99,102,241,0.15); color:#818cf8; padding:2px 8px; "
|
| 157 |
+
f"border-radius:20px; font-size:10px; margin-right:4px;'>{t}</span>"
|
| 158 |
+
for t in p.get("tags", [])[:3]
|
| 159 |
+
)
|
| 160 |
+
|
| 161 |
+
cards.append(f"""
|
| 162 |
+
<div style='background:rgba(255,255,255,0.03); border:1px solid rgba(255,255,255,0.08);
|
| 163 |
+
border-radius:16px; overflow:hidden; transition:all 0.3s ease;
|
| 164 |
+
box-shadow:0 4px 20px rgba(0,0,0,0.3);'>
|
| 165 |
+
<div style='position:relative;'>
|
| 166 |
+
<img src='{img}' style='width:100%; height:150px; object-fit:cover;
|
| 167 |
+
border-bottom:1px solid rgba(255,255,255,0.06);'
|
| 168 |
+
onerror="this.src='{DEFAULT_IMG}'" />
|
| 169 |
+
<div style='position:absolute; top:8px; right:8px; background:rgba(0,0,0,0.75);
|
| 170 |
+
color:#f8fafc; padding:3px 10px; border-radius:20px; font-size:13px;
|
| 171 |
+
font-weight:700; backdrop-filter:blur(8px);
|
| 172 |
+
border:1px solid rgba(255,255,255,0.15);'>
|
| 173 |
+
${p['price']:.2f}
|
| 174 |
+
</div>
|
| 175 |
+
<div style='position:absolute; top:8px; left:8px; background:rgba(99,102,241,0.85);
|
| 176 |
+
color:white; padding:2px 8px; border-radius:20px; font-size:10px;
|
| 177 |
+
font-weight:600; letter-spacing:0.5px;'>
|
| 178 |
+
{score_pct}% match
|
| 179 |
+
</div>
|
| 180 |
+
</div>
|
| 181 |
+
<div style='padding:14px;'>
|
| 182 |
+
<div style='color:#f1f5f9; font-size:14px; font-weight:600;
|
| 183 |
+
margin-bottom:4px; line-height:1.3;'>{p['title']}</div>
|
| 184 |
+
<div style='display:flex; align-items:center; gap:6px; margin-bottom:6px;'>
|
| 185 |
+
<span style='color:#fbbf24; font-size:12px; letter-spacing:1px;'>{stars_html}</span>
|
| 186 |
+
<span style='color:#64748b; font-size:11px;'>({reviews:,})</span>
|
| 187 |
+
</div>
|
| 188 |
+
<div style='margin-bottom:8px;'>{tags_html}</div>
|
| 189 |
+
<p style='color:#94a3b8; font-size:12px; line-height:1.4; margin:0;'>
|
| 190 |
+
{p['desc'][:100]}…
|
| 191 |
+
</p>
|
| 192 |
+
</div>
|
| 193 |
+
</div>
|
| 194 |
+
""")
|
| 195 |
+
|
| 196 |
+
return f"""
|
| 197 |
+
<div style='display:grid; grid-template-columns:1fr 1fr; gap:16px; padding:8px;'>
|
| 198 |
+
{''.join(cards)}
|
| 199 |
+
</div>
|
| 200 |
+
"""
|
| 201 |
+
|
| 202 |
+
|
| 203 |
+
def _empty_catalog_html() -> str:
|
| 204 |
+
return """
|
| 205 |
+
<div style='padding:60px 30px; text-align:center; color:#475569;
|
| 206 |
+
border:2px dashed rgba(255,255,255,0.08); border-radius:20px; margin:16px;'>
|
| 207 |
+
<div style='font-size:2.5rem; margin-bottom:12px;'>🛍️</div>
|
| 208 |
+
<div style='font-size:1.1rem; font-weight:500; color:#64748b;'>Awaiting your query…</div>
|
| 209 |
+
<div style='font-size:0.85rem; color:#475569; margin-top:6px;'>
|
| 210 |
+
Try a scenario below or type your own question
|
| 211 |
+
</div>
|
| 212 |
+
</div>
|
| 213 |
+
"""
|
| 214 |
+
|
| 215 |
+
|
| 216 |
+
# ── Main query handler ────────────────────────────────────────────────────
|
| 217 |
+
|
| 218 |
+
def process_query(query: str, history: list):
|
| 219 |
+
if not query or not query.strip():
|
| 220 |
+
return "", history, _plot_drift(), "", "—", _empty_catalog_html()
|
| 221 |
+
|
| 222 |
+
logger.info("Processing query: %r", query)
|
| 223 |
+
|
| 224 |
+
# 1. Measure drift
|
| 225 |
+
drift_state, scores = detector.analyze_drift(query)
|
| 226 |
+
|
| 227 |
+
# 2. Retrieve products (hybrid: price-filter + semantic)
|
| 228 |
+
retrieved = retriever.search(query, top_k=4)
|
| 229 |
+
|
| 230 |
+
# 3. Adapt system prompt
|
| 231 |
+
system_prompt = adapter.adapt_prompt(drift_state)
|
| 232 |
+
explanation = adapter.get_explanation(drift_state)
|
| 233 |
+
label = adapter.get_label(drift_state)
|
| 234 |
+
|
| 235 |
+
# 4. Generate LLM response
|
| 236 |
+
response = generate_response(system_prompt, query, retrieved)
|
| 237 |
+
|
| 238 |
+
history = history or []
|
| 239 |
+
history.append({"role": "user", "content": query})
|
| 240 |
+
history.append({"role": "assistant", "content": response})
|
| 241 |
+
|
| 242 |
+
return "", history, _plot_drift(), explanation, label, _build_product_html(retrieved)
|
| 243 |
+
|
| 244 |
+
|
| 245 |
+
def load_example(example_text: str) -> str:
|
| 246 |
+
return example_text
|
| 247 |
+
|
| 248 |
+
|
| 249 |
+
# ══════════════════════════════════════════════════════════════════════════
|
| 250 |
+
# UI Definition
|
| 251 |
+
# ══════════════════════════════════════════════════════════════════════════
|
| 252 |
+
|
| 253 |
+
css = """
|
| 254 |
+
@import url('https://fonts.googleapis.com/css2?family=Inter:wght@300;400;500;600;700;800&display=swap');
|
| 255 |
+
|
| 256 |
+
body, .gradio-container {
|
| 257 |
+
font-family: 'Inter', system-ui, -apple-system, sans-serif !important;
|
| 258 |
+
background: #0a0f1a !important;
|
| 259 |
+
}
|
| 260 |
+
|
| 261 |
+
/* Header */
|
| 262 |
+
.hero-header {
|
| 263 |
+
text-align: center;
|
| 264 |
+
padding: 2.5rem 2rem 1.5rem;
|
| 265 |
+
background: linear-gradient(135deg, rgba(15,23,42,0.95) 0%, rgba(30,41,59,0.6) 50%, rgba(15,23,42,0.95) 100%);
|
| 266 |
+
border-radius: 24px;
|
| 267 |
+
border: 1px solid rgba(255,255,255,0.06);
|
| 268 |
+
box-shadow: 0 25px 60px rgba(0,0,0,0.5);
|
| 269 |
+
position: relative;
|
| 270 |
+
overflow: hidden;
|
| 271 |
+
margin-bottom: 1.5rem;
|
| 272 |
+
}
|
| 273 |
+
.hero-header::before {
|
| 274 |
+
content: '';
|
| 275 |
+
position: absolute;
|
| 276 |
+
top: -50%;
|
| 277 |
+
left: -50%;
|
| 278 |
+
width: 200%;
|
| 279 |
+
height: 200%;
|
| 280 |
+
background: radial-gradient(circle at 30% 50%, rgba(99,102,241,0.08) 0%, transparent 50%),
|
| 281 |
+
radial-gradient(circle at 70% 50%, rgba(6,182,212,0.06) 0%, transparent 50%);
|
| 282 |
+
animation: aurora 8s ease-in-out infinite alternate;
|
| 283 |
+
}
|
| 284 |
+
@keyframes aurora {
|
| 285 |
+
0% { transform: translate(0, 0) rotate(0deg); }
|
| 286 |
+
100% { transform: translate(-5%, 5%) rotate(3deg); }
|
| 287 |
+
}
|
| 288 |
+
.hero-title {
|
| 289 |
+
font-size: 2.8rem;
|
| 290 |
+
font-weight: 800;
|
| 291 |
+
background: linear-gradient(135deg, #818cf8 0%, #06b6d4 50%, #10b981 100%);
|
| 292 |
+
-webkit-background-clip: text;
|
| 293 |
+
-webkit-text-fill-color: transparent;
|
| 294 |
+
margin: 0;
|
| 295 |
+
position: relative;
|
| 296 |
+
letter-spacing: -0.5px;
|
| 297 |
+
}
|
| 298 |
+
.hero-sub {
|
| 299 |
+
color: #64748b;
|
| 300 |
+
font-size: 0.95rem;
|
| 301 |
+
letter-spacing: 3px;
|
| 302 |
+
text-transform: uppercase;
|
| 303 |
+
font-weight: 500;
|
| 304 |
+
margin-top: 0.5rem;
|
| 305 |
+
position: relative;
|
| 306 |
+
}
|
| 307 |
+
.hero-badges {
|
| 308 |
+
display: flex;
|
| 309 |
+
justify-content: center;
|
| 310 |
+
gap: 12px;
|
| 311 |
+
margin-top: 1rem;
|
| 312 |
+
position: relative;
|
| 313 |
+
flex-wrap: wrap;
|
| 314 |
+
}
|
| 315 |
+
.hero-badge {
|
| 316 |
+
background: rgba(255,255,255,0.04);
|
| 317 |
+
border: 1px solid rgba(255,255,255,0.08);
|
| 318 |
+
color: #94a3b8;
|
| 319 |
+
padding: 4px 14px;
|
| 320 |
+
border-radius: 20px;
|
| 321 |
+
font-size: 0.75rem;
|
| 322 |
+
font-weight: 500;
|
| 323 |
+
letter-spacing: 0.5px;
|
| 324 |
+
}
|
| 325 |
+
|
| 326 |
+
/* Panels */
|
| 327 |
+
.glass-panel {
|
| 328 |
+
background: rgba(15, 23, 42, 0.6) !important;
|
| 329 |
+
border: 1px solid rgba(255,255,255,0.06) !important;
|
| 330 |
+
border-radius: 20px !important;
|
| 331 |
+
backdrop-filter: blur(12px) !important;
|
| 332 |
+
}
|
| 333 |
+
|
| 334 |
+
/* Scenario pills */
|
| 335 |
+
.scenario-row { display: flex; gap: 8px; flex-wrap: wrap; margin-top: 8px; }
|
| 336 |
+
|
| 337 |
+
/* Section headers */
|
| 338 |
+
.panel-header {
|
| 339 |
+
color: #e2e8f0;
|
| 340 |
+
font-size: 1rem;
|
| 341 |
+
font-weight: 600;
|
| 342 |
+
padding: 14px 16px 8px;
|
| 343 |
+
display: flex;
|
| 344 |
+
align-items: center;
|
| 345 |
+
gap: 8px;
|
| 346 |
+
}
|
| 347 |
+
|
| 348 |
+
/* Info box */
|
| 349 |
+
.info-callout {
|
| 350 |
+
background: rgba(99,102,241,0.08);
|
| 351 |
+
border: 1px solid rgba(99,102,241,0.2);
|
| 352 |
+
border-radius: 12px;
|
| 353 |
+
padding: 12px 16px;
|
| 354 |
+
color: #a5b4fc;
|
| 355 |
+
font-size: 0.8rem;
|
| 356 |
+
line-height: 1.5;
|
| 357 |
+
margin: 8px 12px;
|
| 358 |
+
}
|
| 359 |
+
|
| 360 |
+
/* Hide Gradio footer */
|
| 361 |
+
footer { display: none !important; }
|
| 362 |
+
"""
|
| 363 |
+
|
| 364 |
+
with gr.Blocks(css=css, theme=gr.themes.Base(), title="RetailMind — Self-Healing AI") as app:
|
| 365 |
+
|
| 366 |
+
# ── Header ────────────────────────────────────────────────────
|
| 367 |
+
gr.HTML("""
|
| 368 |
+
<div class="hero-header">
|
| 369 |
+
<h1 class="hero-title">RetailMind</h1>
|
| 370 |
+
<p class="hero-sub">Self-Healing LLM · Store Intelligence</p>
|
| 371 |
+
<div class="hero-badges">
|
| 372 |
+
<span class="hero-badge">🧠 Semantic Drift Detection</span>
|
| 373 |
+
<span class="hero-badge">🔄 Autonomous Prompt Healing</span>
|
| 374 |
+
<span class="hero-badge">🔍 Hybrid RAG Retrieval</span>
|
| 375 |
+
<span class="hero-badge">📊 Real-Time Telemetry</span>
|
| 376 |
+
</div>
|
| 377 |
+
</div>
|
| 378 |
+
""")
|
| 379 |
+
|
| 380 |
+
with gr.Row():
|
| 381 |
+
# ── LEFT: Chat Panel ─────────────────────────────────────
|
| 382 |
+
with gr.Column(scale=4, elem_classes=["glass-panel"]):
|
| 383 |
+
gr.HTML("<div class='panel-header'>💬 AI Shopping Assistant</div>")
|
| 384 |
+
chatbot = gr.Chatbot(
|
| 385 |
+
height=420,
|
| 386 |
+
container=False,
|
| 387 |
+
show_copy_button=True,
|
| 388 |
+
placeholder="Ask me about products, deals, or seasonal picks…",
|
| 389 |
+
)
|
| 390 |
+
with gr.Row():
|
| 391 |
+
msg = gr.Textbox(
|
| 392 |
+
placeholder="e.g. Find me eco-friendly running shoes under $120…",
|
| 393 |
+
show_label=False,
|
| 394 |
+
container=False,
|
| 395 |
+
scale=8,
|
| 396 |
+
)
|
| 397 |
+
submit = gr.Button("Search", variant="primary", scale=2)
|
| 398 |
+
|
| 399 |
+
gr.HTML("""
|
| 400 |
+
<div class='info-callout'>
|
| 401 |
+
💡 <b>Demo tip:</b> Click the scenario buttons below in order
|
| 402 |
+
(Phase 1 → 4) to watch the system detect intent drift and
|
| 403 |
+
autonomously heal its behavior in real time.
|
| 404 |
+
</div>
|
| 405 |
+
""")
|
| 406 |
+
|
| 407 |
+
for scenario_name, queries in scenarios.items():
|
| 408 |
+
with gr.Accordion(scenario_name, open=False):
|
| 409 |
+
for q in queries:
|
| 410 |
+
btn = gr.Button(q, size="sm", variant="secondary")
|
| 411 |
+
btn.click(fn=load_example, inputs=btn, outputs=msg)
|
| 412 |
+
|
| 413 |
+
# ── MIDDLE: Product Feed ─────────────────────────────────
|
| 414 |
+
with gr.Column(scale=4, elem_classes=["glass-panel"]):
|
| 415 |
+
gr.HTML("<div class='panel-header'>🛍️ Retrieved Products</div>")
|
| 416 |
+
retrieved_box = gr.HTML(value=_empty_catalog_html())
|
| 417 |
+
|
| 418 |
+
# ── RIGHT: MLOps Telemetry ───────────────────────────────
|
| 419 |
+
with gr.Column(scale=3, elem_classes=["glass-panel"]):
|
| 420 |
+
gr.HTML("<div class='panel-header'>⚡ MLOps Telemetry</div>")
|
| 421 |
+
|
| 422 |
+
current_phase = gr.Textbox(
|
| 423 |
+
label="Active Semantic State",
|
| 424 |
+
value="⚖️ Balanced Mode",
|
| 425 |
+
interactive=False,
|
| 426 |
+
)
|
| 427 |
+
|
| 428 |
+
drift_plot = gr.Plot(value=_plot_drift())
|
| 429 |
+
|
| 430 |
+
gr.HTML("""
|
| 431 |
+
<div class='info-callout'>
|
| 432 |
+
📈 The chart above tracks <b>EWMA-smoothed</b> semantic
|
| 433 |
+
similarity between user queries and concept anchors
|
| 434 |
+
(price, season, eco). When a line crosses the dotted
|
| 435 |
+
threshold, the system <b>autonomously rewrites</b> its
|
| 436 |
+
own instructions.
|
| 437 |
+
</div>
|
| 438 |
+
""")
|
| 439 |
+
|
| 440 |
+
gr.HTML("<div class='panel-header'>🧠 Self-Healing Log</div>")
|
| 441 |
+
explanation_box = gr.Textbox(
|
| 442 |
+
label="Adaptation Status",
|
| 443 |
+
interactive=False,
|
| 444 |
+
lines=6,
|
| 445 |
+
value=(
|
| 446 |
+
"📊 System Status: Normal\n"
|
| 447 |
+
"━━━━━━━━━━━━━━━━━━━━━━━━━━\n"
|
| 448 |
+
"No significant drift detected.\n"
|
| 449 |
+
"System prompt: Default balanced mode.\n"
|
| 450 |
+
"All EWMA concept scores below threshold (0.38)."
|
| 451 |
+
),
|
| 452 |
+
)
|
| 453 |
+
|
| 454 |
+
# ── Event wiring ──────────────────────────────────────────────
|
| 455 |
+
submit.click(
|
| 456 |
+
process_query,
|
| 457 |
+
inputs=[msg, chatbot],
|
| 458 |
+
outputs=[msg, chatbot, drift_plot, explanation_box, current_phase, retrieved_box],
|
| 459 |
+
)
|
| 460 |
+
msg.submit(
|
| 461 |
+
process_query,
|
| 462 |
+
inputs=[msg, chatbot],
|
| 463 |
+
outputs=[msg, chatbot, drift_plot, explanation_box, current_phase, retrieved_box],
|
| 464 |
+
)
|
| 465 |
+
|
| 466 |
+
|
| 467 |
+
if __name__ == "__main__":
|
| 468 |
+
app.launch(server_name="0.0.0.0", share=True)
|
modules/__init__.py
ADDED
|
@@ -0,0 +1 @@
|
|
|
|
|
|
|
| 1 |
+
"""RetailMind modules."""
|
modules/adaptation.py
ADDED
|
@@ -0,0 +1,141 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""
|
| 2 |
+
Self-healing prompt adapter for RetailMind.
|
| 3 |
+
|
| 4 |
+
Dynamically rewrites the LLM system prompt based on detected semantic drift.
|
| 5 |
+
This is the "self-healing" core — the system adapts its behavior in real time
|
| 6 |
+
without human intervention when it detects shifting user intent patterns.
|
| 7 |
+
"""
|
| 8 |
+
|
| 9 |
+
from __future__ import annotations
|
| 10 |
+
|
| 11 |
+
import logging
|
| 12 |
+
from dataclasses import dataclass
|
| 13 |
+
|
| 14 |
+
logger = logging.getLogger(__name__)
|
| 15 |
+
|
| 16 |
+
_BASE_PROMPT = (
|
| 17 |
+
"You are RetailMind, a knowledgeable and friendly AI shopping assistant for "
|
| 18 |
+
"an online retail store. You help customers find the perfect products from "
|
| 19 |
+
"our catalog.\n\n"
|
| 20 |
+
"RULES:\n"
|
| 21 |
+
"1. ONLY recommend products that appear in the 'Available Inventory' below.\n"
|
| 22 |
+
"2. Always mention the exact product name and price.\n"
|
| 23 |
+
"3. Keep responses concise (3–5 sentences) but helpful.\n"
|
| 24 |
+
"4. If a product matches the customer's needs, explain WHY it's a good fit.\n"
|
| 25 |
+
"5. Never invent products that aren't in the inventory list."
|
| 26 |
+
)
|
| 27 |
+
|
| 28 |
+
|
| 29 |
+
@dataclass
|
| 30 |
+
class AdaptationRule:
|
| 31 |
+
"""A single self-healing rule triggered by a drift concept."""
|
| 32 |
+
|
| 33 |
+
concept: str
|
| 34 |
+
label: str
|
| 35 |
+
prompt_injection: str
|
| 36 |
+
explanation: str
|
| 37 |
+
|
| 38 |
+
|
| 39 |
+
# Pre-defined adaptation rules — each maps a drift signal to a prompt mutation
|
| 40 |
+
_RULES: dict[str, AdaptationRule] = {
|
| 41 |
+
"price_sensitive": AdaptationRule(
|
| 42 |
+
concept="price_sensitive",
|
| 43 |
+
label="💰 Price-Sensitive Mode",
|
| 44 |
+
prompt_injection=(
|
| 45 |
+
"\n\n⚠️ ACTIVE ADAPTATION — PRICE SENSITIVITY DETECTED:\n"
|
| 46 |
+
"Customer intent analysis shows strong budget-consciousness. "
|
| 47 |
+
"You MUST:\n"
|
| 48 |
+
"• Lead with the cheapest matching products first.\n"
|
| 49 |
+
"• Explicitly state the price and any savings.\n"
|
| 50 |
+
"• Compare price-to-value across options.\n"
|
| 51 |
+
"• Mention if an item is the lowest-priced in its category."
|
| 52 |
+
),
|
| 53 |
+
explanation=(
|
| 54 |
+
"🔧 Self-Healing Activated\n"
|
| 55 |
+
"━━━━━━━━━━━━━━━━━━━━━━━━━━\n"
|
| 56 |
+
"Signal: Price-sensitive keyword drift detected (budget, cheap, under $X)\n"
|
| 57 |
+
"Action: Injected price-prioritization directives into system prompt\n"
|
| 58 |
+
"Effect: LLM now ranks by price-to-value instead of general relevance\n"
|
| 59 |
+
"Trigger: EWMA score exceeded threshold (0.38)"
|
| 60 |
+
),
|
| 61 |
+
),
|
| 62 |
+
"summer_shift": AdaptationRule(
|
| 63 |
+
concept="summer_shift",
|
| 64 |
+
label="☀️ Summer Season Mode",
|
| 65 |
+
prompt_injection=(
|
| 66 |
+
"\n\n⚠️ ACTIVE ADAPTATION — SEASONAL SHIFT DETECTED:\n"
|
| 67 |
+
"Query patterns indicate a seasonal shift toward summer. "
|
| 68 |
+
"You MUST:\n"
|
| 69 |
+
"• Prioritize lightweight, breathable, and warm-weather products.\n"
|
| 70 |
+
"• Highlight UV protection and heat-management features.\n"
|
| 71 |
+
"• De-prioritize winter and cold-weather items.\n"
|
| 72 |
+
"• Mention materials suited for hot climates (linen, mesh, moisture-wicking)."
|
| 73 |
+
),
|
| 74 |
+
explanation=(
|
| 75 |
+
"🔧 Self-Healing Activated\n"
|
| 76 |
+
"━━━━━━━━━━━━━━━━━━━━━━━━━━\n"
|
| 77 |
+
"Signal: Seasonal semantic shift detected (summer, beach, UV, lightweight)\n"
|
| 78 |
+
"Action: Injected warm-weather prioritization into system prompt\n"
|
| 79 |
+
"Effect: LLM now filters for breathable materials and summer categories\n"
|
| 80 |
+
"Trigger: EWMA score exceeded threshold (0.38)"
|
| 81 |
+
),
|
| 82 |
+
),
|
| 83 |
+
"eco_trend": AdaptationRule(
|
| 84 |
+
concept="eco_trend",
|
| 85 |
+
label="🌿 Eco-Conscious Mode",
|
| 86 |
+
prompt_injection=(
|
| 87 |
+
"\n\n⚠️ ACTIVE ADAPTATION — SUSTAINABILITY TREND DETECTED:\n"
|
| 88 |
+
"User intent strongly favors eco-friendly products. "
|
| 89 |
+
"You MUST:\n"
|
| 90 |
+
"• Lead with recycled, organic, and plant-based items.\n"
|
| 91 |
+
"• Highlight environmental certifications (GOTS, OEKO-TEX).\n"
|
| 92 |
+
"• Explain the sustainability story behind each recommendation.\n"
|
| 93 |
+
"• Mention materials: recycled ocean plastic, organic cotton, bamboo, cork."
|
| 94 |
+
),
|
| 95 |
+
explanation=(
|
| 96 |
+
"🔧 Self-Healing Activated\n"
|
| 97 |
+
"━━━━━━━━━━━━━━━━━━━━━━━━━━\n"
|
| 98 |
+
"Signal: Eco-conscious trend detected (sustainable, recycled, organic)\n"
|
| 99 |
+
"Action: Injected sustainability-first directives into system prompt\n"
|
| 100 |
+
"Effect: LLM now leads with eco-credentials and material sourcing\n"
|
| 101 |
+
"Trigger: EWMA score exceeded threshold (0.38)"
|
| 102 |
+
),
|
| 103 |
+
),
|
| 104 |
+
}
|
| 105 |
+
|
| 106 |
+
_NORMAL_EXPLANATION = (
|
| 107 |
+
"📊 System Status: Normal\n"
|
| 108 |
+
"━━━━━━━━━━━━━━━━━━━━━━━━━━\n"
|
| 109 |
+
"No significant drift detected in user intent patterns.\n"
|
| 110 |
+
"System prompt: Default balanced recommendation mode.\n"
|
| 111 |
+
"All EWMA concept scores below threshold (0.38)."
|
| 112 |
+
)
|
| 113 |
+
|
| 114 |
+
|
| 115 |
+
class Adapter:
|
| 116 |
+
"""Stateless prompt adapter — maps drift signals to prompt mutations."""
|
| 117 |
+
|
| 118 |
+
def __init__(self) -> None:
|
| 119 |
+
self.base_prompt: str = _BASE_PROMPT
|
| 120 |
+
self._active_rule: AdaptationRule | None = None
|
| 121 |
+
|
| 122 |
+
def adapt_prompt(self, drift_state: str) -> str:
|
| 123 |
+
"""Return the adapted system prompt for the current drift state."""
|
| 124 |
+
rule = _RULES.get(drift_state)
|
| 125 |
+
self._active_rule = rule
|
| 126 |
+
|
| 127 |
+
if rule:
|
| 128 |
+
logger.info("Adaptation triggered: %s", rule.label)
|
| 129 |
+
return self.base_prompt + rule.prompt_injection
|
| 130 |
+
|
| 131 |
+
return self.base_prompt + "\n\nProvide balanced recommendations covering a mix of features, prices, and styles."
|
| 132 |
+
|
| 133 |
+
def get_explanation(self, drift_state: str) -> str:
|
| 134 |
+
"""Human-readable explanation of what the adapter did and why."""
|
| 135 |
+
rule = _RULES.get(drift_state)
|
| 136 |
+
return rule.explanation if rule else _NORMAL_EXPLANATION
|
| 137 |
+
|
| 138 |
+
def get_label(self, drift_state: str) -> str:
|
| 139 |
+
"""Short UI label for the active state."""
|
| 140 |
+
rule = _RULES.get(drift_state)
|
| 141 |
+
return rule.label if rule else "⚖️ Balanced Mode"
|
modules/data_simulation.py
ADDED
|
@@ -0,0 +1,318 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""
|
| 2 |
+
Synthetic product catalog generator for RetailMind.
|
| 3 |
+
|
| 4 |
+
Generates a curated catalog of ~200 realistic e-commerce products with rich
|
| 5 |
+
descriptions, material specs, star ratings, and semantic tags — designed to
|
| 6 |
+
produce high-quality embeddings for dense retrieval.
|
| 7 |
+
"""
|
| 8 |
+
|
| 9 |
+
import random
|
| 10 |
+
from typing import TypedDict
|
| 11 |
+
|
| 12 |
+
random.seed(42) # Reproducible catalog across sessions
|
| 13 |
+
|
| 14 |
+
|
| 15 |
+
class Product(TypedDict):
|
| 16 |
+
id: int
|
| 17 |
+
title: str
|
| 18 |
+
category: str
|
| 19 |
+
price: float
|
| 20 |
+
desc: str
|
| 21 |
+
tags: list[str]
|
| 22 |
+
rating: float
|
| 23 |
+
reviews: int
|
| 24 |
+
materials: str
|
| 25 |
+
|
| 26 |
+
|
| 27 |
+
# ---------------------------------------------------------------------------
|
| 28 |
+
# Hand-authored product templates — each with unique, embedding-rich content
|
| 29 |
+
# ---------------------------------------------------------------------------
|
| 30 |
+
|
| 31 |
+
_TEMPLATES: list[dict] = [
|
| 32 |
+
# ── Winter ──────────────────────────────────────────────────────────────
|
| 33 |
+
{"title": "Alpine Pro Insulated Parka", "category": "winter", "price": 189.99,
|
| 34 |
+
"desc": "Engineered for sub-zero temperatures with 700-fill goose down insulation and a waterproof shell. Features an adjustable storm hood, internal media pocket, and reflective accents for low-light visibility. Wind-rated to -30°F.",
|
| 35 |
+
"tags": ["waterproof", "insulated", "cold-weather", "outdoor"], "materials": "Nylon ripstop shell, goose down fill"},
|
| 36 |
+
{"title": "Fireside Merino Wool Sweater", "category": "winter", "price": 79.99,
|
| 37 |
+
"desc": "A classic crewneck knit from ultra-soft 100% merino wool. Breathable yet warm, perfect for layering or wearing solo by the fire. Naturally odor-resistant and temperature-regulating.",
|
| 38 |
+
"tags": ["wool", "layering", "classic", "cozy"], "materials": "100% Merino wool"},
|
| 39 |
+
{"title": "Glacier Grip Thermal Gloves", "category": "winter", "price": 34.99,
|
| 40 |
+
"desc": "Touchscreen-compatible thermal gloves with silicone grip palms. Fleece-lined interior keeps hands warm while conductive fingertips let you use your phone without exposing skin to the cold.",
|
| 41 |
+
"tags": ["touchscreen", "thermal", "cold-weather", "tech-friendly"], "materials": "Polyester fleece, silicone grip, conductive thread"},
|
| 42 |
+
{"title": "Blizzard Shield Snow Boots", "category": "winter", "price": 149.99,
|
| 43 |
+
"desc": "Heavy-duty winter boots with Thinsulate insulation and Vibram Arctic Grip outsoles. Sealed seams and a gusseted tongue keep snow and slush out. Comfort-rated to -40°F.",
|
| 44 |
+
"tags": ["waterproof", "insulated", "snow", "hiking"], "materials": "Full-grain leather, Thinsulate, Vibram sole"},
|
| 45 |
+
{"title": "Nordic Knit Beanie", "category": "winter", "price": 24.99,
|
| 46 |
+
"desc": "Double-layer acrylic knit beanie with a fleece headband liner. Classic Nordic pattern adds style while the snug fit traps heat. One size fits most.",
|
| 47 |
+
"tags": ["knit", "warm", "casual", "unisex"], "materials": "Acrylic knit, polyester fleece liner"},
|
| 48 |
+
{"title": "Summit Fleece Pullover", "category": "winter", "price": 64.99,
|
| 49 |
+
"desc": "Mid-weight microfleece pullover ideal for layering under a shell or wearing on cool autumn mornings. Quarter-zip design, chin guard, and zippered chest pocket.",
|
| 50 |
+
"tags": ["fleece", "layering", "outdoor", "mid-weight"], "materials": "100% recycled polyester microfleece"},
|
| 51 |
+
{"title": "Thermal Base Layer Set", "category": "winter", "price": 54.99,
|
| 52 |
+
"desc": "Moisture-wicking thermal top and leggings designed as a first layer for skiing, snowboarding, or cold commutes. Flatlock seams prevent chafing during all-day wear.",
|
| 53 |
+
"tags": ["base-layer", "moisture-wicking", "skiing", "thermal"], "materials": "Merino-synthetic blend"},
|
| 54 |
+
{"title": "Expedition Down Vest", "category": "winter", "price": 109.99,
|
| 55 |
+
"desc": "Packable 650-fill down vest that compresses into its own pocket. Provides core warmth without restricting arm movement — perfect for active winter pursuits or travel.",
|
| 56 |
+
"tags": ["packable", "down", "layering", "travel"], "materials": "Water-resistant nylon, 650-fill duck down"},
|
| 57 |
+
|
| 58 |
+
# ── Summer ──────────────────────────────────────────────────────────────
|
| 59 |
+
{"title": "Breeze Runner Mesh Sneakers", "category": "summer", "price": 89.99,
|
| 60 |
+
"desc": "Ultra-breathable mesh upper with a responsive foam midsole. Weighs just 7.2 oz per shoe, making them ideal for hot-weather runs, gym sessions, or all-day wear in the heat.",
|
| 61 |
+
"tags": ["breathable", "lightweight", "running", "mesh"], "materials": "Engineered mesh upper, EVA foam midsole"},
|
| 62 |
+
{"title": "Pacific Coast Board Shorts", "category": "summer", "price": 39.99,
|
| 63 |
+
"desc": "Quick-dry board shorts with a 4-way stretch waistband and secure zip pocket. UPF 50+ sun protection fabric keeps you safe from UV rays during long beach days.",
|
| 64 |
+
"tags": ["quick-dry", "UPF", "beach", "swim"], "materials": "Recycled polyester, elastane blend"},
|
| 65 |
+
{"title": "Solaris UV Shield Sunglasses", "category": "summer", "price": 59.99,
|
| 66 |
+
"desc": "Polarized lenses with 100% UV400 protection in a lightweight titanium frame. Anti-glare coating reduces eye strain on bright days. Comes with a hard-shell carrying case.",
|
| 67 |
+
"tags": ["polarized", "UV-protection", "lightweight", "outdoor"], "materials": "Titanium frame, polarized polycarbonate lenses"},
|
| 68 |
+
{"title": "Coastal Breeze Linen Shirt", "category": "summer", "price": 49.99,
|
| 69 |
+
"desc": "Relaxed-fit linen button-down that stays cool in 90°F+ heat. Garment-dyed for a lived-in look. Perfect from boardwalk brunch to sunset cocktails.",
|
| 70 |
+
"tags": ["linen", "breathable", "casual", "warm-weather"], "materials": "100% French linen"},
|
| 71 |
+
{"title": "Reef Walker Sandals", "category": "summer", "price": 44.99,
|
| 72 |
+
"desc": "Contoured footbed sandals with arch support and a rugged outsole. Synthetic nubuck straps adjust for a custom fit. Great for beach walks, pool decks, and casual summer outings.",
|
| 73 |
+
"tags": ["sandals", "arch-support", "beach", "casual"], "materials": "Synthetic nubuck, molded EVA footbed"},
|
| 74 |
+
{"title": "Tropic Mesh Tank Top", "category": "summer", "price": 22.99,
|
| 75 |
+
"desc": "Lightweight mesh-back tank with moisture-wicking fabric that keeps you dry during hot workouts or humid commutes. Flatlock seams and a relaxed hem for all-day comfort.",
|
| 76 |
+
"tags": ["moisture-wicking", "gym", "breathable", "lightweight"], "materials": "Polyester-spandex blend"},
|
| 77 |
+
{"title": "Sun Shield Wide Brim Hat", "category": "summer", "price": 34.99,
|
| 78 |
+
"desc": "UPF 50+ wide-brim sun hat with an adjustable chin cord and mesh ventilation panels. Floats in water and packs flat for travel. Essential protection for hiking, fishing, and gardening.",
|
| 79 |
+
"tags": ["UPF", "sun-protection", "outdoor", "packable"], "materials": "Nylon with mesh vents"},
|
| 80 |
+
{"title": "Aqua Sport Water Shoes", "category": "summer", "price": 29.99,
|
| 81 |
+
"desc": "Drainage-port water shoes with a grippy rubber sole for rocky beaches and river crossings. Neoprene collar prevents sand entry. Dries in under an hour.",
|
| 82 |
+
"tags": ["water-shoes", "quick-dry", "beach", "outdoor"], "materials": "Mesh, neoprene, rubber outsole"},
|
| 83 |
+
|
| 84 |
+
# ── Eco-Friendly ────────────────────────────────────────────────────────
|
| 85 |
+
{"title": "EcoLoop Recycled Backpack", "category": "eco-friendly", "price": 74.99,
|
| 86 |
+
"desc": "Made from 20 recycled ocean-bound plastic bottles. Features a padded laptop sleeve, water-resistant coating, and ergonomic shoulder straps. Every purchase funds 1 lb of ocean cleanup.",
|
| 87 |
+
"tags": ["recycled", "ocean-plastic", "sustainable", "laptop"], "materials": "Recycled RPET fabric, plant-based waterproof coating"},
|
| 88 |
+
{"title": "Bamboo Hydration Bottle", "category": "eco-friendly", "price": 28.99,
|
| 89 |
+
"desc": "Double-wall vacuum insulated bottle with a natural bamboo cap and silicone seal. Keeps drinks cold for 24 hours or hot for 12. BPA-free, plastic-free, and designed to last a lifetime.",
|
| 90 |
+
"tags": ["bamboo", "BPA-free", "insulated", "reusable"], "materials": "18/8 stainless steel, bamboo lid"},
|
| 91 |
+
{"title": "Organic Cotton Classic Tee", "category": "eco-friendly", "price": 32.99,
|
| 92 |
+
"desc": "GOTS-certified organic cotton tee dyed with low-impact, water-saving dyes. Pre-shrunk ring-spun cotton feels buttery soft from the first wear. Fair Trade certified production.",
|
| 93 |
+
"tags": ["organic", "fair-trade", "GOTS-certified", "cotton"], "materials": "100% GOTS organic cotton"},
|
| 94 |
+
{"title": "Hemp Canvas Tote Bag", "category": "eco-friendly", "price": 19.99,
|
| 95 |
+
"desc": "Durable hemp canvas tote that replaces 700 single-use plastic bags in its lifetime. Reinforced seams, interior pocket, and long handles for comfortable shoulder carry.",
|
| 96 |
+
"tags": ["hemp", "reusable", "sustainable", "zero-waste"], "materials": "Organic hemp canvas"},
|
| 97 |
+
{"title": "Plant-Based Running Shoes", "category": "eco-friendly", "price": 119.99,
|
| 98 |
+
"desc": "The upper is woven from eucalyptus fiber, the midsole from sugarcane-based EVA, and the outsole from natural rubber. Carbon-negative manufacturing. Feels like running on clouds.",
|
| 99 |
+
"tags": ["plant-based", "carbon-negative", "running", "vegan"], "materials": "Eucalyptus fiber, sugarcane EVA, natural rubber"},
|
| 100 |
+
{"title": "Recycled Denim Jacket", "category": "eco-friendly", "price": 89.99,
|
| 101 |
+
"desc": "Classic trucker jacket made from 100% post-consumer recycled denim. Each jacket diverts 1.5 lbs of textile waste from landfills. Stone-washed finish with brass buttons.",
|
| 102 |
+
"tags": ["recycled", "denim", "upcycled", "sustainable"], "materials": "100% recycled post-consumer denim"},
|
| 103 |
+
{"title": "Solar-Powered Watch", "category": "eco-friendly", "price": 159.99,
|
| 104 |
+
"desc": "Never needs a battery — charges via any light source. Sapphire crystal face, titanium case, and a strap made from recycled ocean plastic. Water-resistant to 100 meters.",
|
| 105 |
+
"tags": ["solar", "recycled", "titanium", "water-resistant"], "materials": "Titanium, sapphire crystal, recycled ocean-plastic strap"},
|
| 106 |
+
{"title": "Cork Yoga Mat", "category": "eco-friendly", "price": 64.99,
|
| 107 |
+
"desc": "Harvested from sustainable cork oak forests without harming the tree. Non-slip surface improves grip when wet. Antimicrobial naturally. Backed with natural rubber for cushioning.",
|
| 108 |
+
"tags": ["cork", "sustainable", "yoga", "non-toxic"], "materials": "Natural cork, natural rubber backing"},
|
| 109 |
+
|
| 110 |
+
# ── Sports & Fitness ────────────────────────────────────────────────────
|
| 111 |
+
{"title": "ProPulse Running Shoes", "category": "sports", "price": 129.99,
|
| 112 |
+
"desc": "Carbon-plate racing shoes with a nitrogen-infused midsole for maximum energy return. Engineered mesh upper weighs just 6.5 oz. Designed for 5K to marathon distances.",
|
| 113 |
+
"tags": ["carbon-plate", "racing", "lightweight", "marathon"], "materials": "Engineered mesh, carbon fiber plate, nitrogen foam"},
|
| 114 |
+
{"title": "FlexCore Training Shorts", "category": "sports", "price": 44.99,
|
| 115 |
+
"desc": "4-way stretch training shorts with a built-in compression liner and three secure pockets. Sweat-wicking DryFit fabric keeps you cool through HIIT, lifting, and sprints.",
|
| 116 |
+
"tags": ["training", "compression", "moisture-wicking", "gym"], "materials": "Polyester-elastane with DryFit technology"},
|
| 117 |
+
{"title": "IronGrip Fitness Watch", "category": "sports", "price": 199.99,
|
| 118 |
+
"desc": "GPS-enabled multisport watch with heart rate monitoring, VO2 max estimation, and 14-day battery life. Tracks 30+ activities including swimming (waterproof to 50m). Syncs with Strava.",
|
| 119 |
+
"tags": ["GPS", "heart-rate", "waterproof", "multisport"], "materials": "Fiber-reinforced polymer case, silicone band"},
|
| 120 |
+
{"title": "Thunder Strike Basketball", "category": "sports", "price": 34.99,
|
| 121 |
+
"desc": "Official size and weight composite leather basketball with deep channel design for superior grip. Indoor/outdoor rated with a butyl bladder for consistent air retention.",
|
| 122 |
+
"tags": ["basketball", "indoor-outdoor", "official-size", "grip"], "materials": "Composite leather, butyl rubber bladder"},
|
| 123 |
+
{"title": "Velocity Compression Tights", "category": "sports", "price": 59.99,
|
| 124 |
+
"desc": "Graduated compression tights that boost blood circulation and reduce muscle fatigue during long runs. Reflective logos for night visibility. Flatlock seams prevent chafing.",
|
| 125 |
+
"tags": ["compression", "running", "reflective", "recovery"], "materials": "Nylon-spandex compression fabric"},
|
| 126 |
+
{"title": "PowerLift Training Gloves", "category": "sports", "price": 27.99,
|
| 127 |
+
"desc": "Ventilated weightlifting gloves with padded leather palms and adjustable wrist wraps. Reduces calluses while maintaining bar feel. Pull-tab for easy removal between sets.",
|
| 128 |
+
"tags": ["weightlifting", "gym", "padded", "grip"], "materials": "Genuine leather palm, mesh back, neoprene wrist wrap"},
|
| 129 |
+
{"title": "AeroFlow Cycling Jersey", "category": "sports", "price": 74.99,
|
| 130 |
+
"desc": "Full-zip cycling jersey with three rear pockets and a silicone gripper hem. Italian mesh side panels maximize airflow on climbs. Sublimation-printed — colors won't fade or peel.",
|
| 131 |
+
"tags": ["cycling", "breathable", "lightweight", "performance"], "materials": "Italian polyester mesh blend"},
|
| 132 |
+
{"title": "Endurance Hydration Pack", "category": "sports", "price": 49.99,
|
| 133 |
+
"desc": "Lightweight 2L hydration vest designed for trail running. Bite valve with on/off switch, front stash pockets for gels, and a bounce-free fit that adjusts with dual sternum straps.",
|
| 134 |
+
"tags": ["hydration", "trail-running", "lightweight", "outdoor"], "materials": "Ripstop nylon, BPA-free reservoir"},
|
| 135 |
+
|
| 136 |
+
# ── Electronics & Tech ──────────────────────────────────────────────────
|
| 137 |
+
{"title": "AuraBeats Studio Headphones", "category": "electronics", "price": 249.99,
|
| 138 |
+
"desc": "Active noise cancelling over-ear headphones with 40mm custom drivers and 30-hour battery life. Adaptive EQ auto-tunes to your ear shape. Features multipoint Bluetooth for switching between laptop and phone.",
|
| 139 |
+
"tags": ["ANC", "wireless", "bluetooth", "noise-cancelling"], "materials": "Memory foam cushions, anodized aluminum, protein leather"},
|
| 140 |
+
{"title": "NovaBand Fitness Tracker", "category": "electronics", "price": 49.99,
|
| 141 |
+
"desc": "Slim fitness band with AMOLED display, continuous heart rate monitoring, sleep tracking, and SpO2 sensor. 10-day battery life and swim-proof to 50 meters. Weighs just 22 grams.",
|
| 142 |
+
"tags": ["fitness-tracker", "AMOLED", "heart-rate", "waterproof"], "materials": "Polycarbonate case, silicone band"},
|
| 143 |
+
{"title": "TrueWireless Pro Earbuds", "category": "electronics", "price": 129.99,
|
| 144 |
+
"desc": "In-ear ANC earbuds with transparency mode and spatial audio support. 6-hour playtime per charge, 24 hours total with the wireless charging case. IPX5 sweat-resistant for workouts.",
|
| 145 |
+
"tags": ["ANC", "earbuds", "wireless", "spatial-audio"], "materials": "Medical-grade silicone tips, matte plastic shell"},
|
| 146 |
+
{"title": "Portable Solar Charger Panel", "category": "electronics", "price": 69.99,
|
| 147 |
+
"desc": "Foldable 21W solar panel with dual USB-A and USB-C outputs. Charges a phone in ~2.5 hours of direct sunlight. Carabiner attachment for backpack mounting during hikes.",
|
| 148 |
+
"tags": ["solar", "portable", "USB-C", "outdoor"], "materials": "Monocrystalline silicon, PET laminate, polyester canvas"},
|
| 149 |
+
{"title": "SmartTherm Travel Mug", "category": "electronics", "price": 39.99,
|
| 150 |
+
"desc": "App-connected travel mug with an LED temperature display on the lid. Set your preferred drinking temperature and the mug maintains it for up to 3 hours via battery-powered heating element.",
|
| 151 |
+
"tags": ["smart", "temperature-control", "travel", "app-connected"], "materials": "304 stainless steel, ceramic coating interior"},
|
| 152 |
+
{"title": "UltraSlim Power Bank 10K", "category": "electronics", "price": 34.99,
|
| 153 |
+
"desc": "10,000mAh portable charger thinner than most phones. Dual output (USB-C PD + USB-A QC3.0) charges two devices simultaneously. Fully recharges in 2.5 hours.",
|
| 154 |
+
"tags": ["power-bank", "USB-C", "portable", "fast-charging"], "materials": "Aluminum alloy shell, lithium-polymer cells"},
|
| 155 |
+
{"title": "Compact Bluetooth Speaker", "category": "electronics", "price": 44.99,
|
| 156 |
+
"desc": "IP67 waterproof and dustproof mini speaker with surprisingly rich 360° sound. 12-hour battery, built-in mic for calls, and a carabiner loop. Floats in water.",
|
| 157 |
+
"tags": ["bluetooth", "waterproof", "portable", "speaker"], "materials": "Rubberized exterior, passive bass radiator"},
|
| 158 |
+
{"title": "Night Owl LED Desk Lamp", "category": "electronics", "price": 54.99,
|
| 159 |
+
"desc": "Dimmable LED desk lamp with 5 color temperature presets and a wireless Qi charging pad in the base. Adjustable gooseneck, memory function, and a 1-hour auto-off timer.",
|
| 160 |
+
"tags": ["LED", "desk-lamp", "wireless-charging", "dimmable"], "materials": "Aluminum arm, ABS base with Qi coil"},
|
| 161 |
+
|
| 162 |
+
# ── Premium / Luxury ────────────────────────────────────────────────────
|
| 163 |
+
{"title": "Artisan Leather Weekender", "category": "premium", "price": 349.99,
|
| 164 |
+
"desc": "Hand-stitched full-grain vegetable-tanned leather duffle with brass YKK zippers. Develops a rich patina with age. Separate shoe compartment and detachable shoulder strap.",
|
| 165 |
+
"tags": ["leather", "handmade", "luxury", "travel"], "materials": "Full-grain vegetable-tanned leather, brass hardware"},
|
| 166 |
+
{"title": "Heritage Automatic Watch", "category": "premium", "price": 499.99,
|
| 167 |
+
"desc": "Swiss-movement automatic watch with a sapphire crystal dial and exhibition caseback. 42mm stainless steel case with a genuine alligator strap. 50-meter water resistance.",
|
| 168 |
+
"tags": ["automatic", "swiss-movement", "sapphire", "luxury"], "materials": "316L stainless steel, sapphire crystal, alligator leather strap"},
|
| 169 |
+
{"title": "Cashmere Blend Overcoat", "category": "premium", "price": 389.99,
|
| 170 |
+
"desc": "Italian-milled cashmere-wool blend overcoat with a notch lapel and half-canvas construction. Fully lined in Bemberg silk. Timeless silhouette for dressed-up or smart-casual looks.",
|
| 171 |
+
"tags": ["cashmere", "Italian", "luxury", "formal"], "materials": "70% wool, 30% cashmere, Bemberg lining"},
|
| 172 |
+
{"title": "Handcrafted Walnut Sunglasses", "category": "premium", "price": 179.99,
|
| 173 |
+
"desc": "Frames carved from sustainably sourced American black walnut with Carl Zeiss polarized lenses. Each pair has unique wood grain patterns. Spring hinges for a comfortable universal fit.",
|
| 174 |
+
"tags": ["handcrafted", "walnut", "polarized", "sustainable"], "materials": "Black walnut wood, Carl Zeiss polarized lenses"},
|
| 175 |
+
{"title": "Titanium Card Wallet", "category": "premium", "price": 89.99,
|
| 176 |
+
"desc": "Minimalist RFID-blocking wallet machined from grade-5 titanium. Holds 6 cards and features a quick-access pull tab. Weighs just 2.1 oz and will outlast any leather wallet.",
|
| 177 |
+
"tags": ["titanium", "RFID-blocking", "minimalist", "EDC"], "materials": "Grade-5 titanium, Dyneema pull tab"},
|
| 178 |
+
{"title": "Silk Pocket Square Collection", "category": "premium", "price": 59.99,
|
| 179 |
+
"desc": "Set of 3 hand-rolled Italian silk pocket squares in complementary patterns. Each square is individually wrapped in tissue — perfect as a gift or to elevate your suit game.",
|
| 180 |
+
"tags": ["silk", "Italian", "gift", "formal"], "materials": "100% Italian silk, hand-rolled edges"},
|
| 181 |
+
{"title": "Executive Leather Belt", "category": "premium", "price": 119.99,
|
| 182 |
+
"desc": "Single-piece full-grain bridle leather belt with a solid brass buckle. No stitching — the leather is thick enough to hold its shape for decades. Ages beautifully with wear.",
|
| 183 |
+
"tags": ["leather", "brass", "luxury", "classic"], "materials": "Full-grain English bridle leather, solid brass buckle"},
|
| 184 |
+
{"title": "Carbon Fiber Money Clip", "category": "premium", "price": 44.99,
|
| 185 |
+
"desc": "Aerospace-grade carbon fiber money clip with a satin finish. Ultra-lightweight and strong enough to hold 15+ folded bills without losing spring tension over time.",
|
| 186 |
+
"tags": ["carbon-fiber", "minimalist", "EDC", "lightweight"], "materials": "3K twill carbon fiber"},
|
| 187 |
+
|
| 188 |
+
# ── Home & Lifestyle ────────────────────────────────────────────────────
|
| 189 |
+
{"title": "Aromatherapy Soy Candle Set", "category": "home", "price": 36.99,
|
| 190 |
+
"desc": "Set of 3 hand-poured soy candles in amber glass jars: Lavender Fields, Cedar & Sage, and Vanilla Bean. 45-hour burn time each. Cotton wicks, no synthetic fragrances.",
|
| 191 |
+
"tags": ["soy", "aromatherapy", "handmade", "non-toxic"], "materials": "100% soy wax, cotton wicks, essential oils"},
|
| 192 |
+
{"title": "Japanese Ceramic Pour-Over Set", "category": "home", "price": 54.99,
|
| 193 |
+
"desc": "Minimalist pour-over coffee dripper with a double-wall ceramic server. The cone's spiral ribs allow optimal coffee bloom. Makes 2-4 cups of clean, nuanced brew.",
|
| 194 |
+
"tags": ["ceramic", "coffee", "Japanese", "minimalist"], "materials": "Hasami porcelain, borosilicate server"},
|
| 195 |
+
{"title": "Weighted Linen Throw Blanket", "category": "home", "price": 79.99,
|
| 196 |
+
"desc": "Stonewashed Belgian linen throw with a comfortable 3 lb weight. Gets softer with every wash. Perfect draped over a sofa or at the foot of the bed. OEKO-TEX certified.",
|
| 197 |
+
"tags": ["linen", "stonewashed", "cozy", "OEKO-TEX"], "materials": "100% Belgian flax linen"},
|
| 198 |
+
{"title": "Walnut & Brass Desk Organizer", "category": "home", "price": 44.99,
|
| 199 |
+
"desc": "Handcrafted desk organizer with solid walnut compartments and brass dividers. Holds pens, cards, phone, and small accessories. Felt-lined base protects desktop surfaces.",
|
| 200 |
+
"tags": ["walnut", "brass", "handcrafted", "office"], "materials": "American black walnut, brushed brass accents"},
|
| 201 |
+
{"title": "Terracotta Herb Planter Trio", "category": "home", "price": 29.99,
|
| 202 |
+
"desc": "Set of 3 terracotta planters with drainage holes and bamboo saucers. Perfect for kitchen windowsill herbs like basil, rosemary, and mint. Hand-finished with a matte glaze.",
|
| 203 |
+
"tags": ["terracotta", "gardening", "kitchen", "handmade"], "materials": "Terracotta clay, bamboo saucers"},
|
| 204 |
+
{"title": "Memory Foam Seat Cushion", "category": "home", "price": 39.99,
|
| 205 |
+
"desc": "Ergonomic U-shaped seat cushion with cooling gel-infused memory foam. Reduces tailbone pressure during long work sessions. Machine-washable velour cover with anti-slip bottom.",
|
| 206 |
+
"tags": ["ergonomic", "memory-foam", "office", "comfort"], "materials": "Gel-infused memory foam, velour cover"},
|
| 207 |
+
{"title": "Minimalist Wall Clock", "category": "home", "price": 49.99,
|
| 208 |
+
"desc": "12-inch silent-sweep wall clock with a birch plywood face and brass hands. No ticking sound — uses a precision quartz movement. Mounts flush with a single nail.",
|
| 209 |
+
"tags": ["minimalist", "silent", "birch", "Scandinavian"], "materials": "Baltic birch plywood, brass hands, quartz movement"},
|
| 210 |
+
{"title": "Turkish Cotton Bath Towel Set", "category": "home", "price": 64.99,
|
| 211 |
+
"desc": "Set of 4 Turkish cotton towels — 2 bath, 2 hand. Long-staple cotton loops absorb 3x their weight in water. Gets fluffier with each wash. OEKO-TEX Standard 100.",
|
| 212 |
+
"tags": ["Turkish-cotton", "absorbent", "OEKO-TEX", "bath"], "materials": "100% long-staple Turkish cotton"},
|
| 213 |
+
|
| 214 |
+
# ── Casual / Streetwear ─────────────────────────────────────────────────
|
| 215 |
+
{"title": "Urban Canvas Sneakers", "category": "casual", "price": 59.99,
|
| 216 |
+
"desc": "Classic low-top canvas sneakers with a vulcanized rubber sole for all-day comfort. Metal eyelets, cotton laces, and a removable cushioned insole. Comes in 8 colorways.",
|
| 217 |
+
"tags": ["canvas", "classic", "casual", "street"], "materials": "Organic cotton canvas, vulcanized rubber sole"},
|
| 218 |
+
{"title": "Oversized Graphic Hoodie", "category": "casual", "price": 54.99,
|
| 219 |
+
"desc": "Heavyweight 14 oz French terry hoodie with a relaxed oversized fit. Abstract graphic screen-printed with water-based inks. Ribbed cuffs, kangaroo pocket, and a double-lined hood.",
|
| 220 |
+
"tags": ["hoodie", "oversized", "streetwear", "graphic"], "materials": "80% cotton, 20% polyester French terry"},
|
| 221 |
+
{"title": "Slim Fit Chino Pants", "category": "casual", "price": 49.99,
|
| 222 |
+
"desc": "Tailored slim-fit chinos in a stretch twill that moves with you. Sits at the natural waist with a tapered leg. Works equally well with sneakers or loafers.",
|
| 223 |
+
"tags": ["chinos", "slim-fit", "stretch", "versatile"], "materials": "98% cotton, 2% elastane twill"},
|
| 224 |
+
{"title": "Vintage Wash Denim Jacket", "category": "casual", "price": 79.99,
|
| 225 |
+
"desc": "Classic trucker jacket in a medium-wash selvedge denim. Chest flap pockets, adjustable waist tabs, and copper-tone buttons. The perfect layering piece for spring and fall.",
|
| 226 |
+
"tags": ["denim", "trucker", "vintage", "layering"], "materials": "100% selvedge cotton denim"},
|
| 227 |
+
{"title": "Everyday Crossbody Bag", "category": "casual", "price": 34.99,
|
| 228 |
+
"desc": "Compact crossbody bag with an adjustable strap, front zip pocket, and RFID-protected main compartment. Fits phone, wallet, keys, and a small water bottle. Weighs 8 oz.",
|
| 229 |
+
"tags": ["crossbody", "RFID", "compact", "everyday"], "materials": "Water-resistant nylon, YKK zippers"},
|
| 230 |
+
{"title": "Bamboo Fiber Crew Socks 6-Pack", "category": "casual", "price": 24.99,
|
| 231 |
+
"desc": "Ultra-soft bamboo fiber socks with natural antibacterial and moisture-wicking properties. Reinforced heel and toe, seamless toe closure, and a mid-calf height. Fits sizes 6–12.",
|
| 232 |
+
"tags": ["bamboo", "antibacterial", "moisture-wicking", "comfort"], "materials": "70% bamboo viscose, 25% cotton, 5% elastane"},
|
| 233 |
+
{"title": "Relaxed Linen Drawstring Pants", "category": "casual", "price": 44.99,
|
| 234 |
+
"desc": "Breezy linen pants with an elastic drawstring waist and side pockets. Perfect for beach vacations, weekend errands, or just lounging at home. Gets softer with every wash.",
|
| 235 |
+
"tags": ["linen", "relaxed", "breathable", "vacation"], "materials": "100% pre-washed linen"},
|
| 236 |
+
{"title": "Retro Aviator Sunglasses", "category": "casual", "price": 29.99,
|
| 237 |
+
"desc": "Classic aviator frames in brushed gold metal with gradient smoke lenses. UV400 protection, adjustable nose pads, and spring-loaded temples for a comfortable fit.",
|
| 238 |
+
"tags": ["aviator", "UV400", "retro", "metal-frame"], "materials": "Brushed metal alloy, gradient polycarbonate lenses"},
|
| 239 |
+
]
|
| 240 |
+
|
| 241 |
+
|
| 242 |
+
def _expand_catalog(templates: list[dict], target_count: int = 200) -> list[Product]:
|
| 243 |
+
"""
|
| 244 |
+
Expand hand-authored templates into a full catalog by adding tasteful
|
| 245 |
+
variations (color/version suffixes) while preserving description richness.
|
| 246 |
+
"""
|
| 247 |
+
catalog: list[Product] = []
|
| 248 |
+
color_variants = [
|
| 249 |
+
"Charcoal", "Midnight Blue", "Forest Green", "Stone Grey",
|
| 250 |
+
"Rust Orange", "Ivory", "Slate", "Obsidian",
|
| 251 |
+
]
|
| 252 |
+
idx = 1
|
| 253 |
+
variant_idx = 0
|
| 254 |
+
|
| 255 |
+
while len(catalog) < target_count:
|
| 256 |
+
for tmpl in templates:
|
| 257 |
+
if len(catalog) >= target_count:
|
| 258 |
+
break
|
| 259 |
+
|
| 260 |
+
if variant_idx == 0:
|
| 261 |
+
title = tmpl["title"]
|
| 262 |
+
else:
|
| 263 |
+
color = color_variants[variant_idx % len(color_variants)]
|
| 264 |
+
title = f"{tmpl['title']} — {color}"
|
| 265 |
+
|
| 266 |
+
catalog.append(Product(
|
| 267 |
+
id=idx,
|
| 268 |
+
title=title,
|
| 269 |
+
category=tmpl["category"],
|
| 270 |
+
price=round(tmpl["price"] * (1 + random.uniform(-0.08, 0.08)), 2),
|
| 271 |
+
desc=tmpl["desc"],
|
| 272 |
+
tags=list(tmpl["tags"]),
|
| 273 |
+
rating=round(random.uniform(3.8, 5.0), 1),
|
| 274 |
+
reviews=random.randint(12, 2400),
|
| 275 |
+
materials=tmpl["materials"],
|
| 276 |
+
))
|
| 277 |
+
idx += 1
|
| 278 |
+
variant_idx += 1
|
| 279 |
+
|
| 280 |
+
return catalog
|
| 281 |
+
|
| 282 |
+
|
| 283 |
+
def generate_catalog() -> list[Product]:
|
| 284 |
+
"""Generate the full product catalog."""
|
| 285 |
+
return _expand_catalog(_TEMPLATES, target_count=200)
|
| 286 |
+
|
| 287 |
+
|
| 288 |
+
def get_scenarios() -> dict[str, list[str]]:
|
| 289 |
+
"""
|
| 290 |
+
Pre-built query sequences that demonstrate drift detection and
|
| 291 |
+
self-healing adaptation in a recruiter demo.
|
| 292 |
+
"""
|
| 293 |
+
return {
|
| 294 |
+
"🟢 Phase 1 · Normal": [
|
| 295 |
+
"I need a good water bottle for hiking.",
|
| 296 |
+
"Looking for comfortable running shoes.",
|
| 297 |
+
"Can you recommend a fitness watch with GPS?",
|
| 298 |
+
"What kind of bags do you have for travel?",
|
| 299 |
+
],
|
| 300 |
+
"🔴 Phase 2 · Black Friday": [
|
| 301 |
+
"What's the absolute cheapest winter hat you have?",
|
| 302 |
+
"Any bags under $25?",
|
| 303 |
+
"Show me the most budget-friendly options.",
|
| 304 |
+
"I only have $30 to spend, what can I get?",
|
| 305 |
+
],
|
| 306 |
+
"☀️ Phase 3 · Summer Shift": [
|
| 307 |
+
"Do you have lightweight sandals for the beach?",
|
| 308 |
+
"I need breathable clothes for hot weather.",
|
| 309 |
+
"Looking for UV protection sunglasses.",
|
| 310 |
+
"Recommend summer vacation essentials.",
|
| 311 |
+
],
|
| 312 |
+
"🌿 Phase 4 · Eco Trend": [
|
| 313 |
+
"Show me products made from recycled materials.",
|
| 314 |
+
"I only want sustainable, eco-friendly options.",
|
| 315 |
+
"Do you have anything organic or plant-based?",
|
| 316 |
+
"What's your most environmentally responsible product?",
|
| 317 |
+
],
|
| 318 |
+
}
|
modules/drift.py
ADDED
|
@@ -0,0 +1,153 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""
|
| 2 |
+
Semantic drift detector for RetailMind.
|
| 3 |
+
|
| 4 |
+
Tracks the rolling semantic similarity of incoming user queries against
|
| 5 |
+
predefined *concept anchors* (e.g., price-sensitivity, seasonal shift,
|
| 6 |
+
eco-trend). When the exponentially-weighted moving average for any concept
|
| 7 |
+
exceeds a configurable threshold the system flags an active drift — which
|
| 8 |
+
triggers the self-healing adapter to rewrite the LLM system prompt.
|
| 9 |
+
"""
|
| 10 |
+
|
| 11 |
+
from __future__ import annotations
|
| 12 |
+
|
| 13 |
+
import logging
|
| 14 |
+
import time
|
| 15 |
+
from dataclasses import dataclass, field
|
| 16 |
+
from typing import Any
|
| 17 |
+
|
| 18 |
+
import numpy as np
|
| 19 |
+
from sentence_transformers import SentenceTransformer
|
| 20 |
+
|
| 21 |
+
logger = logging.getLogger(__name__)
|
| 22 |
+
|
| 23 |
+
# Use shared model instance across retriever & drift detector
|
| 24 |
+
_shared_model: SentenceTransformer | None = None
|
| 25 |
+
|
| 26 |
+
|
| 27 |
+
def _get_model() -> SentenceTransformer:
|
| 28 |
+
global _shared_model
|
| 29 |
+
if _shared_model is None:
|
| 30 |
+
_shared_model = SentenceTransformer("all-MiniLM-L6-v2")
|
| 31 |
+
return _shared_model
|
| 32 |
+
|
| 33 |
+
|
| 34 |
+
@dataclass
|
| 35 |
+
class DriftEvent:
|
| 36 |
+
"""Immutable record of a single drift measurement."""
|
| 37 |
+
|
| 38 |
+
timestamp: float
|
| 39 |
+
query: str
|
| 40 |
+
scores: dict[str, float]
|
| 41 |
+
dominant: str
|
| 42 |
+
|
| 43 |
+
|
| 44 |
+
@dataclass
|
| 45 |
+
class DriftDetector:
|
| 46 |
+
"""
|
| 47 |
+
Monitors semantic drift across configurable concept anchors.
|
| 48 |
+
|
| 49 |
+
Uses EWMA (exponentially weighted moving average) to smooth noisy
|
| 50 |
+
single-query scores into stable trend signals.
|
| 51 |
+
"""
|
| 52 |
+
|
| 53 |
+
threshold: float = 0.38
|
| 54 |
+
ewma_alpha: float = 0.35 # smoothing factor (higher = more reactive)
|
| 55 |
+
history: list[DriftEvent] = field(default_factory=list)
|
| 56 |
+
_ewma: dict[str, float] = field(default_factory=dict)
|
| 57 |
+
_concept_embs: dict[str, Any] = field(default_factory=dict, repr=False)
|
| 58 |
+
|
| 59 |
+
def __post_init__(self) -> None:
|
| 60 |
+
model = _get_model()
|
| 61 |
+
# Multiple anchor phrases per concept → averaged embedding for robustness
|
| 62 |
+
concept_phrases = {
|
| 63 |
+
"price_sensitive": [
|
| 64 |
+
"cheap budget discount low price clearance sale savings affordable",
|
| 65 |
+
"what is the cheapest option under twenty dollars bargain deal",
|
| 66 |
+
"I only have a limited budget, show me value picks",
|
| 67 |
+
],
|
| 68 |
+
"summer_shift": [
|
| 69 |
+
"summer heat warm weather sandals shorts sunscreen beach",
|
| 70 |
+
"lightweight breathable sun protection hot climate UV",
|
| 71 |
+
"vacation tropical poolside outdoor warm temperature",
|
| 72 |
+
],
|
| 73 |
+
"eco_trend": [
|
| 74 |
+
"eco-friendly sustainable organic recycled environment green",
|
| 75 |
+
"plant-based carbon-neutral zero waste biodegradable vegan",
|
| 76 |
+
"responsible sourcing ethical production renewable materials",
|
| 77 |
+
],
|
| 78 |
+
}
|
| 79 |
+
for concept, phrases in concept_phrases.items():
|
| 80 |
+
embs = model.encode(phrases, show_progress_bar=False)
|
| 81 |
+
self._concept_embs[concept] = np.mean(embs, axis=0)
|
| 82 |
+
self._ewma[concept] = 0.0
|
| 83 |
+
|
| 84 |
+
logger.info("DriftDetector initialized with %d concept anchors.", len(concept_phrases))
|
| 85 |
+
|
| 86 |
+
# ── Public API ──────────────────────────────────────────────────────────
|
| 87 |
+
|
| 88 |
+
def analyze_drift(self, query: str) -> tuple[str, dict[str, float]]:
|
| 89 |
+
"""
|
| 90 |
+
Score *query* against all concept anchors and return
|
| 91 |
+
``(dominant_concept, raw_scores)``.
|
| 92 |
+
"""
|
| 93 |
+
model = _get_model()
|
| 94 |
+
query_emb = model.encode([query], show_progress_bar=False)[0]
|
| 95 |
+
|
| 96 |
+
raw_scores: dict[str, float] = {}
|
| 97 |
+
for concept, ref_emb in self._concept_embs.items():
|
| 98 |
+
sim = float(
|
| 99 |
+
np.dot(query_emb, ref_emb)
|
| 100 |
+
/ (np.linalg.norm(query_emb) * np.linalg.norm(ref_emb) + 1e-10)
|
| 101 |
+
)
|
| 102 |
+
raw_scores[concept] = sim
|
| 103 |
+
|
| 104 |
+
# Update EWMA
|
| 105 |
+
prev = self._ewma[concept]
|
| 106 |
+
self._ewma[concept] = self.ewma_alpha * sim + (1 - self.ewma_alpha) * prev
|
| 107 |
+
|
| 108 |
+
# Determine dominant drift from smoothed signal
|
| 109 |
+
detected = "normal"
|
| 110 |
+
max_smoothed = 0.0
|
| 111 |
+
for concept, smoothed in self._ewma.items():
|
| 112 |
+
if smoothed > self.threshold and smoothed > max_smoothed:
|
| 113 |
+
max_smoothed = smoothed
|
| 114 |
+
detected = concept
|
| 115 |
+
|
| 116 |
+
event = DriftEvent(
|
| 117 |
+
timestamp=time.time(),
|
| 118 |
+
query=query,
|
| 119 |
+
scores=raw_scores,
|
| 120 |
+
dominant=detected,
|
| 121 |
+
)
|
| 122 |
+
self.history.append(event)
|
| 123 |
+
if len(self.history) > 200:
|
| 124 |
+
self.history = self.history[-200:]
|
| 125 |
+
|
| 126 |
+
logger.debug("Drift analysis: %s | scores=%s | ewma=%s", detected, raw_scores, self._ewma)
|
| 127 |
+
return detected, raw_scores
|
| 128 |
+
|
| 129 |
+
def get_ewma_scores(self) -> dict[str, float]:
|
| 130 |
+
"""Return current EWMA-smoothed scores for dashboard display."""
|
| 131 |
+
return dict(self._ewma)
|
| 132 |
+
|
| 133 |
+
def get_recent_stats(self) -> dict[str, float] | None:
|
| 134 |
+
"""Return averaged raw scores from last N queries."""
|
| 135 |
+
if not self.history:
|
| 136 |
+
return None
|
| 137 |
+
recent = self.history[-5:]
|
| 138 |
+
concepts = list(self._concept_embs.keys())
|
| 139 |
+
return {
|
| 140 |
+
c: float(np.mean([e.scores[c] for e in recent]))
|
| 141 |
+
for c in concepts
|
| 142 |
+
}
|
| 143 |
+
|
| 144 |
+
def get_history_series(self) -> dict[str, list[float]]:
|
| 145 |
+
"""Return full EWMA time-series for each concept (for charts)."""
|
| 146 |
+
# Recompute from history for accurate display
|
| 147 |
+
series: dict[str, list[float]] = {c: [] for c in self._concept_embs}
|
| 148 |
+
ewma_state = {c: 0.0 for c in self._concept_embs}
|
| 149 |
+
for event in self.history:
|
| 150 |
+
for c in self._concept_embs:
|
| 151 |
+
ewma_state[c] = self.ewma_alpha * event.scores[c] + (1 - self.ewma_alpha) * ewma_state[c]
|
| 152 |
+
series[c].append(ewma_state[c])
|
| 153 |
+
return series
|
modules/llm.py
ADDED
|
@@ -0,0 +1,95 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""
|
| 2 |
+
Local LLM inference engine for RetailMind.
|
| 3 |
+
|
| 4 |
+
Uses Qwen2.5-0.5B-Instruct running entirely on CPU — no API keys, no GPU,
|
| 5 |
+
no external dependencies. Prompt engineering is tuned to minimize
|
| 6 |
+
hallucination by grounding all answers in the provided product context.
|
| 7 |
+
"""
|
| 8 |
+
|
| 9 |
+
from __future__ import annotations
|
| 10 |
+
|
| 11 |
+
import logging
|
| 12 |
+
import time
|
| 13 |
+
from typing import Any
|
| 14 |
+
|
| 15 |
+
import torch
|
| 16 |
+
from transformers import pipeline
|
| 17 |
+
|
| 18 |
+
logger = logging.getLogger(__name__)
|
| 19 |
+
|
| 20 |
+
_generator = None
|
| 21 |
+
|
| 22 |
+
|
| 23 |
+
def _get_pipeline():
|
| 24 |
+
"""Lazy-load the text-generation pipeline (singleton)."""
|
| 25 |
+
global _generator
|
| 26 |
+
if _generator is None:
|
| 27 |
+
logger.info("Loading Qwen2.5-0.5B-Instruct on CPU (first call only)…")
|
| 28 |
+
t0 = time.time()
|
| 29 |
+
_generator = pipeline(
|
| 30 |
+
"text-generation",
|
| 31 |
+
model="Qwen/Qwen2.5-0.5B-Instruct",
|
| 32 |
+
device="cpu",
|
| 33 |
+
torch_dtype=torch.float32,
|
| 34 |
+
)
|
| 35 |
+
logger.info("Model loaded in %.1fs", time.time() - t0)
|
| 36 |
+
return _generator
|
| 37 |
+
|
| 38 |
+
|
| 39 |
+
def generate_response(
|
| 40 |
+
system_prompt: str,
|
| 41 |
+
user_query: str,
|
| 42 |
+
retrieved_items: list[dict[str, Any]],
|
| 43 |
+
) -> str:
|
| 44 |
+
"""
|
| 45 |
+
Generate a grounded product recommendation.
|
| 46 |
+
|
| 47 |
+
The retrieved items are injected directly into the system prompt so
|
| 48 |
+
the model can only reference real products.
|
| 49 |
+
"""
|
| 50 |
+
# Build structured context from retrieved products
|
| 51 |
+
context_lines = []
|
| 52 |
+
for i, r in enumerate(retrieved_items, 1):
|
| 53 |
+
p = r["product"]
|
| 54 |
+
stars = "★" * int(p.get("rating", 4)) + "☆" * (5 - int(p.get("rating", 4)))
|
| 55 |
+
context_lines.append(
|
| 56 |
+
f"{i}. {p['title']} — ${p['price']:.2f}\n"
|
| 57 |
+
f" Category: {p['category']} | Rating: {stars} ({p.get('reviews', 0)} reviews)\n"
|
| 58 |
+
f" Materials: {p.get('materials', 'N/A')}\n"
|
| 59 |
+
f" Description: {p['desc']}"
|
| 60 |
+
)
|
| 61 |
+
|
| 62 |
+
context = "\n\n".join(context_lines)
|
| 63 |
+
|
| 64 |
+
messages = [
|
| 65 |
+
{
|
| 66 |
+
"role": "system",
|
| 67 |
+
"content": (
|
| 68 |
+
f"{system_prompt}\n\n"
|
| 69 |
+
f"══════ Available Inventory ══════\n\n"
|
| 70 |
+
f"{context}\n\n"
|
| 71 |
+
f"══════════════════════════════════\n"
|
| 72 |
+
f"IMPORTANT: Only recommend from the products listed above. "
|
| 73 |
+
f"Cite exact names and prices."
|
| 74 |
+
),
|
| 75 |
+
},
|
| 76 |
+
{"role": "user", "content": user_query},
|
| 77 |
+
]
|
| 78 |
+
|
| 79 |
+
try:
|
| 80 |
+
gen = _get_pipeline()
|
| 81 |
+
result = gen(
|
| 82 |
+
messages,
|
| 83 |
+
max_new_tokens=250,
|
| 84 |
+
temperature=0.3,
|
| 85 |
+
do_sample=True,
|
| 86 |
+
top_p=0.9,
|
| 87 |
+
return_full_text=False,
|
| 88 |
+
)
|
| 89 |
+
generated = result[0]["generated_text"]
|
| 90 |
+
if isinstance(generated, list):
|
| 91 |
+
return generated[-1]["content"]
|
| 92 |
+
return generated
|
| 93 |
+
except Exception as e:
|
| 94 |
+
logger.exception("LLM inference failed")
|
| 95 |
+
return f"[RetailMind] I encountered an issue generating a response. Error: {e}"
|
modules/retrieval.py
ADDED
|
@@ -0,0 +1,150 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""
|
| 2 |
+
Hybrid retrieval engine for RetailMind.
|
| 3 |
+
|
| 4 |
+
Combines dense semantic search (SentenceTransformers) with structured
|
| 5 |
+
metadata filtering (price range, category, tags) so that queries like
|
| 6 |
+
"eco-friendly bag under $30" actually return relevant, correctly-priced items.
|
| 7 |
+
"""
|
| 8 |
+
|
| 9 |
+
from __future__ import annotations
|
| 10 |
+
|
| 11 |
+
import logging
|
| 12 |
+
import re
|
| 13 |
+
from typing import Any
|
| 14 |
+
|
| 15 |
+
import numpy as np
|
| 16 |
+
from sentence_transformers import SentenceTransformer
|
| 17 |
+
|
| 18 |
+
logger = logging.getLogger(__name__)
|
| 19 |
+
|
| 20 |
+
|
| 21 |
+
class HybridRetriever:
|
| 22 |
+
"""Two-stage retriever: metadata pre-filter → semantic re-rank."""
|
| 23 |
+
|
| 24 |
+
def __init__(self, catalog: list[dict]) -> None:
|
| 25 |
+
self.catalog = catalog
|
| 26 |
+
self.model = SentenceTransformer("all-MiniLM-L6-v2")
|
| 27 |
+
|
| 28 |
+
# Build rich embedding texts that capture all searchable facets
|
| 29 |
+
texts = [
|
| 30 |
+
(
|
| 31 |
+
f"{p['title']}. {p['desc']} "
|
| 32 |
+
f"Category: {p['category']}. "
|
| 33 |
+
f"Materials: {p.get('materials', 'N/A')}. "
|
| 34 |
+
f"Tags: {', '.join(p.get('tags', []))}."
|
| 35 |
+
)
|
| 36 |
+
for p in catalog
|
| 37 |
+
]
|
| 38 |
+
logger.info("Encoding %d products…", len(catalog))
|
| 39 |
+
self.embeddings = self.model.encode(texts, show_progress_bar=False)
|
| 40 |
+
self._norms = np.linalg.norm(self.embeddings, axis=1)
|
| 41 |
+
logger.info("Catalog indexed successfully.")
|
| 42 |
+
|
| 43 |
+
# ── Public API ──────────────────────────────────────────────────────────
|
| 44 |
+
|
| 45 |
+
def search(
|
| 46 |
+
self,
|
| 47 |
+
query: str,
|
| 48 |
+
top_k: int = 4,
|
| 49 |
+
category_filter: str | None = None,
|
| 50 |
+
) -> list[dict[str, Any]]:
|
| 51 |
+
"""
|
| 52 |
+
Retrieve top-k products for *query*.
|
| 53 |
+
|
| 54 |
+
Pipeline:
|
| 55 |
+
1. Extract price ceiling from natural language (e.g. "under $50").
|
| 56 |
+
2. Pre-filter catalog by price / category if applicable.
|
| 57 |
+
3. Rank remaining items by cosine similarity.
|
| 58 |
+
4. Return top-k with scores.
|
| 59 |
+
"""
|
| 60 |
+
price_cap = self._extract_price_cap(query)
|
| 61 |
+
cat_hint = category_filter or self._extract_category_hint(query)
|
| 62 |
+
|
| 63 |
+
# Stage 1 — metadata pre-filter
|
| 64 |
+
candidate_indices = self._prefilter(price_cap, cat_hint)
|
| 65 |
+
|
| 66 |
+
# Stage 2 — semantic ranking over candidates
|
| 67 |
+
query_emb = self.model.encode([query], show_progress_bar=False)[0]
|
| 68 |
+
query_norm = np.linalg.norm(query_emb)
|
| 69 |
+
|
| 70 |
+
if len(candidate_indices) == 0:
|
| 71 |
+
# Fallback: rank entire catalog if filters yield nothing
|
| 72 |
+
candidate_indices = list(range(len(self.catalog)))
|
| 73 |
+
|
| 74 |
+
cand_embs = self.embeddings[candidate_indices]
|
| 75 |
+
cand_norms = self._norms[candidate_indices]
|
| 76 |
+
|
| 77 |
+
scores = np.dot(cand_embs, query_emb) / (cand_norms * query_norm + 1e-10)
|
| 78 |
+
top_local = np.argsort(scores)[::-1][:top_k]
|
| 79 |
+
|
| 80 |
+
results = []
|
| 81 |
+
for li in top_local:
|
| 82 |
+
global_idx = candidate_indices[li]
|
| 83 |
+
results.append({
|
| 84 |
+
"product": self.catalog[global_idx],
|
| 85 |
+
"score": float(scores[li]),
|
| 86 |
+
})
|
| 87 |
+
|
| 88 |
+
logger.debug(
|
| 89 |
+
"Query: %r | price_cap=%s | cat=%s | candidates=%d | top=%d",
|
| 90 |
+
query, price_cap, cat_hint, len(candidate_indices), len(results),
|
| 91 |
+
)
|
| 92 |
+
return results
|
| 93 |
+
|
| 94 |
+
# ── Private helpers ─────────────────────────────────────────────────────
|
| 95 |
+
|
| 96 |
+
@staticmethod
|
| 97 |
+
def _extract_price_cap(query: str) -> float | None:
|
| 98 |
+
"""Parse 'under $50', 'below 30', 'less than $25', 'budget' etc."""
|
| 99 |
+
patterns = [
|
| 100 |
+
r"under\s*\$?\s*(\d+(?:\.\d+)?)",
|
| 101 |
+
r"below\s*\$?\s*(\d+(?:\.\d+)?)",
|
| 102 |
+
r"less\s+than\s*\$?\s*(\d+(?:\.\d+)?)",
|
| 103 |
+
r"cheaper\s+than\s*\$?\s*(\d+(?:\.\d+)?)",
|
| 104 |
+
r"max(?:imum)?\s*\$?\s*(\d+(?:\.\d+)?)",
|
| 105 |
+
r"\$(\d+(?:\.\d+)?)\s*(?:or\s+less|max|budget)",
|
| 106 |
+
r"only\s+have\s*\$?\s*(\d+)",
|
| 107 |
+
r"(?:spend|budget)\s*(?:of|is)?\s*\$?\s*(\d+)",
|
| 108 |
+
]
|
| 109 |
+
for pat in patterns:
|
| 110 |
+
m = re.search(pat, query, re.IGNORECASE)
|
| 111 |
+
if m:
|
| 112 |
+
return float(m.group(1))
|
| 113 |
+
|
| 114 |
+
# Heuristic: very budget-oriented queries
|
| 115 |
+
budget_keywords = {"cheapest", "budget", "affordable", "inexpensive", "bargain"}
|
| 116 |
+
if any(kw in query.lower() for kw in budget_keywords):
|
| 117 |
+
return 50.0 # Reasonable default budget ceiling
|
| 118 |
+
|
| 119 |
+
return None
|
| 120 |
+
|
| 121 |
+
def _extract_category_hint(self, query: str) -> str | None:
|
| 122 |
+
"""Map common query terms to catalog categories."""
|
| 123 |
+
category_keywords: dict[str, list[str]] = {
|
| 124 |
+
"winter": ["winter", "cold", "snow", "warm", "insulated", "thermal"],
|
| 125 |
+
"summer": ["summer", "beach", "hot", "heat", "sun", "warm weather"],
|
| 126 |
+
"eco-friendly": ["eco", "sustainable", "organic", "recycled", "green", "environment", "plant-based"],
|
| 127 |
+
"sports": ["sport", "fitness", "running", "gym", "training", "workout", "athletic"],
|
| 128 |
+
"electronics": ["tech", "electronic", "gadget", "headphone", "speaker", "charger", "smart"],
|
| 129 |
+
"premium": ["luxury", "premium", "high-end", "designer", "artisan"],
|
| 130 |
+
"home": ["home", "kitchen", "desk", "candle", "bath", "decor"],
|
| 131 |
+
"casual": ["casual", "streetwear", "everyday", "hoodie", "sneaker", "jeans"],
|
| 132 |
+
}
|
| 133 |
+
q_lower = query.lower()
|
| 134 |
+
for cat, keywords in category_keywords.items():
|
| 135 |
+
if any(kw in q_lower for kw in keywords):
|
| 136 |
+
return cat
|
| 137 |
+
return None
|
| 138 |
+
|
| 139 |
+
def _prefilter(
|
| 140 |
+
self, price_cap: float | None, category: str | None
|
| 141 |
+
) -> list[int]:
|
| 142 |
+
"""Return indices of products matching hard constraints."""
|
| 143 |
+
indices = []
|
| 144 |
+
for i, p in enumerate(self.catalog):
|
| 145 |
+
if price_cap is not None and p["price"] > price_cap:
|
| 146 |
+
continue
|
| 147 |
+
if category is not None and p["category"] != category:
|
| 148 |
+
continue
|
| 149 |
+
indices.append(i)
|
| 150 |
+
return indices
|
requirements.txt
ADDED
|
@@ -0,0 +1,8 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
gradio>=4.0.0
|
| 2 |
+
transformers
|
| 3 |
+
torch
|
| 4 |
+
sentence-transformers
|
| 5 |
+
huggingface_hub
|
| 6 |
+
python-dotenv
|
| 7 |
+
plotly
|
| 8 |
+
numpy
|
tests/__init__.py
ADDED
|
@@ -0,0 +1 @@
|
|
|
|
|
|
|
| 1 |
+
# Tests package
|
tests/test_adaptation.py
ADDED
|
@@ -0,0 +1,51 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""
|
| 2 |
+
Unit tests for the self-healing adapter.
|
| 3 |
+
"""
|
| 4 |
+
|
| 5 |
+
import pytest
|
| 6 |
+
from modules.adaptation import Adapter
|
| 7 |
+
|
| 8 |
+
|
| 9 |
+
@pytest.fixture
|
| 10 |
+
def adapter():
|
| 11 |
+
return Adapter()
|
| 12 |
+
|
| 13 |
+
|
| 14 |
+
class TestAdapter:
|
| 15 |
+
"""Tests for the prompt adaptation engine."""
|
| 16 |
+
|
| 17 |
+
def test_normal_returns_base_prompt(self, adapter):
|
| 18 |
+
prompt = adapter.adapt_prompt("normal")
|
| 19 |
+
assert "RetailMind" in prompt
|
| 20 |
+
assert "ACTIVE ADAPTATION" not in prompt
|
| 21 |
+
|
| 22 |
+
def test_price_sensitive_injects_rules(self, adapter):
|
| 23 |
+
prompt = adapter.adapt_prompt("price_sensitive")
|
| 24 |
+
assert "PRICE SENSITIVITY" in prompt
|
| 25 |
+
assert "cheapest" in prompt.lower()
|
| 26 |
+
|
| 27 |
+
def test_summer_shift_injects_rules(self, adapter):
|
| 28 |
+
prompt = adapter.adapt_prompt("summer_shift")
|
| 29 |
+
assert "SEASONAL SHIFT" in prompt
|
| 30 |
+
assert "lightweight" in prompt.lower()
|
| 31 |
+
|
| 32 |
+
def test_eco_trend_injects_rules(self, adapter):
|
| 33 |
+
prompt = adapter.adapt_prompt("eco_trend")
|
| 34 |
+
assert "SUSTAINABILITY" in prompt
|
| 35 |
+
assert "recycled" in prompt.lower() or "organic" in prompt.lower()
|
| 36 |
+
|
| 37 |
+
def test_explanation_differs_per_state(self, adapter):
|
| 38 |
+
explanations = set()
|
| 39 |
+
for state in ["normal", "price_sensitive", "summer_shift", "eco_trend"]:
|
| 40 |
+
explanations.add(adapter.get_explanation(state))
|
| 41 |
+
assert len(explanations) == 4, "Each state should produce a unique explanation"
|
| 42 |
+
|
| 43 |
+
def test_label_differs_per_state(self, adapter):
|
| 44 |
+
labels = set()
|
| 45 |
+
for state in ["normal", "price_sensitive", "summer_shift", "eco_trend"]:
|
| 46 |
+
labels.add(adapter.get_label(state))
|
| 47 |
+
assert len(labels) == 4, "Each state should produce a unique label"
|
| 48 |
+
|
| 49 |
+
def test_base_prompt_contains_anti_hallucination(self, adapter):
|
| 50 |
+
prompt = adapter.adapt_prompt("normal")
|
| 51 |
+
assert "ONLY recommend" in prompt or "only recommend" in prompt.lower()
|
tests/test_catalog.py
ADDED
|
@@ -0,0 +1,50 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""
|
| 2 |
+
Unit tests for RetailMind core modules.
|
| 3 |
+
|
| 4 |
+
Run with: pytest tests/ -v
|
| 5 |
+
"""
|
| 6 |
+
|
| 7 |
+
import pytest
|
| 8 |
+
from modules.data_simulation import generate_catalog, get_scenarios
|
| 9 |
+
|
| 10 |
+
|
| 11 |
+
class TestCatalog:
|
| 12 |
+
"""Tests for the product catalog generator."""
|
| 13 |
+
|
| 14 |
+
def test_catalog_size(self):
|
| 15 |
+
catalog = generate_catalog()
|
| 16 |
+
assert len(catalog) == 200, f"Expected 200 products, got {len(catalog)}"
|
| 17 |
+
|
| 18 |
+
def test_product_has_required_fields(self):
|
| 19 |
+
catalog = generate_catalog()
|
| 20 |
+
required = {"id", "title", "category", "price", "desc", "tags", "rating", "reviews", "materials"}
|
| 21 |
+
for p in catalog[:5]:
|
| 22 |
+
missing = required - set(p.keys())
|
| 23 |
+
assert not missing, f"Product {p['id']} missing fields: {missing}"
|
| 24 |
+
|
| 25 |
+
def test_prices_are_positive(self):
|
| 26 |
+
catalog = generate_catalog()
|
| 27 |
+
for p in catalog:
|
| 28 |
+
assert p["price"] > 0, f"Product {p['id']} has non-positive price: {p['price']}"
|
| 29 |
+
|
| 30 |
+
def test_ratings_in_range(self):
|
| 31 |
+
catalog = generate_catalog()
|
| 32 |
+
for p in catalog:
|
| 33 |
+
assert 1.0 <= p["rating"] <= 5.0, f"Product {p['id']} has invalid rating: {p['rating']}"
|
| 34 |
+
|
| 35 |
+
def test_categories_are_valid(self):
|
| 36 |
+
valid = {"winter", "summer", "eco-friendly", "sports", "electronics", "premium", "home", "casual"}
|
| 37 |
+
catalog = generate_catalog()
|
| 38 |
+
for p in catalog:
|
| 39 |
+
assert p["category"] in valid, f"Invalid category: {p['category']}"
|
| 40 |
+
|
| 41 |
+
def test_unique_ids(self):
|
| 42 |
+
catalog = generate_catalog()
|
| 43 |
+
ids = [p["id"] for p in catalog]
|
| 44 |
+
assert len(ids) == len(set(ids)), "Duplicate product IDs found"
|
| 45 |
+
|
| 46 |
+
def test_scenarios_not_empty(self):
|
| 47 |
+
scenarios = get_scenarios()
|
| 48 |
+
assert len(scenarios) >= 4, "Expected at least 4 scenario phases"
|
| 49 |
+
for name, queries in scenarios.items():
|
| 50 |
+
assert len(queries) >= 3, f"Scenario '{name}' has too few queries"
|
tests/test_drift.py
ADDED
|
@@ -0,0 +1,59 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""
|
| 2 |
+
Unit tests for the drift detection module.
|
| 3 |
+
"""
|
| 4 |
+
|
| 5 |
+
import pytest
|
| 6 |
+
from modules.drift import DriftDetector
|
| 7 |
+
|
| 8 |
+
|
| 9 |
+
@pytest.fixture
|
| 10 |
+
def detector():
|
| 11 |
+
return DriftDetector()
|
| 12 |
+
|
| 13 |
+
|
| 14 |
+
class TestDriftDetector:
|
| 15 |
+
"""Tests for semantic drift detection."""
|
| 16 |
+
|
| 17 |
+
def test_normal_query_no_drift(self, detector):
|
| 18 |
+
drift, scores = detector.analyze_drift("I need a good water bottle.")
|
| 19 |
+
assert drift == "normal", f"Expected 'normal', got '{drift}'"
|
| 20 |
+
assert all(isinstance(v, float) for v in scores.values())
|
| 21 |
+
|
| 22 |
+
def test_price_sensitive_detection(self, detector):
|
| 23 |
+
# Feed multiple budget-oriented queries to build up EWMA
|
| 24 |
+
for q in ["cheapest option", "budget under $20", "show me the cheapest"]:
|
| 25 |
+
drift, _ = detector.analyze_drift(q)
|
| 26 |
+
assert drift == "price_sensitive", f"Expected 'price_sensitive' after budget queries, got '{drift}'"
|
| 27 |
+
|
| 28 |
+
def test_eco_trend_detection(self, detector):
|
| 29 |
+
for q in ["sustainable organic products", "eco-friendly recycled", "I want plant-based items"]:
|
| 30 |
+
drift, _ = detector.analyze_drift(q)
|
| 31 |
+
assert drift == "eco_trend", f"Expected 'eco_trend' after eco queries, got '{drift}'"
|
| 32 |
+
|
| 33 |
+
def test_summer_shift_detection(self, detector):
|
| 34 |
+
for q in ["summer beach sandals", "hot weather lightweight", "UV protection for sun"]:
|
| 35 |
+
drift, _ = detector.analyze_drift(q)
|
| 36 |
+
assert drift == "summer_shift", f"Expected 'summer_shift' after summer queries, got '{drift}'"
|
| 37 |
+
|
| 38 |
+
def test_scores_have_all_concepts(self, detector):
|
| 39 |
+
_, scores = detector.analyze_drift("test query")
|
| 40 |
+
expected = {"price_sensitive", "summer_shift", "eco_trend"}
|
| 41 |
+
assert set(scores.keys()) == expected
|
| 42 |
+
|
| 43 |
+
def test_history_accumulates(self, detector):
|
| 44 |
+
for i in range(5):
|
| 45 |
+
detector.analyze_drift(f"query {i}")
|
| 46 |
+
assert len(detector.history) == 5
|
| 47 |
+
|
| 48 |
+
def test_ewma_scores_available(self, detector):
|
| 49 |
+
detector.analyze_drift("some query")
|
| 50 |
+
ewma = detector.get_ewma_scores()
|
| 51 |
+
assert isinstance(ewma, dict)
|
| 52 |
+
assert len(ewma) == 3
|
| 53 |
+
|
| 54 |
+
def test_history_series_length_matches(self, detector):
|
| 55 |
+
for i in range(10):
|
| 56 |
+
detector.analyze_drift(f"query {i}")
|
| 57 |
+
series = detector.get_history_series()
|
| 58 |
+
for concept, data in series.items():
|
| 59 |
+
assert len(data) == 10, f"{concept} series length {len(data)} != 10"
|
tests/test_retrieval.py
ADDED
|
@@ -0,0 +1,60 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""
|
| 2 |
+
Unit tests for hybrid retrieval engine.
|
| 3 |
+
"""
|
| 4 |
+
|
| 5 |
+
import pytest
|
| 6 |
+
from modules.data_simulation import generate_catalog
|
| 7 |
+
from modules.retrieval import HybridRetriever
|
| 8 |
+
|
| 9 |
+
|
| 10 |
+
@pytest.fixture(scope="module")
|
| 11 |
+
def retriever():
|
| 12 |
+
catalog = generate_catalog()
|
| 13 |
+
return HybridRetriever(catalog)
|
| 14 |
+
|
| 15 |
+
|
| 16 |
+
class TestHybridRetriever:
|
| 17 |
+
"""Tests for the hybrid retrieval system."""
|
| 18 |
+
|
| 19 |
+
def test_returns_correct_count(self, retriever):
|
| 20 |
+
results = retriever.search("running shoes", top_k=4)
|
| 21 |
+
assert len(results) == 4
|
| 22 |
+
|
| 23 |
+
def test_results_have_scores(self, retriever):
|
| 24 |
+
results = retriever.search("water bottle")
|
| 25 |
+
for r in results:
|
| 26 |
+
assert "score" in r
|
| 27 |
+
assert "product" in r
|
| 28 |
+
assert 0.0 <= r["score"] <= 1.0
|
| 29 |
+
|
| 30 |
+
def test_price_filtering_under_30(self, retriever):
|
| 31 |
+
results = retriever.search("shoes under $30", top_k=4)
|
| 32 |
+
for r in results:
|
| 33 |
+
assert r["product"]["price"] <= 30.0, (
|
| 34 |
+
f"Product '{r['product']['title']}' costs ${r['product']['price']} "
|
| 35 |
+
f"but should be under $30"
|
| 36 |
+
)
|
| 37 |
+
|
| 38 |
+
def test_price_filtering_under_50(self, retriever):
|
| 39 |
+
results = retriever.search("I only have $50 to spend", top_k=4)
|
| 40 |
+
for r in results:
|
| 41 |
+
assert r["product"]["price"] <= 50.0
|
| 42 |
+
|
| 43 |
+
def test_eco_category_relevance(self, retriever):
|
| 44 |
+
results = retriever.search("eco-friendly sustainable products", top_k=4)
|
| 45 |
+
eco_count = sum(1 for r in results if r["product"]["category"] == "eco-friendly")
|
| 46 |
+
assert eco_count >= 2, f"Expected ≥2 eco products, got {eco_count}"
|
| 47 |
+
|
| 48 |
+
def test_winter_category_relevance(self, retriever):
|
| 49 |
+
results = retriever.search("warm winter jacket for cold weather", top_k=4)
|
| 50 |
+
winter_count = sum(1 for r in results if r["product"]["category"] == "winter")
|
| 51 |
+
assert winter_count >= 2, f"Expected ≥2 winter products, got {winter_count}"
|
| 52 |
+
|
| 53 |
+
def test_results_sorted_by_score(self, retriever):
|
| 54 |
+
results = retriever.search("fitness watch with GPS", top_k=4)
|
| 55 |
+
scores = [r["score"] for r in results]
|
| 56 |
+
assert scores == sorted(scores, reverse=True), "Results not sorted by score"
|
| 57 |
+
|
| 58 |
+
def test_empty_query_returns_results(self, retriever):
|
| 59 |
+
results = retriever.search("", top_k=4)
|
| 60 |
+
assert len(results) == 4 # Should gracefully handle empty queries
|