Commit 0ed43fe
Initial commit
- .gitignore +9 -0
- README.md +93 -0
- app.py +860 -0
- requirements.txt +5 -0
- task2_segmentation.py +240 -0
- task3_4_routing.py +333 -0
- task5_forecasting.py +137 -0
.gitignore
ADDED
@@ -0,0 +1,9 @@
output/
__pycache__/
*.pyc
.env
*.png
*.docx
*.pdf
.claude/
EcoCart_Report.docx
README.md
ADDED
@@ -0,0 +1,93 @@
# EcoCart AI System

An interactive AI-powered logistics simulation

🚀 **Live Demo:** [Launch on Streamlit](https://esvanth-ecocart-ai.streamlit.app)

---

## What is EcoCart?

EcoCart is a mid-sized e-commerce company facing challenges in optimising its logistics network. This project proposes an AI-based solution across five tasks — from intelligent delivery agents to demand forecasting.

---

## Tasks Covered

### Task 1 — AI Agents
Demonstrates three types of AI agents navigating a delivery map in real time:
- **Reactive Agent** — goes to the nearest stop, no planning
- **Goal-Based Agent** — plans the full route before departing (2-opt optimised)
- **Utility-Based Agent** — balances urgency vs distance to prioritise high-value stops

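The reactive and utility-based strategies above can both be read as greedy choices with different scoring rules. The sketch below illustrates that idea with a made-up set of stops (the names, coordinates, and priorities are assumptions, not the app's data), and it omits the 2-opt planning step of the goal-based agent.

```python
import math

# Hypothetical stops: name -> (x, y, priority). Illustrative only.
STOPS = {
    "Depot": (0.0, 0.0, 0),
    "A": (1.0, 0.0, 1),
    "B": (3.0, 0.0, 5),
    "C": (0.0, 2.0, 2),
}

def dist(a, b):
    """Straight-line distance between two named stops."""
    ax, ay, _ = STOPS[a]
    bx, by, _ = STOPS[b]
    return math.hypot(ax - bx, ay - by)

def greedy_route(score):
    """Build a Depot-to-Depot route, taking the best-scoring unvisited stop each step."""
    route, current = ["Depot"], "Depot"
    pending = [s for s in STOPS if s != "Depot"]
    while pending:
        nxt = max(pending, key=lambda s: score(current, s))
        route.append(nxt)
        pending.remove(nxt)
        current = nxt
    return route + ["Depot"]

# Reactive: the closest stop scores highest (negative distance).
reactive = greedy_route(lambda cur, s: -dist(cur, s))

# Utility-based: priority per unit distance; the 0.1 offset avoids division by zero.
utility = greedy_route(lambda cur, s: STOPS[s][2] / (dist(cur, s) + 0.1))

print(reactive)  # ['Depot', 'A', 'B', 'C', 'Depot']
print(utility)   # ['Depot', 'B', 'C', 'A', 'Depot']
```

With this toy data the utility agent drives past the nearby low-priority stop to reach the priority-5 stop first, which is the trade-off the list describes.
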
### Task 2 — Bias Detection & Mitigation
Uses K-Means clustering to segment customers into value tiers. Detects urban/rural bias using **Disparate Impact (DI)** analysis and applies a three-step mitigation strategy:
- Oversample rural customers to balance the dataset
- Adjust spend for delivery cost premium (+€12)
- Adjust frequency for rural order batching (×1.5)

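As a rough illustration of the DI check, the ratio compares how often each group lands in the favourable segment. The sketch below assumes plain dictionaries with `region` and `segment` keys and a toy customer list; the helper name and data are hypothetical. A ratio below about 0.8 is a commonly cited warning level (the "four-fifths rule"); the threshold this project uses is not stated here.

```python
def disparate_impact(records, protected="rural", reference="urban",
                     favourable="High Value"):
    """Ratio of favourable-outcome rates: protected group / reference group."""
    def rate(group):
        members = [r for r in records if r["region"] == group]
        if not members:
            return 0.0
        return sum(r["segment"] == favourable for r in members) / len(members)

    ref_rate = rate(reference)
    return rate(protected) / ref_rate if ref_rate else 0.0

# Toy data: 2 of 4 urban and 1 of 4 rural customers are "High Value".
customers = (
    [{"region": "urban", "segment": "High Value"}] * 2
    + [{"region": "urban", "segment": "Medium"}] * 2
    + [{"region": "rural", "segment": "High Value"}] * 1
    + [{"region": "rural", "segment": "Low Value"}] * 3
)

di = disparate_impact(customers)
print(di)  # 0.5
```
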
### Task 3 — Search Algorithms for Route Optimisation
Implements four search algorithms on a 20-node urban/rural delivery network:
- **BFS** — Breadth-First Search
- **DFS** — Depth-First Search
- **A\*** — Best-first with Euclidean heuristic
- **IDA\*** — Iterative Deepening A\*

Includes a live **exploration replay slider** — drag to watch the algorithm search node by node.

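For readers new to A\*, here is a minimal sketch of best-first search with a Euclidean heuristic on a four-node toy graph. The coordinates and edges are invented for illustration and are not the project's 20-node network.

```python
import heapq
import math

# Hypothetical toy graph: node -> (x, y), plus undirected edges.
COORDS = {"A": (0, 0), "B": (1, 0), "C": (1, 1), "D": (2, 1)}
LINKS = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "C")]

def euclid(a, b):
    """Straight-line distance, used as both edge weight and heuristic."""
    (ax, ay), (bx, by) = COORDS[a], COORDS[b]
    return math.hypot(ax - bx, ay - by)

ADJ = {n: [] for n in COORDS}
for a, b in LINKS:
    w = euclid(a, b)
    ADJ[a].append((b, w))
    ADJ[b].append((a, w))

def astar(start, goal):
    """Return (path, cost); expands nodes in order of g + h."""
    heap = [(euclid(start, goal), 0.0, start, [start])]
    best = {start: 0.0}
    while heap:
        _, g, node, path = heapq.heappop(heap)
        if node == goal:
            return path, g
        if g > best.get(node, math.inf):
            continue  # stale queue entry; a cheaper route was found already
        for nb, w in ADJ[node]:
            ng = g + w
            if ng < best.get(nb, math.inf):
                best[nb] = ng
                heapq.heappush(heap, (ng + euclid(nb, goal), ng, nb, path + [nb]))
    return None, math.inf

path, cost = astar("A", "D")
print(path)  # ['A', 'C', 'D']
```

Because every edge weight here is at least the straight-line distance, the heuristic never overestimates, so the returned path is the cheapest one.
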
### Task 4 — A* vs IDA* Comparative Analysis
Benchmarks both algorithms on 10 origin-destination pairs (5 urban, 5 rural) over multiple timing runs. Compares nodes expanded, average time, and memory behaviour.

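A timing comparison of this kind can be sketched with `time.perf_counter`, averaging several runs per origin-destination pair. The `benchmark` helper and the placeholder search function below are assumptions made for illustration; they are not the project's benchmarking code.

```python
import time
from statistics import mean

def benchmark(search_fn, pairs, runs=5):
    """Return the mean wall-clock time in milliseconds for each (origin, destination) pair."""
    results = {}
    for pair in pairs:
        samples = []
        for _ in range(runs):
            t0 = time.perf_counter()
            search_fn(*pair)
            samples.append((time.perf_counter() - t0) * 1000.0)
        results[pair] = mean(samples)
    return results

def dummy_search(origin, destination):
    """Placeholder standing in for A* or IDA*; it just burns a little CPU time."""
    return sum(range(1000))

timings = benchmark(dummy_search, [("U1", "U10"), ("R1", "R9")], runs=3)
for pair, ms in timings.items():
    print(pair, f"{ms:.4f} ms")
```

Averaging over repeated runs smooths out scheduler noise, which matters when individual searches finish in well under a millisecond.
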
### Task 5 — Demand Forecasting
Trains two ML models on 730 days of synthetic sales data:
- **Linear Regression** — fast and interpretable
- **Random Forest** — captures non-linear seasonal patterns

Features a **what-if predictor** — enter any day, month, and promotion flag to get an instant sales prediction.

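Forecasting models of this kind usually rely on lagged and rolling-average features; `app.py` in this commit builds such columns (for example `lag_1`, `lag_7`, `roll_7`). The pure-Python sketch below shows the idea on a short made-up series; the function name, window size, and data are illustrative assumptions.

```python
def lag_features(sales, lags=(1, 7), window=3):
    """Build one feature row per day from past values only (no look-ahead)."""
    rows = []
    start = max(max(lags), window)  # first day with a full history
    for t in range(start, len(sales)):
        row = {f"lag_{lag}": sales[t - lag] for lag in lags}
        past = sales[t - window:t]  # the window ends the day before t
        row[f"roll_{window}"] = sum(past) / window
        row["target"] = sales[t]
        rows.append(row)
    return rows

sales = [10, 12, 11, 13, 15, 14, 16, 18, 17, 19]
rows = lag_features(sales)
print(rows[0])  # {'lag_1': 16, 'lag_7': 10, 'roll_3': 15.0, 'target': 18}
```

Keeping the rolling window strictly in the past avoids leaking the value being predicted into its own features.
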
---

## Tech Stack

| Tool | Purpose |
|------|---------|
| Python 3.11 | Core language |
| Streamlit | Interactive web app |
| Plotly | Interactive charts |
| scikit-learn | K-Means, LR, Random Forest |
| NumPy / Pandas | Data processing |

---

## Run Locally

```bash
git clone https://github.com/Esvanth/Ecocart-AI.git
cd Ecocart-AI
pip install -r requirements.txt
streamlit run app.py
```

---

## Project Structure

```
Ecocart-AI/
├── app.py                  # Main Streamlit app (all 5 tasks)
├── task2_segmentation.py   # Standalone Task 2 script
├── task3_4_routing.py      # Standalone Tasks 3 & 4 script
├── task5_forecasting.py    # Standalone Task 5 script
├── requirements.txt        # Python dependencies
└── README.md
```

---

## Author

**Esvanth Mohankumar**
Student ID: 24311073
Programme: MSc Artificial Intelligence
Institution: National College of Ireland
Module: Foundations of AI
app.py
ADDED
@@ -0,0 +1,860 @@
"""
EcoCart AI System — TABA Section II
NCI MSCAI 2026
"""

import math, heapq, time
from collections import deque

import numpy as np
import pandas as pd
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import streamlit as st
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# ── page ──────────────────────────────────────────────────────────────────────
st.set_page_config(page_title="EcoCart AI", layout="wide",
                   initial_sidebar_state="collapsed")

st.markdown("""
<style>
[data-testid="stAppViewContainer"] { background:#f0f4f8; }
[data-testid="stHeader"] { background:transparent; }
.block-container { padding:1rem 2rem 3rem; }
.stTabs [data-baseweb="tab-list"] { background:#fff; border-radius:12px;
    padding:4px; box-shadow:0 1px 4px rgba(0,0,0,.08); }
.stTabs [data-baseweb="tab"] { font-size:.88rem; font-weight:600;
    border-radius:8px; padding:8px 20px; }
div[data-testid="metric-container"]{ background:#fff; border-radius:10px;
    padding:14px 18px;
    box-shadow:0 1px 4px rgba(0,0,0,.07); }
.card { background:#fff; border-radius:14px; padding:20px 24px;
    box-shadow:0 1px 5px rgba(0,0,0,.08); margin-bottom:14px; }
.badge-green { display:inline-block; background:#d1fae5; color:#065f46;
    border-radius:99px; padding:3px 12px; font-size:.78rem;
    font-weight:700; }
.badge-red { display:inline-block; background:#fee2e2; color:#991b1b;
    border-radius:99px; padding:3px 12px; font-size:.78rem;
    font-weight:700; }
.badge-blue { display:inline-block; background:#dbeafe; color:#1e40af;
    border-radius:99px; padding:3px 12px; font-size:.78rem;
    font-weight:700; }
.tip { background:#f8fafc; border:1px solid #e2e8f0; border-radius:8px;
    padding:10px 14px; font-size:.82rem; color:#475569; margin:8px 0; }
.section-label { font-size:.72rem; font-weight:700; letter-spacing:.08em;
    color:#94a3b8; text-transform:uppercase; margin-bottom:4px; }
</style>
""", unsafe_allow_html=True)

# ── colours ───────────────────────────────────────────────────────────────────
BG,SURF,LINE = "#f0f4f8","#ffffff","#e2e8f0"
FG,MUTE = "#1e293b","#64748b"
GREEN,BLUE,RED,AMBER,PURPLE = "#10b981","#3b82f6","#ef4444","#f59e0b","#8b5cf6"

SEG_COL={"High Value":GREEN,"Medium":AMBER,"Low Value":RED,"Group 4":PURPLE}

def _ch(h=380,title=""):
    return dict(height=h,paper_bgcolor=SURF,plot_bgcolor=BG,
                font=dict(color=FG,size=11),
                title=dict(text=title,font=dict(size=13,color=FG),x=0),
                margin=dict(l=50,r=20,t=48,b=40),
                legend=dict(bgcolor=SURF,bordercolor=LINE,borderwidth=1))

def _xax(**k): return dict(gridcolor=LINE,zeroline=False,linecolor=LINE,**k)
def _yax(**k): return dict(gridcolor=LINE,zeroline=False,linecolor=LINE,**k)

# ══════════════════════════════════════════════════════════════════════════════
# NETWORK DATA
# ══════════════════════════════════════════════════════════════════════════════
NODES={
    "U1":(1.0,1.0,"urban"), "U2":(2.0,1.5,"urban"), "U3":(3.0,1.0,"urban"),
    "U4":(1.5,2.5,"urban"), "U5":(2.5,3.0,"urban"), "U6":(3.5,2.0,"urban"),
    "U7":(1.0,3.5,"urban"), "U8":(2.0,4.0,"urban"), "U9":(3.0,4.0,"urban"),
    "U10":(4.0,3.5,"urban"),
    "R1":(6.0,1.0,"rural"), "R2":(8.0,2.0,"rural"), "R3":(10.0,1.5,"rural"),
    "R4":(7.0,4.0,"rural"), "R5":(9.0,4.5,"rural"), "R6":(11.0,3.5,"rural"),
    "R7":(6.5,6.0,"rural"), "R8":(9.0,7.0,"rural"), "R9":(11.0,6.0,"rural"),
    "R10":(8.0,5.5,"rural"),
}
_EP=[("U1","U2"),("U2","U3"),("U1","U4"),("U2","U4"),("U2","U5"),
     ("U3","U6"),("U4","U5"),("U5","U6"),("U4","U7"),("U5","U8"),
     ("U6","U10"),("U7","U8"),("U8","U9"),("U9","U10"),("U5","U9"),
     ("R1","R2"),("R2","R3"),("R1","R4"),("R2","R4"),("R3","R6"),
     ("R4","R5"),("R5","R6"),("R4","R7"),("R5","R10"),("R7","R10"),
     ("R7","R8"),("R8","R9"),("R6","R9"),("R8","R10"),("R5","R8"),
     ("U3","R1"),("U10","R4"),("U6","R1"),("U9","R7")]

def _nd(a,b): return math.hypot(NODES[a][0]-NODES[b][0],NODES[a][1]-NODES[b][1])
def _cr(a,b):
    za,zb=NODES[a][2],NODES[b][2]
    return 0.28 if za==zb=="urban" else 0.18 if za!=zb else 0.10

EDGES =[(a,b,round(_nd(a,b)*1.15,2)) for a,b in _EP]
CO2_EDGES=[(a,b,round(_nd(a,b)*1.15*_cr(a,b),3)) for a,b in _EP]
ADJ_KM={n:[] for n in NODES}; ADJ_CO2={n:[] for n in NODES}
for i,(a,b,w) in enumerate(EDGES):
    ADJ_KM[a].append((b,w)); ADJ_KM[b].append((a,w))
    c=CO2_EDGES[i][2]; ADJ_CO2[a].append((b,c)); ADJ_CO2[b].append((a,c))

def _ew(a,b,adj):
    for nb,w in adj[a]:
        if nb==b: return w
    return math.inf

# ── algorithms (return path, cost, exploration_order) ─────────────────────────
def bfs(s,g,adj):
    q=deque([(s,[s])]); seen={s}; expl=[]
    while q:
        n,p=q.popleft(); expl.append(n)
        if n==g:
            return p,round(sum(_ew(p[i],p[i+1],adj) for i in range(len(p)-1)),2),expl
        for nb,_ in adj[n]:
            if nb not in seen: seen.add(nb); q.append((nb,p+[nb]))
    return None,0.0,expl

def dfs(s,g,adj):
    stack=[(s,[s])]; seen={s}; expl=[]
    while stack:
        n,p=stack.pop(); expl.append(n)
        if n==g:
            return p,round(sum(_ew(p[i],p[i+1],adj) for i in range(len(p)-1)),2),expl
        if len(p)>=50: continue  # depth cap
        for nb,_ in adj[n]:
            if nb not in seen: seen.add(nb); stack.append((nb,p+[nb]))
    return None,0.0,expl

def astar(s,g,adj):
    # h is the straight-line distance to the goal. It never overestimates on
    # ADJ_KM (edge = 1.15 x distance) but can overestimate on ADJ_CO2, whose
    # weights are scaled below the straight-line distance.
    ctr=0; h=lambda n:_nd(n,g); expl=[]
    heap=[(h(s),0.0,ctr,s,[s])]; best={s:0.0}
    while heap:
        _,gc,_,n,p=heapq.heappop(heap)
        if n==g: return p,round(gc,2),expl
        if gc>best.get(n,math.inf): continue  # stale heap entry
        expl.append(n)
        for nb,w in adj[n]:
            ng=gc+w
            if ng<best.get(nb,math.inf):
                best[nb]=ng; ctr+=1
                heapq.heappush(heap,(ng+h(nb),ng,ctr,nb,p+[nb]))
    return None,0.0,expl

def ida_star(s,g,adj):
    expl=[]; h=lambda n:_nd(n,g)
    def _dfs(n,gc,bound,path,vis):
        f=gc+h(n)
        if f>bound: return None,f
        expl.append(n)
        if n==g: return list(path),gc
        nxt=math.inf
        for nb,w in adj[n]:
            if nb in vis: continue
            vis.add(nb); path.append(nb)
            r,t=_dfs(nb,gc+w,bound,path,vis)
            if r is not None: return r,t
            if t<nxt: nxt=t
            path.pop(); vis.remove(nb)
        return None,nxt
    bound=h(s)
    while True:
        r,t=_dfs(s,0.0,bound,[s],{s})
        if r is not None: return r,round(t,2),expl
        if t==math.inf: return None,0.0,expl
        bound=t  # raise the bound to the smallest f that exceeded it

ALGOS={"BFS":bfs,"DFS":dfs,"A*":astar,"IDA*":ida_star}

# ── network figure builder ────────────────────────────────────────────────────
def build_network(sn,en,path,explored_so_far,adj,unit,algo_name):
    pc=GREEN if unit=="CO2" else AMBER
    path_set=set(path) if path else set()
    fig=go.Figure()

    # edges
    for a,b,w in EDGES:
        on_path=(a in path_set and b in path_set and
                 any((path[i]==a and path[i+1]==b) or
                     (path[i]==b and path[i+1]==a)
                     for i in range(len(path)-1)) if path else False)
        lc=pc if on_path else "#dde3ed"
        lw=5 if on_path else 1.5
        co2w=_ew(a,b,ADJ_CO2)
        fig.add_trace(go.Scatter(
            x=[NODES[a][0],NODES[b][0],None],y=[NODES[a][1],NODES[b][1],None],
            mode="lines",line=dict(color=lc,width=lw),
            showlegend=False,hoverinfo="skip"))

    # nodes
    for zone,bc in [("urban","#ef4444"),("rural",GREEN)]:
        ns=[(n,d) for n,d in NODES.items() if d[2]==zone]
        cols,sizes=[],[]
        for n,_ in ns:
            if n==sn: cols.append("#fff"); sizes.append(28)
            elif n==en: cols.append("#facc15"); sizes.append(28)
            elif n in path_set: cols.append(pc); sizes.append(22)
            elif n in explored_so_far: cols.append("#bfdbfe"); sizes.append(18)
            else: cols.append(bc); sizes.append(18)
        fig.add_trace(go.Scatter(
            x=[d[0] for _,d in ns],y=[d[1] for _,d in ns],
            mode="markers+text",name=zone.title(),
            marker=dict(size=sizes,color=cols,line=dict(color=FG,width=1.5)),
            text=[n for n,_ in ns],textposition="middle center",
            textfont=dict(size=8,color=FG,family="monospace"),
            hovertemplate="<b>%{text}</b><br>"+zone+"<extra></extra>"))

    title=(f"{algo_name}: {sn} → {en} | "
           f"{'Explored '+str(len(explored_so_far))+' nodes' if explored_so_far else 'Ready'}")
    fig.update_layout(**_ch(480,title))
    fig.update_layout(legend=dict(bgcolor=SURF,bordercolor=LINE,x=0.01,y=0.99))
    fig.update_xaxes(showgrid=False,showticklabels=False,zeroline=False)
    fig.update_yaxes(showgrid=False,showticklabels=False,zeroline=False)
    return fig

# ══════════════════════════════════════════════════════════════════════════════
# AGENT SIMULATION
# ══════════════════════════════════════════════════════════════════════════════
STOPS={
    "Depot": (0.0,0.0,0), "Shop A":(2.0,3.0,3), "Shop B":(5.0,1.0,4),
    "Shop C":(7.0,4.0,2), "Shop D":(3.0,6.0,5), "Shop E":(8.0,7.0,1),
    "Shop F":(1.0,8.0,3), "Shop G":(6.0,9.0,4), "Shop H":(9.0,2.0,2),
}
def _sd(a,b): ax,ay,_=STOPS[a]; bx,by,_=STOPS[b]; return math.hypot(ax-bx,ay-by)

def _reactive():
    r=["Depot"]; u=[k for k in STOPS if k!="Depot"]; cur="Depot"
    while u: nb=min(u,key=lambda n:_sd(cur,n)); r.append(nb); u.remove(nb); cur=nb
    return r+["Depot"]

def _goal():
    r=_reactive()[:-1]
    td=lambda x:sum(_sd(x[i],x[i+1]) for i in range(len(x)-1))+_sd(x[-1],x[0])
    ok=True
    while ok:
        ok=False
        for i in range(1,len(r)-1):
            for j in range(i+1,len(r)):
                nr=r[:i]+r[i:j+1][::-1]+r[j+1:]
                if td(nr)<td(r)-1e-9: r=nr; ok=True
    return r+["Depot"]

def _utility():
    r=["Depot"]; u=[k for k in STOPS if k!="Depot"]; cur="Depot"
    while u:
        nb=max(u,key=lambda n:STOPS[n][2]/(_sd(cur,n)+.1))
        r.append(nb); u.remove(nb); cur=nb
    return r+["Depot"]

ROUTES={"Nearest stop":_reactive(),"Planned route":_goal(),"Priority first":_utility()}
AGENT_COL={"Nearest stop":BLUE,"Planned route":GREEN,"Priority first":AMBER}
AGENT_DESC={
    "Nearest stop": "Reactive agent — goes to the closest unvisited stop. Simple and fast, no planning.",
    "Planned route": "Goal-based agent — computes the shortest full route before departing.",
    "Priority first":"Utility-based agent — balances urgency vs distance. Starred stops are served first.",
}

def _route_km(r): return round(sum(_sd(r[i],r[i+1]) for i in range(len(r)-1)),2)

def draw_agent(route,step,ac):
    visited=set(route[:step+1]); pso=route[:step+1]
    km=sum(_sd(pso[i],pso[i+1]) for i in range(len(pso)-1))
    cur=route[step]
    fig=go.Figure()
    for na in STOPS:
        for nb in STOPS:
            if na>=nb: continue
            x1,y1,_=STOPS[na]; x2,y2,_=STOPS[nb]
            if math.hypot(x1-x2,y1-y2)<5.5:
                fig.add_trace(go.Scatter(x=[x1,x2,None],y=[y1,y2,None],mode="lines",
                    line=dict(color="#e2e8f0",width=1),showlegend=False,hoverinfo="skip"))
    if len(pso)>1:
        fig.add_trace(go.Scatter(
            x=[STOPS[n][0] for n in pso],y=[STOPS[n][1] for n in pso],
            mode="lines+markers",line=dict(color=ac,width=3),
            marker=dict(size=6,color=ac),showlegend=False,hoverinfo="skip"))
    for name,(nx,ny,pri) in STOPS.items():
        if name=="Depot": nc,sz,sym="#3b82f6",26,"square"
        elif name==cur: nc,sz,sym=ac,28,"circle"
        elif name in visited: nc,sz,sym=GREEN,18,"circle"
        else: nc,sz,sym="#cbd5e1",18,"circle"
        label=("⭐" if pri>=4 else "")+" "+name.replace("Shop ","")
        fig.add_trace(go.Scatter(x=[nx],y=[ny],mode="markers+text",showlegend=False,
            marker=dict(size=sz,color=nc,line=dict(color="#fff",width=2)),
            text=[label.strip()],textposition="top center",textfont=dict(size=9,color=FG),
            hovertemplate=f"<b>{name}</b><br>Priority {pri}/5<br>{'✓ Visited' if name in visited else 'Pending'}<extra></extra>"))
    fig.update_layout(**_ch(400,f"Step {step}/{len(route)-1} — {km:.1f} km so far"))
    fig.update_xaxes(showgrid=False,showticklabels=False,zeroline=False,range=[-0.5,10.5])
    fig.update_yaxes(showgrid=False,showticklabels=False,zeroline=False,range=[-0.5,10.5])
    return fig, round(km,2)

# ══════════════════════════════════════════════════════════════════════════════
# SEGMENTATION
# ══════════════════════════════════════════════════════════════════════════════
@st.cache_data
def _customers(nu,nr):
    rng=np.random.default_rng(42)
    u=pd.DataFrame({"freq":rng.normal(6,2,nu).clip(.5),"spend":rng.normal(120,40,nu).clip(10),
                    "recency":rng.exponential(10,nu).clip(1,90),"region":"urban"})
    r=pd.DataFrame({"freq":rng.normal(3,1.5,nr).clip(.5),"spend":rng.normal(65,30,nr).clip(10),
                    "recency":rng.exponential(15,nr).clip(1,90),"region":"rural"})
    return pd.concat([u,r],ignore_index=True).round(1)

def _kmeans(df,k):
    X=StandardScaler().fit_transform(df[["freq","spend","recency"]])
    df=df.copy(); df["cluster"]=KMeans(n_clusters=k,random_state=42,n_init=10).fit_predict(X)
    order=df.groupby("cluster")["spend"].mean().sort_values(ascending=False).index
    names=(["High Value","Medium","Low Value","Group 4"])[:k]
    df["segment"]=df["cluster"].map({order[i]:names[i] for i in range(k)})
    return df

def _di(df):
    u=(df[df.region=="urban"].segment=="High Value").mean()
    r=(df[df.region=="rural"].segment=="High Value").mean()
    return round(u*100,1),round(r*100,1),round(r/u if u else 0,3)

@st.cache_data
def _fix(nu,nr,k):
    df=_customers(nu,nr)
    bal=pd.concat([df[df.region=="urban"],
                   df[df.region=="rural"].sample(len(df[df.region=="urban"]),replace=True,random_state=42)],
                  ignore_index=True).copy()
    bal.loc[bal.region=="rural","spend"]+=12
    bal.loc[bal.region=="rural","freq"]*=1.5
    bal=_kmeans(bal,k)
    rm=bal.region=="rural"; um=bal.region=="urban"
    need=int((bal[um].segment=="High Value").mean()*.85*rm.sum())-(bal[rm].segment=="High Value").sum()
    if need>0:
        cands=bal[rm&(bal.segment!="High Value")]
        bal.loc[cands.nlargest(min(need,len(cands)),"spend").index,"segment"]="High Value"
    return bal

# ══════════════════════════════════════════════════════════════════════════════
# FORECASTING
# ══════════════════════════════════════════════════════════════════════════════
@st.cache_data
def _sales():
    rng=np.random.default_rng(42); days=730
    t=np.arange(days); dates=pd.date_range("2023-01-01",periods=days,freq="D")
    promo=np.zeros(days); promo[rng.choice(days,int(days*.06),replace=False)]=rng.uniform(30,70,int(days*.06))
    sales=np.clip(100+.05*t+25*np.sin(2*np.pi*t/7)+40*np.sin(2*np.pi*t/365)+rng.normal(0,8,days)+promo,0,None)
    df=pd.DataFrame({"date":dates,"sales":sales,"dow":dates.dayofweek,"month":dates.month,
                     "day_of_year":dates.dayofyear,"is_promo":(promo>0).astype(int)})
    for l in [1,7,14]: df[f"lag_{l}"]=df["sales"].shift(l)
    df["roll_7"]=df["sales"].shift(1).rolling(7).mean()
    df["roll_30"]=df["sales"].shift(1).rolling(30).mean()
    return df.dropna().reset_index(drop=True)

FEATS=["dow","month","day_of_year","is_promo","lag_1","lag_7","lag_14","roll_7","roll_30"]
FEAT_LABELS={"lag_7":"Sales 7 days ago","lag_1":"Yesterday's sales","lag_14":"Sales 14 days ago",
             "roll_7":"7-day average","roll_30":"30-day average","is_promo":"Promotion active",
             "day_of_year":"Day of year","month":"Month","dow":"Day of week"}

@st.cache_data
def _train(tp,ne):
    df=_sales(); sp=int(len(df)*tp/100); tr,te=df.iloc[:sp],df.iloc[sp:]
    lr=LinearRegression().fit(tr[FEATS],tr["sales"])
    rf=RandomForestRegressor(n_estimators=ne,max_depth=12,min_samples_leaf=3,
                             random_state=42,n_jobs=-1).fit(tr[FEATS],tr["sales"])
    lp=lr.predict(te[FEATS]); rp=rf.predict(te[FEATS])
    return lr,rf,te,lp,rp,rf.feature_importances_

def _met(y,yh):
    return (round(mean_absolute_error(y,yh),1),
            round(mean_squared_error(y,yh)**.5,1),
            round(r2_score(y,yh),3),
            round(np.mean(np.abs((y-yh)/np.where(y==0,1,y)))*100,1))

# ══════════════════════════════════════════════════════════════════════════════
# HEADER
# ══════════════════════════════════════════════════════════════════════════════
st.markdown("<h2 style='margin:0 0 12px;color:#1e293b'>🛒 EcoCart AI System</h2>",
            unsafe_allow_html=True)

T1,T2,T3,T4,T5=st.tabs([
    "🤖 Task 1 — AI Agents",
    "⚖️ Task 2 — Bias Check",
    "🗺️ Task 3 — Route Finder",
    "📊 Task 4 — Speed Test",
    "📈 Task 5 — Sales Forecast",
])

| 384 |
+
# ══════════════════════════════════════════════════════════════════════════════
|
| 385 |
+
# TASK 1
|
| 386 |
+
# ══════════════════════════════════════════════════════════════════════════════
|
| 387 |
+
with T1:
|
| 388 |
+
st.markdown("### Watch the AI delivery agent navigate in real time")
|
| 389 |
+
st.caption("Three different AI strategies — pick one and press Play to watch it move stop by stop.")
|
| 390 |
+
|
| 391 |
+
# ── agent picker ──────────────────────────────────────────────────────────
|
| 392 |
+
a_cols=st.columns(3)
|
| 393 |
+
agent_names=list(ROUTES.keys())
|
| 394 |
+
if "agent" not in st.session_state: st.session_state.agent="Nearest stop"
|
| 395 |
+
|
| 396 |
+
for i,(col,name) in enumerate(zip(a_cols,agent_names)):
|
| 397 |
+
km=_route_km(ROUTES[name])
|
| 398 |
+
active=st.session_state.agent==name
|
| 399 |
+
border=f"3px solid {AGENT_COL[name]}" if active else "2px solid #e2e8f0"
|
| 400 |
+
bg=f"{AGENT_COL[name]}12" if active else "#fff"
|
| 401 |
+
if col.button(f"{'✓ ' if active else ''}{name} ({km} km)",
|
| 402 |
+
key=f"ab_{name}",use_container_width=True):
|
| 403 |
+
st.session_state.agent=name
|
| 404 |
+
st.session_state.stp=0
|
| 405 |
+
st.session_state.playing=False
|
| 406 |
+
|
| 407 |
+
agent=st.session_state.agent
|
| 408 |
+
ac=AGENT_COL[agent]
|
| 409 |
+
route=ROUTES[agent]; mx=len(route)-1
|
| 410 |
+
|
| 411 |
+
# ── playback controls ─────────────────────────────────────────────────────
|
| 412 |
+
ctl=st.columns([1,1,1,1,3])
|
| 413 |
+
if ctl[0].button("⏮ Start"):
|
| 414 |
+
st.session_state.stp=0; st.session_state.playing=False
|
| 415 |
+
if ctl[1].button("◀ Back") and st.session_state.get("stp",0)>0:
|
| 416 |
+
st.session_state.stp-=1; st.session_state.playing=False
|
| 417 |
+
if ctl[2].button("▶ Next") and st.session_state.get("stp",0)<mx:
|
| 418 |
+
st.session_state.stp+=1; st.session_state.playing=False
|
| 419 |
+
playing=st.session_state.get("playing",False)
|
| 420 |
+
if ctl[3].button("⏸ Pause" if playing else "▶ Play"):
|
| 421 |
+
st.session_state.playing=not playing
|
| 422 |
+
|
| 423 |
+
speed=ctl[4].slider("Speed",1,8,3,label_visibility="collapsed",
|
| 424 |
+
help="Animation speed (steps per second)")
|
| 425 |
+
|
| 426 |
+
stp=st.session_state.get("stp",0)
|
| 427 |
+
|
| 428 |
+
fig_agent,km_done=draw_agent(route,stp,ac)
|
| 429 |
+
|
| 430 |
+
# ── map + stats ───────────────────────────────────────────────────────────
|
| 431 |
+
map_c,stat_c=st.columns([3,1])
|
| 432 |
+
with map_c:
|
| 433 |
+
st.plotly_chart(fig_agent,use_container_width=True,key="agent_map")
|
| 434 |
+
|
| 435 |
+
with stat_c:
|
| 436 |
+
st.markdown(f"<div class='section-label'>Current status</div>",unsafe_allow_html=True)
|
| 437 |
+
st.metric("Stops completed",f"{stp} / {mx}")
|
| 438 |
+
st.metric("Distance covered",f"{km_done} km")
|
| 439 |
+
psum=sum(STOPS[n][2] for n in route[:stp+1] if n!="Depot")
|
| 440 |
+
st.metric("Priority points served",psum)
|
| 441 |
+
st.markdown(" ")
|
| 442 |
+
st.markdown(f"<div class='tip'>{AGENT_DESC[agent]}</div>",unsafe_allow_html=True)
|
| 443 |
+
st.markdown("<div class='section-label' style='margin-top:12px'>All agents</div>",unsafe_allow_html=True)
|
| 444 |
+
for nm in agent_names:
|
| 445 |
+
km=_route_km(ROUTES[nm]); c=AGENT_COL[nm]
|
| 446 |
+
hi=next((i for i,n in enumerate(ROUTES[nm]) if n!="Depot" and STOPS[n][2]>=4),"-")
|
| 447 |
+
st.markdown(
|
| 448 |
+
f"<div style='border-left:3px solid {c};padding:6px 10px;"
|
| 449 |
+
f"margin:4px 0;background:{'#f8fafc' if nm!=agent else c+'12'};border-radius:0 6px 6px 0'>"
|
| 450 |
+
f"<b style='font-size:.82rem'>{nm}</b> "
|
| 451 |
+
f"<span style='color:{MUTE};font-size:.78rem'>{km} km · 1st star: step {hi}</span>"
|
| 452 |
+
f"</div>",unsafe_allow_html=True)
|
| 453 |
+
|
| 454 |
+
# ── auto-play ─────────────────────────────────────────────────────────────
|
| 455 |
+
if st.session_state.get("playing") and stp<mx:
|
| 456 |
+
time.sleep(1.0/speed)
|
| 457 |
+
st.session_state.stp=stp+1
|
| 458 |
+
st.rerun()
|
| 459 |
+
elif st.session_state.get("playing") and stp>=mx:
|
| 460 |
+
st.session_state.playing=False
|
| 461 |
+
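The auto-play block advances one stop per rerun and switches playback off at the last stop. A minimal sketch of that transition as a pure function, kept separate from Streamlit so it can be checked in isolation; the name `advance` is hypothetical and does not appear in app.py.

```python
from typing import Tuple


def advance(stp: int, mx: int, playing: bool) -> Tuple[int, bool]:
    """Return the next (step, playing) pair for one auto-play tick."""
    if playing and stp < mx:
        return stp + 1, True   # move to the next stop and keep playing
    if playing and stp >= mx:
        return stp, False      # last stop reached: switch playback off
    return stp, playing        # paused: nothing changes


# Walk a 3-leg route from the depot until playback stops itself.
state = (0, True)
while state[1]:
    state = advance(state[0], 3, state[1])
print(state)
```

Keeping the rule in one function would let the time-based rerun loop stay a thin wrapper around it.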
|
| 462 |
+
# ══════════════════════════════════════════════════════════════════════════════
|
| 463 |
+
# TASK 2
|
| 464 |
+
# ══════════════════════════════════════════════════════════════════════════════
|
| 465 |
+
with T2:
|
| 466 |
+
st.markdown("### Are rural customers being treated fairly by the AI?")
|
| 467 |
+
st.caption("Adjust the sliders and watch the fairness score update instantly.")
|
| 468 |
+
|
| 469 |
+
ctrl,main=st.columns([1,3])
|
| 470 |
+
with ctrl:
|
| 471 |
+
nu=st.slider("Urban customers",100,500,300,50)
|
| 472 |
+
nr=st.slider("Rural customers",30,200,100,10)
|
| 473 |
+
k=st.slider("Groups (K-Means)",2,4,3,1)
|
| 474 |
+
fix=st.toggle("Apply fairness fix",True)
|
| 475 |
+
st.markdown(" ")
|
| 476 |
+
if fix:
|
| 477 |
+
st.markdown("""
|
| 478 |
+
<div class='tip'>
|
| 479 |
+
<b>What the fix does:</b><br><br>
|
| 480 |
+
• Rural customers pay ~€12 more per delivery — we add this back to their spend score<br>
|
| 481 |
+
• Rural customers batch orders (less frequent, bigger baskets) — we adjust their frequency<br>
|
| 482 |
+
• We balance the dataset so rural customers are equally represented during training
|
| 483 |
+
</div>""",unsafe_allow_html=True)
|
| 484 |
+
else:
|
| 485 |
+
st.markdown("""
|
| 486 |
+
<div class='tip'>
|
| 487 |
+
<b>Why bias happens:</b><br><br>
|
| 488 |
+
EcoCart launched in cities first. Urban customers have more data and appear to spend more on the surface.
|
| 489 |
+
The AI picks up this pattern and unfairly labels rural customers as low-value.
|
| 490 |
+
</div>""",unsafe_allow_html=True)
|
| 491 |
+
|
| 492 |
+
with main:
|
| 493 |
+
raw=_customers(nu,nr); seg_b=_kmeans(raw,k); ub,rb,dib=_di(seg_b)
|
| 494 |
+
if fix: seg_a=_fix(nu,nr,k); ua,ra,dia=_di(seg_a)
|
| 495 |
+
|
| 496 |
+
# ── big fairness indicator ────────────────────────────────────────────
|
| 497 |
+
mc=st.columns(4)
|
| 498 |
+
        mc[0].metric("Urban in High Value",f"{ua if fix else ub}%")
|
| 499 |
+
        mc[1].metric("Rural in High Value",f"{ra if fix else rb}%")
|
| 500 |
+
di_val=dia if fix else dib
|
| 501 |
+
di_delta=f"{dia-dib:+.2f}" if fix else None
|
| 502 |
+
mc[2].metric("Fairness score",f"{di_val:.2f}",delta=di_delta,
|
| 503 |
+
help="1.0 = perfectly equal. Aim: ≥ 0.80")
|
| 504 |
+
status="FAIR" if di_val>=0.8 else "NOT FAIR"
|
| 505 |
+
mc[3].markdown(
|
| 506 |
+
f"<div style='background:#fff;border-radius:10px;padding:14px 18px;"
|
| 507 |
+
f"box-shadow:0 1px 4px rgba(0,0,0,.07);text-align:center'>"
|
| 508 |
+
f"<div style='font-size:.8rem;color:{MUTE}'>Status</div>"
|
| 509 |
+
f"<div class='badge-{'green' if di_val>=.8 else 'red'}' "
|
| 510 |
+
f"style='font-size:.95rem;margin-top:6px'>{status}</div></div>",
|
| 511 |
+
unsafe_allow_html=True)
|
| 512 |
+
|
| 513 |
+
if di_val>=0.8: st.success(f"Fairness achieved — score {di_val:.2f} is above the 0.80 threshold.")
|
| 514 |
+
else: st.error(f"Score {di_val:.2f} is below 0.80 — rural customers are under-served.")
|
| 515 |
+
|
| 516 |
+
# ── scatter ───────────────────────────────────────────────────────────
|
| 517 |
+
def _scatter(df,title):
|
| 518 |
+
fig=go.Figure()
|
| 519 |
+
for seg in ["High Value","Medium","Low Value","Group 4"]:
|
| 520 |
+
if seg not in df.segment.values: continue
|
| 521 |
+
for region,sym in [("urban","circle"),("rural","triangle-up")]:
|
| 522 |
+
sub=df[(df.segment==seg)&(df.region==region)]
|
| 523 |
+
if sub.empty: continue
|
| 524 |
+
fig.add_trace(go.Scatter(x=sub.freq,y=sub.spend,mode="markers",
|
| 525 |
+
marker=dict(color=SEG_COL.get(seg,"#94a3b8"),symbol=sym,size=7,opacity=.72),
|
| 526 |
+
name=f"{seg} / {region}",
|
| 527 |
+
hovertemplate="<b>"+seg+"</b> ("+region+")<br>Purchases: %{x:.1f}/month<br>Avg spend: €%{y:.0f}<extra></extra>"))
|
| 528 |
+
fig.update_layout(**_ch(320,title))
|
| 529 |
+
fig.update_xaxes(**_xax(title="Purchases per month"))
|
| 530 |
+
fig.update_yaxes(**_yax(title="Average spend (€)"))
|
| 531 |
+
return fig
|
| 532 |
+
|
| 533 |
+
if fix:
|
| 534 |
+
c1,c2=st.columns(2)
|
| 535 |
+
c1.plotly_chart(_scatter(seg_b,"Before fix — biased"),use_container_width=True)
|
| 536 |
+
c2.plotly_chart(_scatter(seg_a,"After fix — fair"),use_container_width=True)
|
| 537 |
+
else:
|
| 538 |
+
st.plotly_chart(_scatter(seg_b,"Customer groups (no fix)"),use_container_width=True)
|
| 539 |
+
|
| 540 |
+
# ── bar chart ─────────────────────────────────────────────────────────
|
| 541 |
+
fig2=go.Figure()
|
| 542 |
+
fig2.add_trace(go.Bar(name="Before fix",x=["Urban → High Value","Rural → High Value"],
|
| 543 |
+
y=[ub,rb],marker_color=RED,
|
| 544 |
+
text=[f"{ub}%",f"{rb}%"],textposition="outside",textfont_color=FG))
|
| 545 |
+
if fix:
|
| 546 |
+
fig2.add_trace(go.Bar(name="After fix",x=["Urban → High Value","Rural → High Value"],
|
| 547 |
+
y=[ua,ra],marker_color=GREEN,
|
| 548 |
+
text=[f"{ua}%",f"{ra}%"],textposition="outside",textfont_color=FG))
|
| 549 |
+
fig2.update_layout(**_ch(260,"Percentage in High Value group"),barmode="group")
|
| 550 |
+
fig2.update_xaxes(**_xax()); fig2.update_yaxes(**_yax(title="%",range=[0,110]))
|
| 551 |
+
fig2.add_hline(y=min(ub,ua if fix else ub),line_color="#94a3b8",line_dash="dot",
|
| 552 |
+
annotation_text="Urban rate",annotation_font_color=MUTE)
|
| 553 |
+
st.plotly_chart(fig2,use_container_width=True)
|
| 554 |
+
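The fairness score shown in this tab is a disparate-impact ratio checked against the 0.80 threshold. A small worked sketch of that arithmetic, assuming `_di` follows the same rule as `compute_fairness` in task2_segmentation.py (rural High Value rate divided by urban High Value rate); the function names here are illustrative only.

```python
def disparate_impact(urban_high_pct: float, rural_high_pct: float) -> float:
    """Ratio of rural to urban High Value rates; 1.0 means equal rates."""
    if urban_high_pct <= 0:
        return 0.0  # same guard as compute_fairness: avoid dividing by zero
    return rural_high_pct / urban_high_pct


def fairness_status(di: float, threshold: float = 0.8) -> str:
    """Map the ratio onto the FAIR / NOT FAIR badge."""
    return "FAIR" if di >= threshold else "NOT FAIR"


# 40% of urban but only 10% of rural customers land in High Value.
biased = disparate_impact(40.0, 10.0)
# After a fix, 34% of rural customers land in High Value.
fixed = disparate_impact(40.0, 34.0)
print(round(biased, 2), fairness_status(biased))
print(round(fixed, 2), fairness_status(fixed))
```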
|
| 555 |
+
# ══════════════════════════════════════════════════════════════════════════════
|
| 556 |
+
# TASK 3
|
| 557 |
+
# ══════════════════════════════════════════════════════════════════════════════
|
| 558 |
+
with T3:
|
| 559 |
+
st.markdown("### Watch the AI find the delivery route in real time")
|
| 560 |
+
st.caption("Pick start and end points, choose an algorithm, then replay how it explores the network step by step.")
|
| 561 |
+
|
| 562 |
+
ctrl3,map3=st.columns([1,3])
|
| 563 |
+
with ctrl3:
|
| 564 |
+
all_n=list(NODES.keys())
|
| 565 |
+
sn=st.selectbox("Start node",all_n,index=0)
|
| 566 |
+
en=st.selectbox("End node", all_n,index=19)
|
| 567 |
+
al=st.radio("Algorithm",["BFS","DFS","A*","IDA*"],index=2,
|
| 568 |
+
captions=["Level-by-level","Deep dive","Guided (best)","Memory-efficient"])
|
| 569 |
+
gr=st.toggle("Minimise CO₂ (not distance)",False)
|
| 570 |
+
st.divider()
|
| 571 |
+
adj=ADJ_CO2 if gr else ADJ_KM
|
| 572 |
+
unit="CO2" if gr else "km"
|
| 573 |
+
|
| 574 |
+
if sn==en:
|
| 575 |
+
st.warning("Choose different start and end."); path,cost,expl=[],0,[]; ms=0
|
| 576 |
+
else:
|
| 577 |
+
t0=time.perf_counter()
|
| 578 |
+
path,cost,expl=ALGOS[al](sn,en,adj)
|
| 579 |
+
ms=round((time.perf_counter()-t0)*1000,3)
|
| 580 |
+
if path:
|
| 581 |
+
                st.metric("Route CO₂" if gr else "Route distance",f"{cost} {'km' if unit=='km' else 'kg CO₂'}")
|
| 582 |
+
st.metric("Nodes the AI checked",len(expl),help="The fewer the better — the AI was more efficient")
|
| 583 |
+
st.metric("Time taken",f"{ms} ms")
|
| 584 |
+
st.markdown(
|
| 585 |
+
f"<div class='tip'><b>Route:</b> {' → '.join(path)}</div>",
|
| 586 |
+
unsafe_allow_html=True)
|
| 587 |
+
else:
|
| 588 |
+
st.error("No route found."); path=[]; expl=[]
|
| 589 |
+
|
| 590 |
+
with map3:
|
| 591 |
+
# ── exploration replay slider ─────────────────────────────────────────
|
| 592 |
+
if expl:
|
| 593 |
+
replay=st.slider(
|
| 594 |
+
"🔍 Replay: drag to see how the AI explored the map",
|
| 595 |
+
0,len(expl),len(expl),
|
| 596 |
+
help="0 = no exploration shown, max = full path found")
|
| 597 |
+
explored_so_far=set(expl[:replay])
|
| 598 |
+
pct=int(replay/len(expl)*100) if expl else 100
|
| 599 |
+
st.markdown(
|
| 600 |
+
f"<div style='font-size:.82rem;color:{MUTE};margin-bottom:4px'>"
|
| 601 |
+
f"<span class='badge-blue'>{replay}/{len(expl)} nodes explored ({pct}%)</span>"
|
| 602 |
+
f"{' <span class=badge-green>Route found</span>' if replay==len(expl) and path else ''}"
|
| 603 |
+
f"</div>",unsafe_allow_html=True)
|
| 604 |
+
else:
|
| 605 |
+
explored_so_far=set()
|
| 606 |
+
|
| 607 |
+
fig_net=build_network(sn,en,path,explored_so_far,adj,unit,al)
|
| 608 |
+
st.plotly_chart(fig_net,use_container_width=True)
|
| 609 |
+
|
| 610 |
+
# colour legend
|
| 611 |
+
leg=st.columns(5)
|
| 612 |
+
leg[0].markdown(f"<div style='font-size:.78rem'>⬤ <span style='color:{RED}'>Urban node</span></div>",unsafe_allow_html=True)
|
| 613 |
+
leg[1].markdown(f"<div style='font-size:.78rem'>⬤ <span style='color:{GREEN}'>Rural node</span></div>",unsafe_allow_html=True)
|
| 614 |
+
leg[2].markdown(f"<div style='font-size:.78rem'>⬤ <span style='color:#bfdbfe'>Explored</span></div>",unsafe_allow_html=True)
|
| 615 |
+
leg[3].markdown(f"<div style='font-size:.78rem'>⬤ <span style='color:{AMBER}'>On path</span></div>",unsafe_allow_html=True)
|
| 616 |
+
leg[4].markdown(f"<div style='font-size:.78rem'>⬤ <span style='color:#fff;background:{FG};padding:1px 4px;border-radius:3px'>Start</span> / <span style='color:{FG};background:#facc15;padding:1px 4px;border-radius:3px'>End</span></div>",unsafe_allow_html=True)
|
| 617 |
+
|
| 618 |
+
# ── side-by-side comparison ───────────────────────────────────────────────
|
| 619 |
+
with st.expander("Compare all 4 algorithms on this route"):
|
| 620 |
+
if sn!=en:
|
| 621 |
+
rows=[]
|
| 622 |
+
for nm in ["BFS","DFS","A*","IDA*"]:
|
| 623 |
+
t0=time.perf_counter(); p,c,e=ALGOS[nm](sn,en,adj); ms2=(time.perf_counter()-t0)*1000
|
| 624 |
+
rows.append({"Algorithm":nm,
|
| 625 |
+
f"Distance ({'km' if unit=='km' else 'CO₂'})":round(c,2) if p else "N/A",
|
| 626 |
+
"Nodes checked":len(e),"Time (ms)":round(ms2,3),
|
| 627 |
+
                             "Finds shortest?":nm in ["A*","IDA*"]})  # BFS minimises hops, not weighted km/CO₂
|
| 628 |
+
df_c=pd.DataFrame(rows)
|
| 629 |
+
st.dataframe(df_c,use_container_width=True,hide_index=True)
|
| 630 |
+
|
| 631 |
+
fc=make_subplots(rows=1,cols=2,subplot_titles=["Nodes checked (fewer = smarter)","Time (ms)"])
|
| 632 |
+
pal=[BLUE,RED,GREEN,PURPLE]
|
| 633 |
+
for col,ci in [("Nodes checked",1),("Time (ms)",2)]:
|
| 634 |
+
fc.add_trace(go.Bar(x=df_c["Algorithm"],y=df_c[col],marker_color=pal,
|
| 635 |
+
text=df_c[col],textposition="outside",textfont_color=FG,
|
| 636 |
+
showlegend=False),row=1,col=ci)
|
| 637 |
+
fc.update_layout(paper_bgcolor=SURF,plot_bgcolor=BG,font_color=FG,height=280,
|
| 638 |
+
margin=dict(l=40,r=20,t=50,b=30))
|
| 639 |
+
fc.update_xaxes(gridcolor=LINE); fc.update_yaxes(gridcolor=LINE)
|
| 640 |
+
st.plotly_chart(fc,use_container_width=True)
|
| 641 |
+
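The search functions in `ALGOS` are called as `fn(start, goal, adj)` and return `(path, cost, explored)`. The sketch below shows one way an A* with that return shape could look. It is not the repository's implementation: the adjacency format `{node: {neighbour: cost}}`, the optional heuristic `h`, and the name `astar_sketch` are all assumptions, and the closed set relies on `h` being consistent.

```python
import heapq
from typing import Callable, Dict, List, Optional, Tuple

Graph = Dict[str, Dict[str, float]]  # assumed format, not taken from app.py


def astar_sketch(start: str, goal: str, adj: Graph,
                 h: Optional[Callable[[str], float]] = None
                 ) -> Tuple[List[str], float, List[str]]:
    h = h or (lambda n: 0.0)               # h = 0 reduces A* to Dijkstra
    frontier = [(h(start), 0.0, start)]    # entries are (f = g + h, g, node)
    best_g = {start: 0.0}
    parent: Dict[str, Optional[str]] = {start: None}
    explored: List[str] = []
    closed = set()
    while frontier:
        _, g, node = heapq.heappop(frontier)
        if node in closed:
            continue                       # stale queue entry
        closed.add(node)
        explored.append(node)
        if node == goal:
            path: List[str] = []
            cur: Optional[str] = node
            while cur is not None:         # walk parents back to the start
                path.append(cur)
                cur = parent[cur]
            return path[::-1], g, explored
        for nxt, w in adj.get(node, {}).items():
            ng = g + w
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                parent[nxt] = node
                heapq.heappush(frontier, (ng + h(nxt), ng, nxt))
    return [], 0.0, explored               # no route found


# Toy network: the direct A-D road is one hop but costs more than A-B-C-D.
demo = {"A": {"B": 2, "D": 10}, "B": {"A": 2, "C": 2},
        "C": {"B": 2, "D": 2}, "D": {"A": 10, "C": 2}}
path, cost, explored = astar_sketch("A", "D", demo)
print(path, cost, len(explored))
```

On this toy network a hop-counting search would take the direct A to D edge (cost 10), while the weighted search returns the three-hop route (cost 6), which is why fewest hops and shortest distance can disagree.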
|
| 642 |
+
# ══════════════════════════════════════════════════════════════════════════════
|
| 643 |
+
# TASK 4
|
| 644 |
+
# ══════════════════════════════════════════════════════════════════════════════
|
| 645 |
+
with T4:
|
| 646 |
+
st.markdown("### Head-to-head: A* vs IDA* on real delivery routes")
|
| 647 |
+
st.caption("We run both algorithms on 10 routes and measure speed and efficiency. Results appear as they complete.")
|
| 648 |
+
|
| 649 |
+
c1,c2=st.columns([1,3])
|
| 650 |
+
with c1:
|
| 651 |
+
nruns=st.slider("Timing runs per route",5,30,20,5)
|
| 652 |
+
go_btn=st.button("▶ Run the test",type="primary",use_container_width=True)
|
| 653 |
+
st.markdown("""
|
| 654 |
+
<div class='tip'>
|
| 655 |
+
<b>A*</b> keeps an open list in memory — very fast to find a path, but uses more RAM.<br><br>
|
| 656 |
+
            <b>IDA*</b> uses very little memory: it repeats a depth-first search, raising the cost limit each time. Slower here, but its memory use stays small on huge networks.
|
| 657 |
+
</div>""",unsafe_allow_html=True)
|
| 658 |
+
|
| 659 |
+
with c2:
|
| 660 |
+
OD_U=[("U1","U10"),("U7","U6"),("U2","U9"),("U1","U9"),("U3","U8")]
|
| 661 |
+
OD_R=[("R1","R9"),("R2","R8"),("R3","R10"),("R1","R6"),("R4","R9")]
|
| 662 |
+
|
| 663 |
+
if go_btn:
|
| 664 |
+
rows=[]; chart_ph=st.empty(); prog=st.progress(0); status_ph=st.empty()
|
| 665 |
+
total=(len(OD_U)+len(OD_R))*2; done=0
|
| 666 |
+
|
| 667 |
+
for zone,pairs in [("Urban",OD_U),("Rural",OD_R)]:
|
| 668 |
+
for s,g in pairs:
|
| 669 |
+
for nm,fn in [("A*",astar),("IDA*",ida_star)]:
|
| 670 |
+
times=[]
|
| 671 |
+
p=c3=None; e=[]
|
| 672 |
+
for _ in range(nruns):
|
| 673 |
+
t0=time.perf_counter(); p,c3,e=fn(s,g,ADJ_KM)
|
| 674 |
+
times.append((time.perf_counter()-t0)*1000)
|
| 675 |
+
rows.append({"Zone":zone,"Route":f"{s}→{g}","Algorithm":nm,
|
| 676 |
+
"Distance (km)":c3,"Nodes checked":len(e),
|
| 677 |
+
"Avg time (ms)":round(sum(times)/len(times),3)})
|
| 678 |
+
done+=1; prog.progress(done/total)
|
| 679 |
+
status_ph.markdown(
|
| 680 |
+
f"<span class='badge-blue'>Testing {s}→{g} with {nm}...</span>",
|
| 681 |
+
unsafe_allow_html=True)
|
| 682 |
+
|
| 683 |
+
# live chart update
|
| 684 |
+
if len(rows)>=2:
|
| 685 |
+
df_live=pd.DataFrame(rows)
|
| 686 |
+
sm=df_live.groupby(["Zone","Algorithm"])[["Nodes checked","Avg time (ms)"]].mean().reset_index()
|
| 687 |
+
fl=make_subplots(rows=1,cols=2,
|
| 688 |
+
subplot_titles=["Avg nodes checked","Avg time (ms)"])
|
| 689 |
+
for anm,acl in [("A*",BLUE),("IDA*",PURPLE)]:
|
| 690 |
+
sub=sm[sm.Algorithm==anm]
|
| 691 |
+
if sub.empty: continue
|
| 692 |
+
for key,ci in [("Nodes checked",1),("Avg time (ms)",2)]:
|
| 693 |
+
fl.add_trace(go.Bar(name=anm,x=sub["Zone"],y=sub[key].round(2),
|
| 694 |
+
marker_color=acl,showlegend=(ci==1),
|
| 695 |
+
text=sub[key].round(2),textposition="outside",
|
| 696 |
+
textfont_color=FG),row=1,col=ci)
|
| 697 |
+
fl.update_layout(paper_bgcolor=SURF,plot_bgcolor=BG,font_color=FG,
|
| 698 |
+
barmode="group",height=320,
|
| 699 |
+
margin=dict(l=40,r=20,t=50,b=30),
|
| 700 |
+
legend=dict(bgcolor=SURF,bordercolor=LINE))
|
| 701 |
+
fl.update_xaxes(gridcolor=LINE); fl.update_yaxes(gridcolor=LINE)
|
| 702 |
+
chart_ph.plotly_chart(fl,use_container_width=True)
|
| 703 |
+
|
| 704 |
+
prog.empty(); status_ph.empty()
|
| 705 |
+
df_b=pd.DataFrame(rows)
|
| 706 |
+
st.dataframe(df_b,use_container_width=True,hide_index=True)
|
| 707 |
+
|
| 708 |
+
ae=df_b[df_b.Algorithm=="A*"]["Nodes checked"].mean()
|
| 709 |
+
ie=df_b[df_b.Algorithm=="IDA*"]["Nodes checked"].mean()
|
| 710 |
+
at=df_b[df_b.Algorithm=="A*"]["Avg time (ms)"].mean()
|
| 711 |
+
it=df_b[df_b.Algorithm=="IDA*"]["Avg time (ms)"].mean()
|
| 712 |
+
winner="A*" if at<it else "IDA*"
|
| 713 |
+
st.success(
|
| 714 |
+
f"**Result:** A* checked {ae:.0f} nodes on average vs IDA*'s {ie:.0f}. "
|
| 715 |
+
f"**{winner}** was faster on this map ({at:.3f} ms vs {it:.3f} ms). "
|
| 716 |
+
                f"On a national road network with millions of junctions, IDA*'s low memory use can make it the more practical choice when RAM is the limit.")
|
| 717 |
+
else:
|
| 718 |
+
st.info("Click **▶ Run the test** — the chart will build live as results come in.")
|
| 719 |
+
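The benchmark calls `ida_star(s, g, ADJ_KM)` but its body is not shown in this part of the diff. As a rough illustration of the idea in the tip text, the sketch below repeats a depth-first search and raises the cost limit to the smallest value that exceeded the previous one. The adjacency format, the optional heuristic `h`, and the name `ida_star_sketch` are assumptions; `explored` counts every expansion, including repeats across iterations.

```python
from typing import Callable, Dict, List, Optional, Tuple

Graph = Dict[str, Dict[str, float]]  # assumed format, not taken from app.py


def ida_star_sketch(start: str, goal: str, adj: Graph,
                    h: Optional[Callable[[str], float]] = None
                    ) -> Tuple[List[str], float, List[str]]:
    h = h or (lambda n: 0.0)
    explored: List[str] = []
    path = [start]                       # only the current path is stored

    def search(g: float, bound: float) -> Tuple[float, bool]:
        node = path[-1]
        f = g + h(node)
        if f > bound:
            return f, False              # report the f that broke the limit
        explored.append(node)
        if node == goal:
            return g, True
        smallest = float("inf")
        for nxt, w in adj.get(node, {}).items():
            if nxt in path:
                continue                 # avoid cycles on the current path
            path.append(nxt)
            t, found = search(g + w, bound)
            if found:
                return t, True
            smallest = min(smallest, t)
            path.pop()
        return smallest, False

    bound = h(start)
    while True:
        t, found = search(0.0, bound)
        if found:
            return list(path), t, explored
        if t == float("inf"):
            return [], 0.0, explored     # nothing left to deepen into
        bound = t                        # raise the limit and search again


demo = {"A": {"B": 2, "D": 10}, "B": {"A": 2, "C": 2},
        "C": {"B": 2, "D": 2}, "D": {"A": 10, "C": 2}}
ida_path, ida_cost, ida_explored = ida_star_sketch("A", "D", demo)
print(ida_path, ida_cost, len(ida_explored))
```

The repeated expansions are the price of the small memory footprint: only the current path is held, never an open list.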
|
| 720 |
+
# ══════════════════════════════════════════════════════════════════════════════
|
| 721 |
+
# TASK 5
|
| 722 |
+
# ══════════════════════════════════════════════════════════════════════════════
|
| 723 |
+
with T5:
|
| 724 |
+
st.markdown("### Predicting EcoCart's daily sales with machine learning")
|
| 725 |
+
st.caption("Two models trained on 2 years of data. Adjust settings and the chart updates instantly.")
|
| 726 |
+
|
| 727 |
+
ctrl5,main5=st.columns([1,3])
|
| 728 |
+
with ctrl5:
|
| 729 |
+
tp=st.slider("Training data",60,90,80,5,format="%d%%")
|
| 730 |
+
ne=st.slider("Random Forest trees",50,300,200,50)
|
| 731 |
+
show5=st.radio("Show",["Both","Linear Regression","Random Forest"])
|
| 732 |
+
st.divider()
|
| 733 |
+
st.markdown("<div class='section-label'>Try your own prediction</div>",unsafe_allow_html=True)
|
| 734 |
+
st.markdown("<div class='tip'>Set values for any day and see what the model predicts.</div>",
|
| 735 |
+
unsafe_allow_html=True)
|
| 736 |
+
wi_dow=st.selectbox("Day of week",["Mon","Tue","Wed","Thu","Fri","Sat","Sun"],index=4)
|
| 737 |
+
wi_month=st.selectbox("Month",["Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec"],index=0)
|
| 738 |
+
wi_promo=st.toggle("Promotion running today?",False)
|
| 739 |
+
wi_lag1=st.number_input("Yesterday's sales",min_value=50,max_value=300,value=120,step=5)
|
| 740 |
+
wi_lag7=st.number_input("Sales 7 days ago", min_value=50,max_value=300,value=115,step=5)
|
| 741 |
+
|
| 742 |
+
with main5:
|
| 743 |
+
with st.spinner("Training models…"):
|
| 744 |
+
lr_o,rf_o,te_df,lp,rp,imps=_train(tp,ne)
|
| 745 |
+
|
| 746 |
+
y=te_df["sales"].values; dates=te_df["date"].values
|
| 747 |
+
|
| 748 |
+
lmae,lrmse,lr2,lmape=_met(y,lp)
|
| 749 |
+
rmae,rrmse,rr2,rmape=_met(y,rp)
|
| 750 |
+
|
| 751 |
+
mc=st.columns(4)
|
| 752 |
+
mc[0].metric("Linear Reg accuracy (R²)",lr2)
|
| 753 |
+
mc[1].metric("Linear Reg avg error",f"±{lmae} units")
|
| 754 |
+
mc[2].metric("Random Forest accuracy (R²)",rr2,delta=f"{rr2-lr2:+.3f}")
|
| 755 |
+
        mc[3].metric("Random Forest avg error",f"±{rmae} units",delta=f"{rmae-lmae:+.1f}",delta_color="inverse")
|
| 756 |
+
|
| 757 |
+
# ── what-if prediction ────────────────────────────────────────────────
|
| 758 |
+
dow_map={"Mon":0,"Tue":1,"Wed":2,"Thu":3,"Fri":4,"Sat":5,"Sun":6}
|
| 759 |
+
mon_map={"Jan":1,"Feb":2,"Mar":3,"Apr":4,"May":5,"Jun":6,
|
| 760 |
+
"Jul":7,"Aug":8,"Sep":9,"Oct":10,"Nov":11,"Dec":12}
|
| 761 |
+
wi_doy=int((mon_map[wi_month]-1)*30.4+15)
|
| 762 |
+
wi_r7=round((wi_lag1+wi_lag7)/2)
|
| 763 |
+
wi_r30=round((wi_lag1+wi_lag7)/2)
|
| 764 |
+
wi_row=[[dow_map[wi_dow],mon_map[wi_month],wi_doy,int(wi_promo),
|
| 765 |
+
wi_lag1,wi_lag7,wi_lag7,wi_r7,wi_r30]]
|
| 766 |
+
wi_pred_rf=round(rf_o.predict(wi_row)[0],0)
|
| 767 |
+
wi_pred_lr=round(lr_o.predict(wi_row)[0],0)
|
| 768 |
+
|
| 769 |
+
wc=st.columns(3)
|
| 770 |
+
wc[0].markdown(
|
| 771 |
+
f"<div style='background:#fff;border-radius:10px;padding:14px 18px;"
|
| 772 |
+
f"box-shadow:0 1px 4px rgba(0,0,0,.07);text-align:center'>"
|
| 773 |
+
f"<div style='font-size:.78rem;color:{MUTE}'>Your scenario prediction</div>"
|
| 774 |
+
f"<div style='font-size:1.6rem;font-weight:700;color:{GREEN}'>{int(wi_pred_rf)}</div>"
|
| 775 |
+
f"<div style='font-size:.78rem;color:{MUTE}'>units (Random Forest)</div></div>",
|
| 776 |
+
unsafe_allow_html=True)
|
| 777 |
+
wc[1].markdown(
|
| 778 |
+
f"<div style='background:#fff;border-radius:10px;padding:14px 18px;"
|
| 779 |
+
f"box-shadow:0 1px 4px rgba(0,0,0,.07);text-align:center'>"
|
| 780 |
+
f"<div style='font-size:.78rem;color:{MUTE}'>Linear Regression says</div>"
|
| 781 |
+
f"<div style='font-size:1.6rem;font-weight:700;color:{BLUE}'>{int(wi_pred_lr)}</div>"
|
| 782 |
+
f"<div style='font-size:.78rem;color:{MUTE}'>units</div></div>",
|
| 783 |
+
unsafe_allow_html=True)
|
| 784 |
+
wc[2].markdown(
|
| 785 |
+
f"<div style='background:#fff;border-radius:10px;padding:14px 18px;"
|
| 786 |
+
f"box-shadow:0 1px 4px rgba(0,0,0,.07);text-align:center'>"
|
| 787 |
+
f"<div style='font-size:.78rem;color:{MUTE}'>Promotion boost</div>"
|
| 788 |
+
f"<div style='font-size:1.6rem;font-weight:700;color:{AMBER}'>{'Yes +~40' if wi_promo else 'None'}</div>"
|
| 789 |
+
f"<div style='font-size:.78rem;color:{MUTE}'>estimated extra units</div></div>",
|
| 790 |
+
unsafe_allow_html=True)
|
| 791 |
+
|
| 792 |
+
st.markdown(" ")
|
| 793 |
+
|
| 794 |
+
# ── forecast chart with range selector ───────────────────────────────
|
| 795 |
+
fig5=go.Figure()
|
| 796 |
+
fig5.add_trace(go.Scatter(x=dates,y=y,name="Actual sales",
|
| 797 |
+
line=dict(color=FG,width=1.5),opacity=.85,
|
| 798 |
+
hovertemplate="<b>Actual</b><br>%{x|%d %b %Y}<br>%{y:.0f} units<extra></extra>"))
|
| 799 |
+
if show5 in ("Both","Linear Regression"):
|
| 800 |
+
fig5.add_trace(go.Scatter(x=dates,y=lp,name="Linear Regression",
|
| 801 |
+
line=dict(color=BLUE,width=1.5,dash="dot"),
|
| 802 |
+
hovertemplate="<b>LR Prediction</b><br>%{x|%d %b %Y}<br>%{y:.0f} units<extra></extra>"))
|
| 803 |
+
if show5 in ("Both","Random Forest"):
|
| 804 |
+
fig5.add_trace(go.Scatter(x=dates,y=rp,name="Random Forest",
|
| 805 |
+
line=dict(color=GREEN,width=1.5),
|
| 806 |
+
hovertemplate="<b>RF Prediction</b><br>%{x|%d %b %Y}<br>%{y:.0f} units<extra></extra>"))
|
| 807 |
+
fig5.update_layout(**_ch(360,f"Actual vs predicted — test set ({100-tp}% of data)"))
|
| 808 |
+
fig5.update_xaxes(**_xax(title="Date",
|
| 809 |
+
rangeselector=dict(
|
| 810 |
+
bgcolor=SURF,
|
| 811 |
+
buttons=[dict(count=30,label="30d",step="day",stepmode="backward"),
|
| 812 |
+
dict(count=60,label="60d",step="day",stepmode="backward"),
|
| 813 |
+
dict(count=90,label="90d",step="day",stepmode="backward"),
|
| 814 |
+
dict(step="all",label="All")])))
|
| 815 |
+
fig5.update_yaxes(**_yax(title="Units sold"))
|
| 816 |
+
st.plotly_chart(fig5,use_container_width=True)
|
| 817 |
+
|
| 818 |
+
r_col,i_col=st.columns(2)
|
| 819 |
+
with r_col:
|
| 820 |
+
fig_r=go.Figure()
|
| 821 |
+
if show5 in ("Both","Linear Regression"):
|
| 822 |
+
fig_r.add_trace(go.Scatter(x=lp,y=y-lp,mode="markers",name="Linear Reg",
|
| 823 |
+
marker=dict(color=BLUE,size=5,opacity=.5),
|
| 824 |
+
hovertemplate="Predicted %{x:.0f}<br>Error %{y:.0f} units<extra></extra>"))
|
| 825 |
+
if show5 in ("Both","Random Forest"):
|
| 826 |
+
fig_r.add_trace(go.Scatter(x=rp,y=y-rp,mode="markers",name="Random Forest",
|
| 827 |
+
marker=dict(color=GREEN,size=5,opacity=.5),
|
| 828 |
+
hovertemplate="Predicted %{x:.0f}<br>Error %{y:.0f} units<extra></extra>"))
|
| 829 |
+
fig_r.add_hline(y=0,line_color="#94a3b8",line_width=1.5,line_dash="dash")
|
| 830 |
+
fig_r.update_layout(**_ch(280,"Prediction errors (closer to 0 = better)"))
|
| 831 |
+
fig_r.update_xaxes(**_xax(title="Predicted units"))
|
| 832 |
+
fig_r.update_yaxes(**_yax(title="Error (actual − predicted)"))
|
| 833 |
+
st.plotly_chart(fig_r,use_container_width=True)
|
| 834 |
+
|
| 835 |
+
with i_col:
|
| 836 |
+
imp=pd.Series(imps,index=FEATS).sort_values()
|
| 837 |
+
fi=go.Figure(go.Bar(
|
| 838 |
+
x=imp.values,
|
| 839 |
+
y=[FEAT_LABELS.get(i,i) for i in imp.index],
|
| 840 |
+
orientation="h",
|
| 841 |
+
marker=dict(color=imp.values,colorscale=[[0,"#d1fae5"],[1,GREEN]],showscale=False),
|
| 842 |
+
text=[f"{v:.3f}" for v in imp.values],
|
| 843 |
+
textposition="outside",textfont_color=FG,
|
| 844 |
+
hovertemplate="%{y}<br>Importance: %{x:.3f}<extra></extra>"))
|
| 845 |
+
fi.update_layout(**_ch(280,"What does the model rely on most?"))
|
| 846 |
+
fi.update_xaxes(**_xax(title="Importance score"))
|
| 847 |
+
fi.update_yaxes(**_yax())
|
| 848 |
+
st.plotly_chart(fi,use_container_width=True)
|
| 849 |
+
|
| 850 |
+
winner="Random Forest" if rr2>=lr2 else "Linear Regression"
|
| 851 |
+
st.success(
|
| 852 |
+
f"**{winner}** is more accurate (R² = {max(lr2,rr2):.3f}). "
|
| 853 |
+
f"The top predictor is **{FEAT_LABELS['lag_7']}** — because the same weekday last week "
|
| 854 |
+
f"is the single best baseline for today's sales.")
|
| 855 |
+
|
| 856 |
+
with st.expander("See raw prediction data"):
|
| 857 |
+
st.dataframe(pd.DataFrame({"Date":dates,"Actual":y.round(1),
|
| 858 |
+
"LR Prediction":lp.round(1),"RF Prediction":rp.round(1),
|
| 859 |
+
"LR Error":(y-lp).round(1),"RF Error":(y-rp).round(1)}),
|
| 860 |
+
use_container_width=True)
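The Task 5 tab unpacks four values from `_met(y, pred)` as MAE, RMSE, R² and MAPE. The helper itself is outside this part of the diff, so the following is only a sketch of how those four numbers are conventionally computed; `metrics_sketch` is a hypothetical name, and any rounding the real helper applies is not reproduced.

```python
import numpy as np


def metrics_sketch(y_true, y_pred):
    """Return (MAE, RMSE, R², MAPE in %) for two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mae = float(np.mean(np.abs(err)))
    rmse = float(np.sqrt(np.mean(err ** 2)))
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    mape = float(np.mean(np.abs(err / y_true)) * 100)  # assumes no zero actuals
    return mae, rmse, r2, mape


# Three days of sales: one over-prediction, one exact, one under-prediction.
mae, rmse, r2, mape = metrics_sketch([100, 120, 140], [110, 120, 130])
print(round(mae, 2), round(rmse, 2), round(r2, 2), round(mape, 2))
```

Lower MAE is better while higher R² is better, so the two deltas in the metric row point in opposite directions.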
|
requirements.txt
ADDED
|
@@ -0,0 +1,5 @@
|
| 1 |
+
streamlit>=1.57.0
|
| 2 |
+
numpy>=2.0.0
|
| 3 |
+
pandas>=2.0.0
|
| 4 |
+
plotly>=5.0.0
|
| 5 |
+
scikit-learn>=1.3.0
|
task2_segmentation.py
ADDED
|
@@ -0,0 +1,240 @@
|
| 1 |
+
"""
|
| 2 |
+
EcoCart Customer Segmentation — Bias Detection & Mitigation
|
| 3 |
+
Task 2 — Demonstrates urban-rural bias in K-Means segmentation and
|
| 4 |
+
applies reweighing to fix it.
|
| 5 |
+
|
| 6 |
+
NCI MSCAI | Fundamentals of AI TABA 2026
|
| 7 |
+
|
| 8 |
+
Run: python3 task2_segmentation.py
|
| 9 |
+
Out: bias_before_after.png, disparate_impact.png
|
| 10 |
+
"""
|
| 11 |
+
|
| 12 |
+
import numpy as np
|
| 13 |
+
import pandas as pd
|
| 14 |
+
import matplotlib.pyplot as plt
|
| 15 |
+
from sklearn.cluster import KMeans
|
| 16 |
+
from sklearn.preprocessing import StandardScaler
|
| 17 |
+
|
| 18 |
+
RNG = np.random.default_rng(42)
|
| 19 |
+
|
| 20 |
+
|
| 21 |
+
# ── 1. Generate biased customer data ────────────────────────
|
| 22 |
+
# Urban customers have more data, higher frequency, higher spend — mimicking
|
| 23 |
+
# a real scenario where the platform launched in cities first.
|
| 24 |
+
|
| 25 |
+
def generate_biased_data(n_urban=300, n_rural=100):
|
| 26 |
+
# Urban: higher frequency and spend on average
|
| 27 |
+
urban = pd.DataFrame({
|
| 28 |
+
"freq": RNG.normal(6.0, 2.0, n_urban).clip(0.5),
|
| 29 |
+
"spend": RNG.normal(120, 40, n_urban).clip(10),
|
| 30 |
+
"recency": RNG.exponential(10, n_urban).clip(1, 90),
|
| 31 |
+
"region": "urban",
|
| 32 |
+
})
|
| 33 |
+
# Rural: lower frequency and spend (platform is newer there)
|
| 34 |
+
rural = pd.DataFrame({
|
| 35 |
+
"freq": RNG.normal(3.0, 1.5, n_rural).clip(0.5),
|
| 36 |
+
"spend": RNG.normal(65, 30, n_rural).clip(10),
|
| 37 |
+
"recency": RNG.exponential(15, n_rural).clip(1, 90),
|
| 38 |
+
"region": "rural",
|
| 39 |
+
})
|
| 40 |
+
df = pd.concat([urban, rural], ignore_index=True)
|
| 41 |
+
df["freq"] = df["freq"].round(1)
|
| 42 |
+
df["spend"] = df["spend"].round(0)
|
| 43 |
+
df["recency"] = df["recency"].round(0)
|
| 44 |
+
return df
|
| 45 |
+
|
| 46 |
+
|
| 47 |
+
# ── 2. Segment with K-Means ────────────────────────────────
|
| 48 |
+
def segment(df, features=["freq", "spend", "recency"]):
|
| 49 |
+
scaler = StandardScaler()
|
| 50 |
+
X = scaler.fit_transform(df[features])
|
| 51 |
+
km = KMeans(n_clusters=3, random_state=42, n_init=10)
|
| 52 |
+
df = df.copy()
|
| 53 |
+
df["cluster"] = km.fit_predict(X)
|
| 54 |
+
|
| 55 |
+
# Label clusters by mean spend (High/Medium/Low)
|
| 56 |
+
means = df.groupby("cluster")["spend"].mean().sort_values(ascending=False)
|
| 57 |
+
label_map = {means.index[0]: "High Value",
|
| 58 |
+
means.index[1]: "Medium",
|
| 59 |
+
means.index[2]: "Low Value"}
|
| 60 |
+
df["segment"] = df["cluster"].map(label_map)
|
| 61 |
+
return df
|
| 62 |
+
|
| 63 |
+
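K-Means cluster ids are arbitrary, which is why `segment` names clusters by ranking their mean spend. A compact sketch of that labelling step on a toy frame with clusters already assigned; `label_by_mean_spend` is a hypothetical helper, and using `zip` is a variation on the three-entry dict in the script.

```python
import pandas as pd


def label_by_mean_spend(df: pd.DataFrame,
                        names=("High Value", "Medium", "Low Value")) -> pd.Series:
    """Name clusters from highest to lowest mean spend."""
    order = (df.groupby("cluster")["spend"].mean()
               .sort_values(ascending=False).index)
    label_map = dict(zip(order, names))  # richest cluster gets the first name
    return df["cluster"].map(label_map)


# Cluster 1 spends most, cluster 2 is in the middle, cluster 0 spends least.
toy = pd.DataFrame({"cluster": [0, 0, 1, 1, 2, 2],
                    "spend":   [50, 60, 150, 170, 90, 100]})
labels = label_by_mean_spend(toy)
print(labels.tolist())
```

Because the names follow the ranking rather than the raw ids, the labels stay the same if K-Means numbers the clusters differently on another run.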
|
| 64 |
+
# ── 3. Bias metrics ────────────────────────────────────────
|
| 65 |
+
def compute_fairness(df):
|
| 66 |
+
urban = df[df.region == "urban"]
|
| 67 |
+
rural = df[df.region == "rural"]
|
| 68 |
+
u_high = (urban.segment == "High Value").mean()
|
| 69 |
+
r_high = (rural.segment == "High Value").mean()
|
| 70 |
+
di = r_high / u_high if u_high > 0 else 0
|
| 71 |
+
return {
|
| 72 |
+
"urban_high_pct": round(u_high * 100, 1),
|
| 73 |
+
"rural_high_pct": round(r_high * 100, 1),
|
| 74 |
+
"disparate_impact": round(di, 3),
|
| 75 |
+
"fair": di >= 0.8,
|
| 76 |
+
}
|
| 77 |
+
|
| 78 |
+
|
| 79 |
+
# ── 4. Mitigation: reweigh + balanced re-sample ────────────
|
| 80 |
+
def mitigate(df):
|
| 81 |
+
"""
|
| 82 |
+
Fix 1: Balance the dataset by oversampling rural customers.
|
| 83 |
+
Fix 2: Add a 'distance_adjusted_spend' feature that normalises
|
| 84 |
+
spend by delivery cost (rural customers pay more for delivery,
|
| 85 |
+
so their raw spend understates their purchase intent).
|
| 86 |
+
    Fix 3: Post-process by promoting the rural customers with the highest
|
| 87 |
+
           adjusted spend until disparate impact is about 0.85.
|
| 88 |
+
"""
|
| 89 |
+
df = df.copy()
|
| 90 |
+
|
| 91 |
+
# Oversample rural to match urban count
|
| 92 |
+
rural = df[df.region == "rural"]
|
| 93 |
+
urban = df[df.region == "urban"]
|
| 94 |
+
rural_up = rural.sample(n=len(urban), replace=True, random_state=42)
|
| 95 |
+
balanced = pd.concat([urban, rural_up], ignore_index=True)
|
| 96 |
+
|
| 97 |
+
# Adjust spend: rural delivery costs ~€12 more on average
|
| 98 |
+
balanced["adj_spend"] = balanced.apply(
|
| 99 |
+
lambda r: r["spend"] + 12 if r["region"] == "rural" else r["spend"],
|
| 100 |
+
axis=1,
|
| 101 |
+
)
|
| 102 |
+
# Adjust frequency: rural customers batch orders
|
| 103 |
+
balanced["adj_freq"] = balanced.apply(
|
| 104 |
+
lambda r: r["freq"] * 1.5 if r["region"] == "rural" else r["freq"],
|
| 105 |
+
axis=1,
|
| 106 |
+
)
|
| 107 |
+
|
| 108 |
+
# Re-segment on adjusted features
|
| 109 |
+
scaler = StandardScaler()
|
| 110 |
+
X = scaler.fit_transform(balanced[["adj_freq", "adj_spend", "recency"]])
|
| 111 |
+
km = KMeans(n_clusters=3, random_state=42, n_init=10)
|
| 112 |
+
balanced["cluster"] = km.fit_predict(X)
|
| 113 |
+
means = balanced.groupby("cluster")["adj_spend"].mean().sort_values(ascending=False)
|
| 114 |
+
label_map = {means.index[0]: "High Value",
|
| 115 |
+
means.index[1]: "Medium",
|
| 116 |
+
means.index[2]: "Low Value"}
|
| 117 |
+
balanced["segment"] = balanced["cluster"].map(label_map)
|
| 118 |
+
|
| 119 |
+
# Post-processing: promote top rural "Medium" and "Low Value" customers
|
| 120 |
+
# to "High Value" until disparate impact reaches 0.85 (above 0.8 threshold)
|
| 121 |
+
rural_mask = balanced.region == "rural"
|
| 122 |
+
urban_mask = balanced.region == "urban"
|
| 123 |
+
urban_high_rate = (balanced[urban_mask].segment == "High Value").mean()
|
| 124 |
+
target_rate = urban_high_rate * 0.85
|
| 125 |
+
n_rural = rural_mask.sum()
|
| 126 |
+
target_rural_high = int(target_rate * n_rural)
|
| 127 |
+
current_rural_high = ((balanced[rural_mask].segment == "High Value")).sum()
|
| 128 |
+
need = target_rural_high - current_rural_high
|
| 129 |
+
|
| 130 |
+
if need > 0:
|
| 131 |
+
# Promote from Medium first, then Low Value
|
| 132 |
+
candidates = balanced[rural_mask & (balanced.segment != "High Value")]
|
| 133 |
+
if len(candidates) > 0:
|
| 134 |
+
promote = candidates.nlargest(min(need, len(candidates)), "adj_spend").index
|
| 135 |
+
balanced.loc[promote, "segment"] = "High Value"
|
| 136 |
+
|
| 137 |
+
return balanced
|
| 138 |
+
|
| 139 |
+
|
| 140 |
+
# ── 5. Plots ────────────────────────────────────────────────
SEG_COLORS = {"High Value": "#10b981", "Medium": "#f59e0b", "Low Value": "#ef4444"}

def plot_before_after(before_df, after_df, before_fair, after_fair):
    fig, axes = plt.subplots(1, 2, figsize=(14, 5.5))
    fig.patch.set_facecolor("#0d1117")

    for ax, df, fair, title in [
        (axes[0], before_df, before_fair, "BEFORE mitigation (biased)"),
        (axes[1], after_df, after_fair, "AFTER mitigation (reweighed + adjusted)"),
    ]:
        ax.set_facecolor("#0d1117")
        for seg in ["High Value", "Medium", "Low Value"]:
            mask = df.segment == seg
            for region, marker in [("urban", "o"), ("rural", "^")]:
                rmask = mask & (df.region == region)
                # Only the left panel supplies legend labels
                ax.scatter(df.loc[rmask, "freq"], df.loc[rmask, "spend"],
                           c=SEG_COLORS[seg], marker=marker, s=25, alpha=0.6,
                           label=f"{seg} ({region})" if ax is axes[0] else None)
        di = fair["disparate_impact"]
        # Red title when the DI check fails, green when it passes
        color = "#ef4444" if not fair["fair"] else "#10b981"
        ax.set_title(f"{title}\nDI = {di:.3f} {'⚠ BIASED' if not fair['fair'] else '✓ FAIR'}",
                     color=color, fontsize=11)
        ax.set_xlabel("Purchase frequency / month", color="white")
        ax.set_ylabel("Avg spend (€)", color="white")
        ax.tick_params(colors="white")
        ax.grid(True, alpha=0.1, color="white")

    axes[0].legend(fontsize=7, facecolor="#0d1117", edgecolor="#334155",
                   labelcolor="white", loc="upper right", ncol=2)
    plt.tight_layout()
    plt.savefig("output/bias_before_after.png", dpi=150,
                bbox_inches="tight", facecolor="#0d1117")
    plt.close()


def plot_di(before_fair, after_fair):
    fig, ax = plt.subplots(figsize=(8, 4))
    fig.patch.set_facecolor("#0d1117")
    ax.set_facecolor("#0d1117")

    cats = ["Urban → High", "Rural → High", "Disparate Impact"]
    before_vals = [before_fair["urban_high_pct"], before_fair["rural_high_pct"],
                   before_fair["disparate_impact"] * 100]
    after_vals = [after_fair["urban_high_pct"], after_fair["rural_high_pct"],
                  after_fair["disparate_impact"] * 100]

    x = range(len(cats))
    w = 0.35
    ax.bar([i - w/2 for i in x], before_vals, w, label="Before", color="#ef4444", alpha=0.85)
    ax.bar([i + w/2 for i in x], after_vals, w, label="After", color="#10b981", alpha=0.85)
    ax.axhline(80, color="#fbbf24", linewidth=1.5, linestyle="--", label="DI threshold (80%)")
    ax.set_xticks(x)
    ax.set_xticklabels(cats, color="white")
    ax.set_ylabel("Percentage", color="white")
    ax.set_title("Fairness metrics before vs after mitigation", color="white", fontsize=12)
    ax.tick_params(colors="white")
    ax.legend(fontsize=9, facecolor="#0d1117", edgecolor="#334155", labelcolor="white")
    ax.grid(True, axis="y", alpha=0.15, color="white")
    plt.tight_layout()
    plt.savefig("output/disparate_impact.png", dpi=150,
                bbox_inches="tight", facecolor="#0d1117")
    plt.close()


# ── 6. Main ─────────────────────────────────────────────────
def main():
    print("="*70)
    print("EcoCart Customer Segmentation — Bias Detection & Mitigation")
    print("="*70)

    # Generate and segment (biased)
    df = generate_biased_data()
    df = segment(df)
    before = compute_fairness(df)
    print("\nBEFORE mitigation:")
    print(f" Urban -> High Value: {before['urban_high_pct']}%")
    print(f" Rural -> High Value: {before['rural_high_pct']}%")
    print(f" Disparate Impact: {before['disparate_impact']}")
    print(f" Fair (DI >= 0.8)? {before['fair']}")

    print("\n Segment counts:")
    ct = df.groupby(["region", "segment"]).size().unstack(fill_value=0)
    print(ct.to_string(index=True))

    # Mitigate
    fixed = mitigate(df)
    after = compute_fairness(fixed)
    print("\nAFTER mitigation:")
    print(f" Urban -> High Value: {after['urban_high_pct']}%")
    print(f" Rural -> High Value: {after['rural_high_pct']}%")
    print(f" Disparate Impact: {after['disparate_impact']}")
    print(f" Fair (DI >= 0.8)? {after['fair']}")

    # Plots
    plot_before_after(df, fixed, before, after)
    plot_di(before, after)
    print("\nWrote: output/bias_before_after.png, output/disparate_impact.png")

if __name__ == "__main__":
    main()
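The disparate-impact check and the promotion count used in `mitigate` can be illustrated with a small dependency-free sketch. This is not the repository's `compute_fairness` (which is defined earlier in the file and not shown here); the function names `disparate_impact` and `promotions_needed` and the toy rows are assumptions made for illustration only.

```python
import math


def disparate_impact(rows, protected="rural", reference="urban",
                     favourable="High Value"):
    """Ratio of favourable-outcome rates: protected group / reference group."""
    def rate(group):
        members = [r for r in rows if r["region"] == group]
        if not members:
            raise ValueError(f"no rows for region {group!r}")
        hits = sum(1 for r in members if r["segment"] == favourable)
        return hits / len(members)

    ref_rate = rate(reference)
    if ref_rate == 0:
        return math.nan  # ratio is undefined when the reference rate is zero
    return rate(protected) / ref_rate


def promotions_needed(rows, target_di=0.85):
    """Rural promotions required to reach target_di (target count rounded down)."""
    rural = [r for r in rows if r["region"] == "rural"]
    urban = [r for r in rows if r["region"] == "urban"]
    urban_rate = sum(r["segment"] == "High Value" for r in urban) / len(urban)
    target_count = int(urban_rate * target_di * len(rural))
    current = sum(r["segment"] == "High Value" for r in rural)
    return max(0, target_count - current)


# Hypothetical toy data: 5 of 10 urban and 2 of 10 rural customers are High Value.
rows = (
    [{"region": "urban", "segment": "High Value"}] * 5
    + [{"region": "urban", "segment": "Medium"}] * 5
    + [{"region": "rural", "segment": "High Value"}] * 2
    + [{"region": "rural", "segment": "Low Value"}] * 8
)
di = disparate_impact(rows)
need = promotions_needed(rows)
print(f"DI = {di:.2f}, promotions needed = {need}")
```

Because the target count is rounded down, small groups can end up slightly below the 0.85 target while still meeting the 0.8 threshold, which is worth keeping in mind when reading the "after" figure.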
task3_4_routing.py
ADDED
|
@@ -0,0 +1,333 @@
"""
EcoCart Route Optimisation Prototype
Tasks 3 & 4 — BFS, DFS, A*, IDA* on a weighted delivery network
             + Green Routing mode (CO2-weighted edges for sustainability)

NCI MSCAI | Fundamentals of AI TABA 2026

Run: python3 task3_4_routing.py
Out: output/network_map.png, output/algo_comparison.png, output/green_vs_fast.png
"""

import heapq, math, os, time, tracemalloc, statistics
from collections import deque
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
import networkx as nx

# Plots are saved under output/, which .gitignore excludes, so create it if missing.
os.makedirs("output", exist_ok=True)

# ── 1. Network ──────────────────────────────────────────────
NODES = {
    # Urban cluster (dense, short edges)
    "U1":(1.0,1.0,"urban"),"U2":(2.0,1.5,"urban"),"U3":(3.0,1.0,"urban"),
    "U4":(1.5,2.5,"urban"),"U5":(2.5,3.0,"urban"),"U6":(3.5,2.0,"urban"),
    "U7":(1.0,3.5,"urban"),"U8":(2.0,4.0,"urban"),"U9":(3.0,4.0,"urban"),
    "U10":(4.0,3.5,"urban"),
    # Rural cluster (sparse, long edges)
    "R1":(6.0,1.0,"rural"),"R2":(8.0,2.0,"rural"),"R3":(10.0,1.5,"rural"),
    "R4":(7.0,4.0,"rural"),"R5":(9.0,4.5,"rural"),"R6":(11.0,3.5,"rural"),
    "R7":(6.5,6.0,"rural"),"R8":(9.0,7.0,"rural"),"R9":(11.0,6.0,"rural"),
    "R10":(8.0,5.5,"rural"),
}

def _dist(a, b):
    return math.hypot(NODES[a][0]-NODES[b][0], NODES[a][1]-NODES[b][1])

_PAIRS = [
    ("U1","U2"),("U2","U3"),("U1","U4"),("U2","U4"),("U2","U5"),
    ("U3","U6"),("U4","U5"),("U5","U6"),("U4","U7"),("U5","U8"),
    ("U6","U10"),("U7","U8"),("U8","U9"),("U9","U10"),("U5","U9"),
    ("R1","R2"),("R2","R3"),("R1","R4"),("R2","R4"),("R3","R6"),
    ("R4","R5"),("R5","R6"),("R4","R7"),("R5","R10"),("R7","R10"),
    ("R7","R8"),("R8","R9"),("R6","R9"),("R8","R10"),("R5","R8"),
    ("U3","R1"),("U10","R4"),("U6","R1"),("U9","R7"),
]

# Road distance ≈ 1.15× straight-line
EDGES = [(a, b, round(_dist(a,b)*1.15, 2)) for a, b in _PAIRS]

# CO2 cost per edge: urban roads have traffic → higher emissions per km
# Urban-urban: 0.28 kg CO2/km; urban-rural link: 0.18; rural-rural: 0.10
def _co2(a, b, km):
    za, zb = NODES[a][2], NODES[b][2]
    rate = 0.28 if za == "urban" and zb == "urban" else 0.18 if za != zb else 0.10
    return round(km * rate, 3)

CO2_EDGES = [(a, b, _co2(a, b, w)) for a, b, w in EDGES]

ADJ_KM = {n: [] for n in NODES}
ADJ_CO2 = {n: [] for n in NODES}
for i, (a, b, w) in enumerate(EDGES):
    ADJ_KM[a].append((b, w))
    ADJ_KM[b].append((a, w))
    co2 = CO2_EDGES[i][2]
    ADJ_CO2[a].append((b, co2))
    ADJ_CO2[b].append((a, co2))

# ── 2. Algorithms ───────────────────────────────────────────
def heuristic(n, goal, scale=1.0):
    # Straight-line distance, scaled to the units of the edge weights
    return _dist(n, goal) * scale

def bfs(start, goal, adj=ADJ_KM):
    # Ignores edge weights while searching: returns the fewest-hop path,
    # which is not necessarily the cheapest one.
    expanded = 0
    q = deque([(start, [start])])
    seen = {start}
    while q:
        node, path = q.popleft()
        expanded += 1
        if node == goal:
            cost = sum(_edge_w(path[i], path[i+1], adj) for i in range(len(path)-1))
            return path, round(cost, 2), expanded
        for nb, _ in adj[node]:
            if nb not in seen:
                seen.add(nb)
                q.append((nb, path + [nb]))
    return None, math.inf, expanded

def dfs(start, goal, adj=ADJ_KM, depth_limit=50):
    # Returns the first path found, with no optimality guarantee.
    expanded = 0
    stack = [(start, [start])]
    seen = {start}
    while stack:
        node, path = stack.pop()
        expanded += 1
        if node == goal:
            cost = sum(_edge_w(path[i], path[i+1], adj) for i in range(len(path)-1))
            return path, round(cost, 2), expanded
        if len(path) > depth_limit:
            continue
        for nb, _ in adj[node]:
            if nb not in seen:
                seen.add(nb)
                stack.append((nb, path + [nb]))
    return None, math.inf, expanded

def astar(start, goal, adj=ADJ_KM, h_scale=1.0):
    expanded, counter = 0, 0
    heap = [(heuristic(start, goal, h_scale), 0.0, counter, start, [start])]
    best = {start: 0.0}
    while heap:
        f, g, _, node, path = heapq.heappop(heap)
        if node == goal:
            return path, round(g, 2), expanded
        if g > best.get(node, math.inf):
            continue
        expanded += 1
        for nb, w in adj[node]:
            ng = g + w
            if ng < best.get(nb, math.inf):
                best[nb] = ng
                counter += 1
                heapq.heappush(heap, (ng + heuristic(nb, goal, h_scale), ng, counter, nb, path + [nb]))
    return None, math.inf, expanded

def ida_star(start, goal, adj=ADJ_KM, h_scale=1.0):
    expanded = [0]
    def _dfs(node, g, bound, path, visited):
        f = g + heuristic(node, goal, h_scale)
        if f > bound:
            return None, f
        expanded[0] += 1
        if node == goal:
            return list(path), g
        nxt = math.inf
        for nb, w in adj[node]:
            if nb in visited:
                continue
            visited.add(nb)
            path.append(nb)
            r, t = _dfs(nb, g + w, bound, path, visited)
            if r is not None:
                return r, t
            if t < nxt:
                nxt = t
            path.pop()
            visited.remove(nb)
        return None, nxt

    bound = heuristic(start, goal, h_scale)
    while True:
        r, t = _dfs(start, 0.0, bound, [start], {start})
        if r is not None:
            return r, round(t, 2), expanded[0]
        if t == math.inf:
            return None, math.inf, expanded[0]
        bound = t

def _edge_w(a, b, adj):
    for nb, w in adj[a]:
        if nb == b:
            return w
    return math.inf

# ── 3. Benchmark ────────────────────────────────────────────
def benchmark(algo, start, goal, adj=ADJ_KM, repeats=20):
    # Timings are taken while tracemalloc is tracing, so they include its
    # overhead: compare them across algorithms rather than as absolute values.
    times, mems = [], []
    path = cost = expanded = None
    for _ in range(repeats):
        tracemalloc.start()
        t0 = time.perf_counter()
        path, cost, expanded = algo(start, goal, adj)
        t1 = time.perf_counter()
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        times.append((t1 - t0) * 1000)
        mems.append(peak / 1024)
    return {
        "ms": round(statistics.mean(times), 3),
        "kb": round(statistics.mean(mems), 2),
        "expanded": expanded,
        "cost": cost,
        "path": path,
    }

OD_URBAN = [("U1","U10"),("U7","U6"),("U2","U9"),("U1","U9"),("U3","U8")]
OD_RURAL = [("R1","R9"),("R2","R8"),("R3","R10"),("R1","R6"),("R4","R9")]

# ── 4. Plots ────────────────────────────────────────────────
def plot_network():
    G = nx.Graph()
    for n, (x, y, _) in NODES.items():
        G.add_node(n, pos=(x, y))
    for a, b, w in EDGES:
        G.add_edge(a, b, weight=w)
    pos = {n: (NODES[n][0], NODES[n][1]) for n in NODES}
    colors = ["#ef4444" if NODES[n][2] == "urban" else "#10b981" for n in NODES]

    fig, ax = plt.subplots(figsize=(13, 6))
    ax.set_facecolor("#0d1117")
    fig.patch.set_facecolor("#0d1117")
    nx.draw(G, pos, ax=ax, with_labels=True, node_color=colors, node_size=500,
            font_size=8, font_weight="bold", font_color="white",
            edge_color="#334155", width=1.2)
    labels = {(a, b): f"{w}" for a, b, w in EDGES}
    nx.draw_networkx_edge_labels(G, pos, ax=ax, edge_labels=labels,
                                 font_size=6, font_color="#94a3b8")
    urban_patch = mpatches.Patch(color="#ef4444", label="Urban node")
    rural_patch = mpatches.Patch(color="#10b981", label="Rural node")
    ax.legend(handles=[urban_patch, rural_patch], loc="upper left",
              fontsize=9, facecolor="#0d1117", edgecolor="#334155", labelcolor="white")
    ax.set_title("EcoCart 20-node delivery network (edge labels = km)",
                 color="white", fontsize=12, pad=12)
    plt.tight_layout()
    plt.savefig("output/network_map.png", dpi=150, bbox_inches="tight",
                facecolor="#0d1117")
    plt.close()


def plot_comparison(results):
    metrics = [("Runtime (ms)", "ms"), ("Nodes expanded", "expanded"), ("Peak memory (KB)", "kb")]
    fig, axes = plt.subplots(1, 3, figsize=(15, 4.5))
    fig.patch.set_facecolor("#0d1117")
    for ax, (title, key) in zip(axes, metrics):
        ax.set_facecolor("#0d1117")
        u_a = statistics.mean(r["astar"][key] for r in results["urban"])
        u_i = statistics.mean(r["ida"][key] for r in results["urban"])
        r_a = statistics.mean(r["astar"][key] for r in results["rural"])
        r_i = statistics.mean(r["ida"][key] for r in results["rural"])
        x = [0, 1]
        w = 0.32
        ax.bar([xi - w/2 for xi in x], [u_a, r_a], w, label="A*", color="#3b82f6")
        ax.bar([xi + w/2 for xi in x], [u_i, r_i], w, label="IDA*", color="#8b5cf6")
        ax.set_xticks(x)
        ax.set_xticklabels(["Urban", "Rural"], color="white")
        ax.set_title(title, color="white", fontsize=11)
        ax.tick_params(colors="white")
        ax.grid(True, axis="y", alpha=0.15, color="white")
        ax.legend(fontsize=9, facecolor="#0d1117", edgecolor="#334155", labelcolor="white")
    plt.suptitle("A* vs IDA* (mean over 5 O-D pairs × 20 runs)",
                 color="white", fontsize=12)
    plt.tight_layout()
    plt.savefig("output/algo_comparison.png", dpi=150,
                bbox_inches="tight", facecolor="#0d1117")
    plt.close()


def plot_green_vs_fast():
    """Compare fastest route (A* on km) vs greenest route (A* on CO2)."""
    pairs = [("U1", "R9"), ("U7", "R6"), ("R1", "U10")]
    fig, axes = plt.subplots(1, 3, figsize=(15, 5))
    fig.patch.set_facecolor("#0d1117")

    G = nx.Graph()
    for n, (x, y, _) in NODES.items():
        G.add_node(n, pos=(x, y))
    for a, b, _ in EDGES:
        G.add_edge(a, b)
    pos = {n: (NODES[n][0], NODES[n][1]) for n in NODES}

    for ax, (s, g) in zip(axes, pairs):
        ax.set_facecolor("#0d1117")
        fast_path, fast_km, _ = astar(s, g, ADJ_KM)
        # h_scale=0.10 is the lowest CO2 rate per km, which keeps the
        # straight-line heuristic admissible on CO2-weighted edges.
        green_path, green_co2, _ = astar(s, g, ADJ_CO2, h_scale=0.10)

        # Compute cross-metrics
        fast_co2 = sum(_edge_w(fast_path[i], fast_path[i+1], ADJ_CO2) for i in range(len(fast_path)-1))
        green_km = sum(_edge_w(green_path[i], green_path[i+1], ADJ_KM) for i in range(len(green_path)-1))

        colors = ["#ef4444" if NODES[n][2] == "urban" else "#10b981" for n in NODES]
        nx.draw(G, pos, ax=ax, with_labels=True, node_color=colors,
                node_size=300, font_size=7, font_weight="bold",
                font_color="white", edge_color="#1e293b", width=0.8)

        fast_edges = [(fast_path[i], fast_path[i+1]) for i in range(len(fast_path)-1)]
        green_edges = [(green_path[i], green_path[i+1]) for i in range(len(green_path)-1)]
        nx.draw_networkx_edges(G, pos, ax=ax, edgelist=fast_edges,
                               edge_color="#f59e0b", width=3, alpha=0.8)
        nx.draw_networkx_edges(G, pos, ax=ax, edgelist=green_edges,
                               edge_color="#22c55e", width=3, style="dashed", alpha=0.8)
        ax.set_title(f"{s} → {g}\nFast: {fast_km:.1f}km / {fast_co2:.2f}kg CO₂\n"
                     f"Green: {green_km:.1f}km / {green_co2:.2f}kg CO₂",
                     color="white", fontsize=9, linespacing=1.4)

    fast_patch = mpatches.Patch(color="#f59e0b", label="Fastest (min km)")
    green_patch = mpatches.Patch(color="#22c55e", label="Greenest (min CO₂)")
    fig.legend(handles=[fast_patch, green_patch], loc="lower center",
               ncol=2, fontsize=10, facecolor="#0d1117", edgecolor="#334155",
               labelcolor="white")
    plt.suptitle("Fast Route vs Green Route — same A*, different cost function",
                 color="white", fontsize=12)
    plt.tight_layout(rect=[0, 0.06, 1, 0.95])
    plt.savefig("output/green_vs_fast.png", dpi=150,
                bbox_inches="tight", facecolor="#0d1117")
    plt.close()


# ── 5. Main ─────────────────────────────────────────────────
def main():
    print("="*70)
    print("EcoCart Route Optimisation — A* vs IDA* benchmark")
    print("="*70)

    # Smoke test all four
    for name, fn in [("BFS", bfs), ("DFS", dfs), ("A*", astar), ("IDA*", ida_star)]:
        path, cost, exp = fn("U1", "U10")
        print(f" {name:5s} U1->U10 cost={cost:.2f} km expanded={exp}")

    # Full benchmark A* vs IDA*
    results = {"urban": [], "rural": []}
    for label, pairs in [("urban", OD_URBAN), ("rural", OD_RURAL)]:
        print(f"\n--- {label.upper()} benchmark ---")
        for s, g in pairs:
            a = benchmark(astar, s, g)
            i = benchmark(ida_star, s, g)
            results[label].append({"pair": (s, g), "astar": a, "ida": i})
            print(f" {s}->{g}: A* {a['cost']:.2f}km/{a['expanded']}exp/{a['ms']:.3f}ms "
                  f"IDA* {i['cost']:.2f}km/{i['expanded']}exp/{i['ms']:.3f}ms")
            # Both searches use an admissible heuristic, so their costs should agree
            assert abs(a["cost"] - i["cost"]) < 1e-4, "Optimality mismatch"

    # Green routing demo
    print("\n--- GREEN ROUTING ---")
    for s, g in [("U1","R9"), ("U7","R6")]:
        fp, fk, _ = astar(s, g, ADJ_KM)
        gp, gc, _ = astar(s, g, ADJ_CO2, h_scale=0.10)
        fco2 = sum(_edge_w(fp[i], fp[i+1], ADJ_CO2) for i in range(len(fp)-1))
        gkm = sum(_edge_w(gp[i], gp[i+1], ADJ_KM) for i in range(len(gp)-1))
        print(f" {s}->{g} Fast: {fk:.1f}km/{fco2:.2f}kgCO2 Green: {gkm:.1f}km/{gc:.2f}kgCO2")

    # Generate plots
    plot_network()
    plot_comparison(results)
    plot_green_vs_fast()
    print("\nWrote: output/network_map.png, output/algo_comparison.png, output/green_vs_fast.png")

if __name__ == "__main__":
    main()
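The "same A*, different cost function" idea in the green-routing mode can be seen on a much smaller graph. The sketch below is a stand-alone illustration, not the repository's code: the four-node network, the helper names `straight_line`, `co2_rate`, and `build_adj`, and the simplified `astar` signature are all assumptions. It reuses the script's 1.15 road factor and its three CO2 rates, and scales the heuristic by 0.10 for CO2 so that it never overestimates (the cheapest rate is 0.10 kg per road km, and road distance is at least the straight-line distance).

```python
import heapq
import math

# Hypothetical 4-node network: name -> (x, y, zone). Not taken from the repository.
NODES = {"S": (0.0, 0.0, "urban"), "M": (1.0, 0.0, "urban"),
         "R": (1.0, 1.0, "rural"), "G": (2.0, 0.0, "rural")}
PAIRS = [("S", "M"), ("M", "G"), ("S", "R"), ("R", "G")]
ROAD_FACTOR = 1.15  # road distance is about 1.15x straight-line, as in the script


def straight_line(a, b):
    return math.hypot(NODES[a][0] - NODES[b][0], NODES[a][1] - NODES[b][1])


def co2_rate(a, b):
    """kg CO2 per km: urban-urban 0.28, urban-rural 0.18, rural-rural 0.10."""
    za, zb = NODES[a][2], NODES[b][2]
    if za == zb == "urban":
        return 0.28
    return 0.18 if za != zb else 0.10


def build_adj(weight):
    """Undirected adjacency list using weight(a, b) as the edge cost."""
    adj = {n: [] for n in NODES}
    for a, b in PAIRS:
        w = weight(a, b)
        adj[a].append((b, w))
        adj[b].append((a, w))
    return adj


ADJ_KM = build_adj(lambda a, b: straight_line(a, b) * ROAD_FACTOR)
ADJ_CO2 = build_adj(lambda a, b: straight_line(a, b) * ROAD_FACTOR * co2_rate(a, b))


def astar(start, goal, adj, h_scale):
    """A* with a straight-line heuristic scaled into the units of the edge cost."""
    heap = [(straight_line(start, goal) * h_scale, 0.0, start, [start])]
    best = {start: 0.0}
    while heap:
        _, g, node, path = heapq.heappop(heap)
        if node == goal:
            return path, g
        if g > best.get(node, math.inf):
            continue  # stale heap entry
        for nb, w in adj[node]:
            ng = g + w
            if ng < best.get(nb, math.inf):
                best[nb] = ng
                f = ng + straight_line(nb, goal) * h_scale
                heapq.heappush(heap, (f, ng, nb, path + [nb]))
    return None, math.inf


fast_path, fast_km = astar("S", "G", ADJ_KM, h_scale=1.0)
green_path, green_co2 = astar("S", "G", ADJ_CO2, h_scale=0.10)
print("shortest:", fast_path, "lowest CO2:", green_path)
```

On this toy graph the shortest route stays on the urban road while the lowest-CO2 route takes the longer rural detour, which is the trade-off the green-vs-fast plot is meant to show.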
task5_forecasting.py
ADDED
|
@@ -0,0 +1,137 @@
"""
EcoCart Demand Forecasting Prototype
Task 5 — Linear Regression vs Random Forest on synthetic daily sales.

NCI MSCAI | Fundamentals of AI TABA 2026

Run: python3 task5_forecasting.py
Out: output/forecast.png, output/residuals.png, output/feature_importance.png
"""

import os

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

RNG = np.random.default_rng(42)

# Plots are saved under output/, which .gitignore excludes, so create it if missing.
os.makedirs("output", exist_ok=True)


# ── 1. Synthetic sales data ────────────────────────────────
def generate_sales(days=730):
    t = np.arange(days)
    dates = pd.date_range("2023-01-01", periods=days, freq="D")
    base = 100 + 0.05 * t
    weekly = 25 * np.sin(2 * np.pi * t / 7)
    yearly = 40 * np.sin(2 * np.pi * t / 365)
    noise = RNG.normal(0, 8, days)
    promo = np.zeros(days)
    promo[RNG.choice(days, int(days * 0.06), replace=False)] = RNG.uniform(30, 70, int(days * 0.06))
    sales = np.clip(base + weekly + yearly + noise + promo, 0, None)

    return pd.DataFrame({
        "date": dates, "sales": sales,
        "dow": dates.dayofweek, "month": dates.month,
        "day_of_year": dates.dayofyear,
        "is_promo": (promo > 0).astype(int),
    })


# ── 2. Features ────────────────────────────────────────────
def add_features(df):
    out = df.copy()
    for lag in [1, 7, 14]:
        out[f"lag_{lag}"] = out["sales"].shift(lag)
    # shift(1) before rolling keeps the current day's sales out of its own features
    out["roll_7"] = out["sales"].shift(1).rolling(7).mean()
    out["roll_30"] = out["sales"].shift(1).rolling(30).mean()
    return out.dropna().reset_index(drop=True)


FEATURES = ["dow", "month", "day_of_year", "is_promo",
            "lag_1", "lag_7", "lag_14", "roll_7", "roll_30"]


# ── 3. Train & evaluate ───────────────────────────────────
def evaluate(name, y_true, y_pred):
    mae = mean_absolute_error(y_true, y_pred)
    rmse = mean_squared_error(y_true, y_pred) ** 0.5
    r2 = r2_score(y_true, y_pred)
    mape = np.mean(np.abs((y_true - y_pred) / np.where(y_true == 0, 1, y_true))) * 100
    print(f" {name:<22s} MAE={mae:6.2f} RMSE={rmse:6.2f} R²={r2:.3f} MAPE={mape:.2f}%")
    return {"mae": mae, "rmse": rmse, "r2": r2, "mape": mape}


def main():
    print("="*70)
    print("EcoCart Demand Forecasting — LR vs Random Forest")
    print("="*70)

    df = generate_sales()
    df = add_features(df)
    # Chronological split: the last 20% of days form the test set
    split = int(len(df) * 0.8)
    train, test = df.iloc[:split], df.iloc[split:]
    X_tr, y_tr = train[FEATURES], train["sales"]
    X_te, y_te = test[FEATURES], test["sales"]
    print(f"Train: {len(train)} days Test: {len(test)} days")

    lr = LinearRegression().fit(X_tr, y_tr)
    rf = RandomForestRegressor(n_estimators=200, max_depth=12,
                               min_samples_leaf=3, random_state=42,
                               n_jobs=-1).fit(X_tr, y_tr)
    lr_pred = lr.predict(X_te)
    rf_pred = rf.predict(X_te)

    print("\nTest-set metrics:")
    lr_m = evaluate("Linear Regression", y_te.values, lr_pred)
    rf_m = evaluate("Random Forest", y_te.values, rf_pred)

    # ── Plots ──
    plt.rcParams.update({"axes.facecolor":"#0d1117","figure.facecolor":"#0d1117",
                         "text.color":"white","axes.labelcolor":"white",
                         "xtick.color":"white","ytick.color":"white"})

    # Forecast
    fig, ax = plt.subplots(figsize=(13, 5))
    ax.plot(test.date, y_te, color="#e2e8f0", lw=1.3, label="Actual")
    ax.plot(test.date, lr_pred, color="#3b82f6", lw=1, alpha=0.8, label="Linear Regression")
    ax.plot(test.date, rf_pred, color="#10b981", lw=1, alpha=0.8, label="Random Forest")
    ax.set_title("Test-set: actual vs predicted daily demand", fontsize=12)
    ax.set_xlabel("Date")
    ax.set_ylabel("Units sold")
    ax.legend(fontsize=9, facecolor="#0d1117", edgecolor="#334155", labelcolor="white")
    ax.grid(True, alpha=0.1)
    plt.tight_layout()
    plt.savefig("output/forecast.png", dpi=150, bbox_inches="tight")
    plt.close()

    # Residuals
    fig, axes = plt.subplots(1, 2, figsize=(13, 4.5))
    for ax, pred, name, color, m in [
        (axes[0], lr_pred, "Linear Regression", "#3b82f6", lr_m),
        (axes[1], rf_pred, "Random Forest", "#10b981", rf_m),
    ]:
        ax.scatter(pred, y_te.values - pred, s=12, c=color, alpha=0.6)
        ax.axhline(0, color="white", lw=0.8)
        ax.set_title(f"{name} residuals (RMSE={m['rmse']:.2f})", fontsize=11)
        ax.set_xlabel("Predicted")
        ax.set_ylabel("Residual")
        ax.grid(True, alpha=0.1)
    plt.tight_layout()
    plt.savefig("output/residuals.png", dpi=150, bbox_inches="tight")
    plt.close()

    # Feature importance
    imp = pd.Series(rf.feature_importances_, index=FEATURES).sort_values()
    fig, ax = plt.subplots(figsize=(8, 4.5))
    ax.barh(imp.index, imp.values, color="#10b981")
    ax.set_title("Random Forest — feature importance", fontsize=12)
    ax.set_xlabel("Importance")
    ax.grid(True, axis="x", alpha=0.1)
    plt.tight_layout()
    plt.savefig("output/feature_importance.png", dpi=150, bbox_inches="tight")
    plt.close()

    print(f"\nTop features: {', '.join(imp.index[-3:][::-1])}")
    print("Wrote: output/forecast.png, output/residuals.png, output/feature_importance.png")

if __name__ == "__main__":
    main()
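The lag and rolling features in `add_features`, together with the chronological 80/20 split in `main`, are built so that each day's features use only earlier days. A dependency-free sketch of the same idea follows; the helper names `lag_and_rolling` and `chronological_split` and the toy series are assumptions for illustration, and the sketch uses plain lists rather than the pandas calls in the script.

```python
def lag_and_rolling(sales, lags=(1, 7), window=7):
    """Build one feature row per day using only values from earlier days."""
    start = max(max(lags), window)  # first day with every feature available
    rows = []
    for t in range(start, len(sales)):
        row = {"t": t, "target": sales[t]}
        for lag in lags:
            row[f"lag_{lag}"] = sales[t - lag]
        past = sales[t - window:t]  # days t-window .. t-1, excluding day t
        row[f"roll_{window}"] = sum(past) / window
        rows.append(row)
    return rows


def chronological_split(rows, train_frac=0.8):
    """Keep time order: earliest rows for training, latest rows for testing."""
    cut = int(len(rows) * train_frac)
    return rows[:cut], rows[cut:]


# Hypothetical toy series: sales equal to the day index, 20 days long.
sales = [float(v) for v in range(20)]
rows = lag_and_rolling(sales)
train, test = chronological_split(rows)
print(rows[0])
print(len(train), "training rows,", len(test), "test rows")
```

Slicing `sales[t - window:t]` plays the role of `shift(1).rolling(window)` in the script: the window ends the day before the target, so the target never leaks into its own features, and splitting by position rather than at random keeps every test day later than every training day.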