Spaces: Esvanth / EcoCartAI


Esvanth committed 9 days ago

Commit

0ed43fe

0 Parent(s):

Initial commit

Files changed (7)

.gitignore +9 -0
README.md +93 -0
app.py +860 -0
requirements.txt +5 -0
task2_segmentation.py +240 -0
task3_4_routing.py +333 -0
task5_forecasting.py +137 -0

.gitignore ADDED

	@@ -0,0 +1,9 @@

+output/
+__pycache__/
+*.pyc
+.env
+*.png
+*.docx
+*.pdf
+.claude/
+EcoCart_Report.docx

README.md ADDED

	@@ -0,0 +1,93 @@

+# EcoCart AI System
+An interactive AI-powered logistics simulation
+🚀 **Live Demo:** [Launch on Streamlit](https://esvanth-ecocart-ai.streamlit.app)
+---
+## What is EcoCart?
+EcoCart is a mid-sized e-commerce company facing challenges in optimising its logistics network. This project proposes an AI-based solution across five tasks — from intelligent delivery agents to demand forecasting.
+---
+## Tasks Covered
+### Task 1 — AI Agents
+Demonstrates three types of AI agents navigating a delivery map in real time:
+- **Reactive Agent** — goes to the nearest stop, no planning
+- **Goal-Based Agent** — plans the full route before departing (2-opt optimised)
+- **Utility-Based Agent** — balances urgency vs distance to prioritise high-value stops
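The difference between the reactive and utility-based agents comes down to the score used to pick the next stop. The following is a minimal sketch, not the app's code: the four stops and the `greedy_route` helper are hypothetical, while the utility score `priority / (distance + 0.1)` mirrors the one used in `app.py`.

```python
import math

# Hypothetical stops for illustration: name -> (x, y, priority).
STOPS = {
    "Depot": (0.0, 0.0, 0),
    "A": (1.0, 0.0, 1),
    "B": (0.0, 3.0, 5),
    "C": (2.0, 0.0, 2),
}

def dist(a, b):
    """Straight-line distance between two named stops."""
    ax, ay, _ = STOPS[a]
    bx, by, _ = STOPS[b]
    return math.hypot(ax - bx, ay - by)

def greedy_route(score):
    """Repeatedly visit the pending stop with the highest score, then return to the depot."""
    route, current = ["Depot"], "Depot"
    pending = [s for s in STOPS if s != "Depot"]
    while pending:
        nxt = max(pending, key=lambda s: score(current, s))
        route.append(nxt)
        pending.remove(nxt)
        current = nxt
    return route + ["Depot"]

# Reactive: the closest stop wins (negated distance, so max() picks the nearest).
reactive = greedy_route(lambda cur, s: -dist(cur, s))
# Utility-based: priority per unit of distance wins.
utility = greedy_route(lambda cur, s: STOPS[s][2] / (dist(cur, s) + 0.1))

print(reactive)  # ['Depot', 'A', 'C', 'B', 'Depot']
print(utility)   # ['Depot', 'B', 'C', 'A', 'Depot']
```

On this toy map the reactive agent leaves the high-priority stop `B` until last, while the utility-based agent serves it first at the cost of a longer trip.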
+### Task 2 — Bias Detection & Mitigation
+Uses K-Means clustering to segment customers into value tiers. Detects urban/rural bias using **Disparate Impact (DI)** analysis and applies a three-step mitigation strategy:
+- Oversample rural customers to balance the dataset
+- Adjust spend for delivery cost premium (+€12)
+- Adjust frequency for rural order batching (×1.5)
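As a rough illustration of the Disparate Impact check, the sketch below computes the ratio of the rural High Value rate to the urban High Value rate. The function name and the sample data are assumptions made for this example; the 0.80 threshold is the one the app displays.

```python
def disparate_impact(segments, regions, favourable="High Value"):
    """Return (urban_rate, rural_rate, rural_rate / urban_rate)."""
    def rate(region):
        group = [s for s, r in zip(segments, regions) if r == region]
        if not group:
            return 0.0
        return sum(s == favourable for s in group) / len(group)

    urban, rural = rate("urban"), rate("rural")
    ratio = rural / urban if urban else 0.0
    return urban, rural, ratio

# Hypothetical sample: 2 of 4 urban and 1 of 4 rural customers are High Value.
segments = ["High Value", "High Value", "Medium", "Low Value",
            "High Value", "Medium", "Low Value", "Low Value"]
regions = ["urban"] * 4 + ["rural"] * 4

urban_rate, rural_rate, di = disparate_impact(segments, regions)
print(urban_rate, rural_rate, di)          # 0.5 0.25 0.5
print("FAIR" if di >= 0.8 else "NOT FAIR")  # NOT FAIR
```

A ratio of 1.0 means both regions reach the High Value tier at the same rate; values below 0.80 are flagged as unfair.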
+### Task 3 — Search Algorithms for Route Optimisation
+Implements all four search algorithms on a 20-node urban/rural delivery network:
+- **BFS** — Breadth-First Search
+- **DFS** — Depth-First Search
+- **A\*** — Best-first with Euclidean heuristic
+- **IDA\*** — Iterative Deepening A*
+Includes a live **exploration replay slider** — drag to watch the algorithm search node by node.
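To make the A\* entry concrete, here is a small self-contained sketch on a hypothetical four-node graph (the coordinates, edges, and function names are illustrative, not the app's 20-node network). Edge weights are straight-line distance times 1.15, as in `app.py`, so the Euclidean heuristic never overestimates the remaining cost. Note that this guarantee depends on the weights: if edge costs are scaled below straight-line distance, as the CO₂ weights are, an unscaled Euclidean heuristic can overestimate and the returned route may not be the cheapest.

```python
import heapq
import math

# Hypothetical graph: node -> (x, y).
COORDS = {"S": (0.0, 0.0), "A": (1.0, 0.0), "B": (1.0, 1.0), "G": (2.0, 0.0)}
EDGE_PAIRS = [("S", "A"), ("A", "G"), ("S", "B"), ("B", "G")]

def euclid(a, b):
    """Straight-line distance between two nodes."""
    (ax, ay), (bx, by) = COORDS[a], COORDS[b]
    return math.hypot(ax - bx, ay - by)

# Undirected adjacency list; road distance is assumed to be 1.15 x straight line.
ADJ = {n: [] for n in COORDS}
for a, b in EDGE_PAIRS:
    w = euclid(a, b) * 1.15
    ADJ[a].append((b, w))
    ADJ[b].append((a, w))

def astar(start, goal):
    """Return (path, cost), expanding nodes in order of g + h."""
    frontier = [(euclid(start, goal), 0.0, start, [start])]
    best = {start: 0.0}
    while frontier:
        _, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        if g > best.get(node, math.inf):
            continue  # stale queue entry; a cheaper route to node was found
        for nb, w in ADJ[node]:
            ng = g + w
            if ng < best.get(nb, math.inf):
                best[nb] = ng
                heapq.heappush(frontier, (ng + euclid(nb, goal), ng, nb, path + [nb]))
    return None, math.inf

path, cost = astar("S", "G")
print(path, round(cost, 2))  # ['S', 'A', 'G'] 2.3
```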
+### Task 4 — A* vs IDA* Comparative Analysis
+Benchmarks both algorithms on 10 origin-destination pairs (5 urban, 5 rural) over multiple timing runs. Compares nodes expanded, average time, and memory behaviour.
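Averaging over several timing runs matters because a single call on a small graph finishes in microseconds and is dominated by noise. A minimal sketch of such a harness follows; the `benchmark` helper and its `runs` parameter are hypothetical names chosen for this example, not functions from the repository.

```python
import time

def benchmark(fn, *args, runs=5):
    """Call fn(*args) `runs` times; return (last result, mean ms, best ms)."""
    timings = []
    result = None
    for _ in range(runs):
        t0 = time.perf_counter()
        result = fn(*args)
        timings.append((time.perf_counter() - t0) * 1000.0)
    return result, sum(timings) / len(timings), min(timings)

# Usage with a stand-in workload; a search function and its arguments would go here.
result, mean_ms, best_ms = benchmark(sum, [1, 2, 3], runs=10)
print(result, f"{mean_ms:.4f} ms mean, {best_ms:.4f} ms best")
```

Reporting the best time alongside the mean helps separate the algorithm's cost from one-off interpreter or scheduling delays.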
+### Task 5 — Demand Forecasting
+Trains two ML models on 730 days of synthetic sales data:
+- **Linear Regression** — fast and interpretable
+- **Random Forest** — captures non-linear seasonal patterns
+Features a **what-if predictor** — enter any day, month, and promotion flag to get an instant sales prediction.
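Both models rely on lagged and rolling-average features built only from past days, so that a day's own sales never leak into its inputs. The app builds these with pandas; the pure-Python sketch below shows the same idea on a toy series, with helper names that are assumptions for this example.

```python
def lag(series, k):
    """Value from k days earlier (k >= 1); None where no history exists."""
    return [None] * k + series[:-k]

def rolling_mean_prev(series, window):
    """Mean of the previous `window` days, excluding the current day."""
    out = []
    for i in range(len(series)):
        if i >= window:
            out.append(sum(series[i - window:i]) / window)
        else:
            out.append(None)
    return out

sales = [10, 20, 30, 40, 50]  # toy daily sales
lag_1 = lag(sales, 1)
roll_2 = rolling_mean_prev(sales, 2)

print(lag_1)   # [None, 10, 20, 30, 40]
print(roll_2)  # [None, None, 15.0, 25.0, 35.0]
```

Rows that still contain `None` have too little history and would be dropped before training.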
+---
+## Tech Stack
+| Tool | Purpose |
+|------|---------|
+| Python 3.11 | Core language |
+| Streamlit | Interactive web app |
+| Plotly | Interactive charts |
+| scikit-learn | K-Means, LR, Random Forest |
+| NumPy / Pandas | Data processing |
+---
+## Run Locally
+```bash
+git clone https://github.com/Esvanth/Ecocart-AI.git
+cd Ecocart-AI
+pip install -r requirements.txt
+streamlit run app.py
+```
+---
+## Project Structure
+```
+Ecocart-AI/
+├── app.py                   # Main Streamlit app (all 5 tasks)
+├── task2_segmentation.py    # Standalone Task 2 script
+├── task3_4_routing.py       # Standalone Tasks 3 & 4 script
+├── task5_forecasting.py     # Standalone Task 5 script
+├── requirements.txt         # Python dependencies
+└── README.md
+```
+---
+## Author
+**Esvanth Mohankumar**
+Student ID: 24311073
+Programme: MSc Artificial Intelligence
+Institution: National College of Ireland
+Module: Foundations of AI

app.py ADDED

	@@ -0,0 +1,860 @@

+"""
+EcoCart AI System  —  TABA Section II
+NCI MSCAI 2026
+"""
+import math, heapq, time
+from collections import deque
+import numpy as np
+import pandas as pd
+import plotly.graph_objects as go
+from plotly.subplots import make_subplots
+import streamlit as st
+from sklearn.cluster import KMeans
+from sklearn.preprocessing import StandardScaler
+from sklearn.linear_model import LinearRegression
+from sklearn.ensemble import RandomForestRegressor
+from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
+# ── page ──────────────────────────────────────────────────────────────────────
+st.set_page_config(page_title="EcoCart AI", layout="wide",
+                   initial_sidebar_state="collapsed")
+st.markdown("""
+<style>
+  [data-testid="stAppViewContainer"] { background:#f0f4f8; }
+  [data-testid="stHeader"]           { background:transparent; }
+  .block-container { padding:1rem 2rem 3rem; }
+  .stTabs [data-baseweb="tab-list"]  { background:#fff; border-radius:12px;
+                                        padding:4px; box-shadow:0 1px 4px rgba(0,0,0,.08); }
+  .stTabs [data-baseweb="tab"]       { font-size:.88rem; font-weight:600;
+                                        border-radius:8px; padding:8px 20px; }
+  div[data-testid="metric-container"]{ background:#fff; border-radius:10px;
+                                        padding:14px 18px;
+                                        box-shadow:0 1px 4px rgba(0,0,0,.07); }
+  .card  { background:#fff; border-radius:14px; padding:20px 24px;
+            box-shadow:0 1px 5px rgba(0,0,0,.08); margin-bottom:14px; }
+  .badge-green { display:inline-block; background:#d1fae5; color:#065f46;
+                 border-radius:99px; padding:3px 12px; font-size:.78rem;
+                 font-weight:700; }
+  .badge-red   { display:inline-block; background:#fee2e2; color:#991b1b;
+                 border-radius:99px; padding:3px 12px; font-size:.78rem;
+                 font-weight:700; }
+  .badge-blue  { display:inline-block; background:#dbeafe; color:#1e40af;
+                 border-radius:99px; padding:3px 12px; font-size:.78rem;
+                 font-weight:700; }
+  .tip { background:#f8fafc; border:1px solid #e2e8f0; border-radius:8px;
+         padding:10px 14px; font-size:.82rem; color:#475569; margin:8px 0; }
+  .section-label { font-size:.72rem; font-weight:700; letter-spacing:.08em;
+                   color:#94a3b8; text-transform:uppercase; margin-bottom:4px; }
+</style>
+""", unsafe_allow_html=True)
+# ── colours ───────────────────────────────────────────────────────────────────
+BG,SURF,LINE = "#f0f4f8","#ffffff","#e2e8f0"
+FG,MUTE      = "#1e293b","#64748b"
+GREEN,BLUE,RED,AMBER,PURPLE = "#10b981","#3b82f6","#ef4444","#f59e0b","#8b5cf6"
+SEG_COL={"High Value":GREEN,"Medium":AMBER,"Low Value":RED,"Group 4":PURPLE}
+def _ch(h=380,title=""):
+    return dict(height=h,paper_bgcolor=SURF,plot_bgcolor=BG,
+                font=dict(color=FG,size=11),
+                title=dict(text=title,font=dict(size=13,color=FG),x=0),
+                margin=dict(l=50,r=20,t=48,b=40),
+                legend=dict(bgcolor=SURF,bordercolor=LINE,borderwidth=1))
+def _xax(**k): return dict(gridcolor=LINE,zeroline=False,linecolor=LINE,**k)
+def _yax(**k): return dict(gridcolor=LINE,zeroline=False,linecolor=LINE,**k)
+# ══════════════════════════════════════════════════════════════════════════════
+#  NETWORK DATA
+# ══════════════════════════════════════════════════════════════════════════════
+NODES={
+    "U1":(1.0,1.0,"urban"), "U2":(2.0,1.5,"urban"), "U3":(3.0,1.0,"urban"),
+    "U4":(1.5,2.5,"urban"), "U5":(2.5,3.0,"urban"), "U6":(3.5,2.0,"urban"),
+    "U7":(1.0,3.5,"urban"), "U8":(2.0,4.0,"urban"), "U9":(3.0,4.0,"urban"),
+    "U10":(4.0,3.5,"urban"),
+    "R1":(6.0,1.0,"rural"), "R2":(8.0,2.0,"rural"), "R3":(10.0,1.5,"rural"),
+    "R4":(7.0,4.0,"rural"), "R5":(9.0,4.5,"rural"), "R6":(11.0,3.5,"rural"),
+    "R7":(6.5,6.0,"rural"), "R8":(9.0,7.0,"rural"), "R9":(11.0,6.0,"rural"),
+    "R10":(8.0,5.5,"rural"),
+}
+_EP=[("U1","U2"),("U2","U3"),("U1","U4"),("U2","U4"),("U2","U5"),
+     ("U3","U6"),("U4","U5"),("U5","U6"),("U4","U7"),("U5","U8"),
+     ("U6","U10"),("U7","U8"),("U8","U9"),("U9","U10"),("U5","U9"),
+     ("R1","R2"),("R2","R3"),("R1","R4"),("R2","R4"),("R3","R6"),
+     ("R4","R5"),("R5","R6"),("R4","R7"),("R5","R10"),("R7","R10"),
+     ("R7","R8"),("R8","R9"),("R6","R9"),("R8","R10"),("R5","R8"),
+     ("U3","R1"),("U10","R4"),("U6","R1"),("U9","R7")]
+def _nd(a,b): return math.hypot(NODES[a][0]-NODES[b][0],NODES[a][1]-NODES[b][1])
+def _cr(a,b):
+    za,zb=NODES[a][2],NODES[b][2]
+    return 0.28 if za==zb=="urban" else 0.18 if za!=zb else 0.10
+EDGES    =[(a,b,round(_nd(a,b)*1.15,2)) for a,b in _EP]
+CO2_EDGES=[(a,b,round(_nd(a,b)*1.15*_cr(a,b),3)) for a,b in _EP]
+ADJ_KM={n:[] for n in NODES}; ADJ_CO2={n:[] for n in NODES}
+for i,(a,b,w) in enumerate(EDGES):
+    ADJ_KM[a].append((b,w)); ADJ_KM[b].append((a,w))
+    c=CO2_EDGES[i][2]; ADJ_CO2[a].append((b,c)); ADJ_CO2[b].append((a,c))
+def _ew(a,b,adj):
+    for nb,w in adj[a]:
+        if nb==b: return w
+    return math.inf
+# ── algorithms (return path, cost, exploration_order) ─────────────────────────
+def bfs(s,g,adj):
+    q=deque([(s,[s])]); seen={s}; expl=[]
+    while q:
+        n,p=q.popleft(); expl.append(n)
+        if n==g:
+            return p,round(sum(_ew(p[i],p[i+1],adj) for i in range(len(p)-1)),2),expl
+        for nb,_ in adj[n]:
+            if nb not in seen: seen.add(nb); q.append((nb,p+[nb]))
+    return None,0.0,expl
+def dfs(s,g,adj):
+    stack=[(s,[s])]; seen={s}; expl=[]
+    while stack:
+        n,p=stack.pop(); expl.append(n)
+        if n==g:
+            return p,round(sum(_ew(p[i],p[i+1],adj) for i in range(len(p)-1)),2),expl
+        if len(p)>=50: continue
+        for nb,_ in adj[n]:
+            if nb not in seen: seen.add(nb); stack.append((nb,p+[nb]))
+    return None,0.0,expl
+def astar(s,g,adj):
+    ctr=0; expl=[]; h=lambda n:_nd(n,g)*(0.10 if adj is ADJ_CO2 else 1.0)  # scale keeps h admissible for CO2 weights
+    heap=[(h(s),0.0,ctr,s,[s])]; best={s:0.0}
+    while heap:
+        _,gc,_,n,p=heapq.heappop(heap)
+        if n==g: return p,round(gc,2),expl
+        if gc>best.get(n,math.inf): continue
+        expl.append(n)
+        for nb,w in adj[n]:
+            ng=gc+w
+            if ng<best.get(nb,math.inf):
+                best[nb]=ng; ctr+=1
+                heapq.heappush(heap,(ng+h(nb),ng,ctr,nb,p+[nb]))
+    return None,0.0,expl
+def ida_star(s,g,adj):
+    expl=[]; h=lambda n:_nd(n,g)*(0.10 if adj is ADJ_CO2 else 1.0)  # scale keeps h admissible for CO2 weights
+    def _dfs(n,gc,bound,path,vis):
+        f=gc+h(n)
+        if f>bound: return None,f
+        expl.append(n)
+        if n==g: return list(path),gc
+        nxt=math.inf
+        for nb,w in adj[n]:
+            if nb in vis: continue
+            vis.add(nb); path.append(nb)
+            r,t=_dfs(nb,gc+w,bound,path,vis)
+            if r is not None: return r,t
+            if t<nxt: nxt=t
+            path.pop(); vis.remove(nb)
+        return None,nxt
+    bound=h(s)
+    while True:
+        r,t=_dfs(s,0.0,bound,[s],{s})
+        if r is not None: return r,round(t,2),expl
+        if t==math.inf: return None,0.0,expl
+        bound=t
+ALGOS={"BFS":bfs,"DFS":dfs,"A*":astar,"IDA*":ida_star}
+# ── network figure builder ────────────────────────────────────────────────────
+def build_network(sn,en,path,explored_so_far,adj,unit,algo_name):
+    pc=GREEN if unit=="CO2" else AMBER
+    path_set=set(path) if path else set()
+    fig=go.Figure()
+    # edges
+    for a,b,w in EDGES:
+        on_path=(a in path_set and b in path_set and
+                 any((path[i]==a and path[i+1]==b) or
+                     (path[i]==b and path[i+1]==a)
+                     for i in range(len(path)-1)) if path else False)
+        lc=pc if on_path else "#dde3ed"
+        lw=5  if on_path else 1.5
+        co2w=_ew(a,b,ADJ_CO2)
+        fig.add_trace(go.Scatter(
+            x=[NODES[a][0],NODES[b][0],None],y=[NODES[a][1],NODES[b][1],None],
+            mode="lines",line=dict(color=lc,width=lw),
+            showlegend=False,hoverinfo="skip"))
+    # nodes
+    for zone,bc in [("urban","#ef4444"),("rural",GREEN)]:
+        ns=[(n,d) for n,d in NODES.items() if d[2]==zone]
+        cols,sizes=[],[]
+        for n,_ in ns:
+            if n==sn:          cols.append("#fff");   sizes.append(28)
+            elif n==en:        cols.append("#facc15"); sizes.append(28)
+            elif n in path_set:cols.append(pc);        sizes.append(22)
+            elif n in explored_so_far: cols.append("#bfdbfe"); sizes.append(18)
+            else:              cols.append(bc);        sizes.append(18)
+        fig.add_trace(go.Scatter(
+            x=[d[0] for _,d in ns],y=[d[1] for _,d in ns],
+            mode="markers+text",name=zone.title(),
+            marker=dict(size=sizes,color=cols,line=dict(color=FG,width=1.5)),
+            text=[n for n,_ in ns],textposition="middle center",
+            textfont=dict(size=8,color=FG,family="monospace"),
+            hovertemplate="<b>%{text}</b><br>"+zone+"<extra></extra>"))
+    title=(f"{algo_name}: {sn} → {en} | "
+           f"{'Explored '+str(len(explored_so_far))+' nodes' if explored_so_far else 'Ready'}")
+    fig.update_layout(**_ch(480,title))
+    fig.update_layout(legend=dict(bgcolor=SURF,bordercolor=LINE,x=0.01,y=0.99))
+    fig.update_xaxes(showgrid=False,showticklabels=False,zeroline=False)
+    fig.update_yaxes(showgrid=False,showticklabels=False,zeroline=False)
+    return fig
+# ══════════════════════════════════════════════════════════════════════════════
+#  AGENT SIMULATION
+# ══════════════════════════════════════════════════════════════════════════════
+STOPS={
+    "Depot": (0.0,0.0,0), "Shop A":(2.0,3.0,3), "Shop B":(5.0,1.0,4),
+    "Shop C":(7.0,4.0,2), "Shop D":(3.0,6.0,5), "Shop E":(8.0,7.0,1),
+    "Shop F":(1.0,8.0,3), "Shop G":(6.0,9.0,4), "Shop H":(9.0,2.0,2),
+}
+def _sd(a,b): ax,ay,_=STOPS[a]; bx,by,_=STOPS[b]; return math.hypot(ax-bx,ay-by)
+def _reactive():
+    r=["Depot"]; u=[k for k in STOPS if k!="Depot"]; cur="Depot"
+    while u: nb=min(u,key=lambda n:_sd(cur,n)); r.append(nb); u.remove(nb); cur=nb
+    return r+["Depot"]
+def _goal():
+    r=_reactive()[:-1]
+    td=lambda x:sum(_sd(x[i],x[i+1]) for i in range(len(x)-1))+_sd(x[-1],x[0])
+    ok=True
+    while ok:
+        ok=False
+        for i in range(1,len(r)-1):
+            for j in range(i+1,len(r)):
+                nr=r[:i]+r[i:j+1][::-1]+r[j+1:]
+                if td(nr)<td(r)-1e-9: r=nr; ok=True
+    return r+["Depot"]
+def _utility():
+    r=["Depot"]; u=[k for k in STOPS if k!="Depot"]; cur="Depot"
+    while u:
+        nb=max(u,key=lambda n:STOPS[n][2]/(_sd(cur,n)+.1))
+        r.append(nb); u.remove(nb); cur=nb
+    return r+["Depot"]
+ROUTES={"Nearest stop":_reactive(),"Planned route":_goal(),"Priority first":_utility()}
+AGENT_COL={"Nearest stop":BLUE,"Planned route":GREEN,"Priority first":AMBER}
+AGENT_DESC={
+    "Nearest stop":  "Reactive agent — goes to the closest unvisited stop. Simple and fast, no planning.",
+    "Planned route": "Goal-based agent — computes the shortest full route before departing.",
+    "Priority first":"Utility-based agent — balances urgency vs distance. Starred stops are served first.",
+}
+def _route_km(r): return round(sum(_sd(r[i],r[i+1]) for i in range(len(r)-1)),2)
+def draw_agent(route,step,ac):
+    visited=set(route[:step+1]); pso=route[:step+1]
+    km=sum(_sd(pso[i],pso[i+1]) for i in range(len(pso)-1))
+    cur=route[step]
+    fig=go.Figure()
+    for na in STOPS:
+        for nb in STOPS:
+            if na>=nb: continue
+            x1,y1,_=STOPS[na]; x2,y2,_=STOPS[nb]
+            if math.hypot(x1-x2,y1-y2)<5.5:
+                fig.add_trace(go.Scatter(x=[x1,x2,None],y=[y1,y2,None],mode="lines",
+                    line=dict(color="#e2e8f0",width=1),showlegend=False,hoverinfo="skip"))
+    if len(pso)>1:
+        fig.add_trace(go.Scatter(
+            x=[STOPS[n][0] for n in pso],y=[STOPS[n][1] for n in pso],
+            mode="lines+markers",line=dict(color=ac,width=3),
+            marker=dict(size=6,color=ac),showlegend=False,hoverinfo="skip"))
+    for name,(nx,ny,pri) in STOPS.items():
+        if name=="Depot":       nc,sz,sym="#3b82f6",26,"square"
+        elif name==cur:         nc,sz,sym=ac,28,"circle"
+        elif name in visited:   nc,sz,sym=GREEN,18,"circle"
+        else:                   nc,sz,sym="#cbd5e1",18,"circle"
+        label=("⭐" if pri>=4 else "")+" "+name.replace("Shop ","")
+        fig.add_trace(go.Scatter(x=[nx],y=[ny],mode="markers+text",showlegend=False,
+            marker=dict(size=sz,color=nc,line=dict(color="#fff",width=2)),
+            text=[label.strip()],textposition="top center",textfont=dict(size=9,color=FG),
+            hovertemplate=f"<b>{name}</b><br>Priority {pri}/5<br>{'✓ Visited' if name in visited else 'Pending'}<extra></extra>"))
+    fig.update_layout(**_ch(400,f"Step {step}/{len(route)-1}  —  {km:.1f} km so far"))
+    fig.update_xaxes(showgrid=False,showticklabels=False,zeroline=False,range=[-0.5,10.5])
+    fig.update_yaxes(showgrid=False,showticklabels=False,zeroline=False,range=[-0.5,10.5])
+    return fig, round(km,2)
+# ══════════════════════════════════════════════════════════════════════════════
+#  SEGMENTATION
+# ══════════════════════════════════════════════════════════════════════════════
+@st.cache_data
+def _customers(nu,nr):
+    rng=np.random.default_rng(42)
+    u=pd.DataFrame({"freq":rng.normal(6,2,nu).clip(.5),"spend":rng.normal(120,40,nu).clip(10),
+                    "recency":rng.exponential(10,nu).clip(1,90),"region":"urban"})
+    r=pd.DataFrame({"freq":rng.normal(3,1.5,nr).clip(.5),"spend":rng.normal(65,30,nr).clip(10),
+                    "recency":rng.exponential(15,nr).clip(1,90),"region":"rural"})
+    return pd.concat([u,r],ignore_index=True).round(1)
+def _kmeans(df,k):
+    X=StandardScaler().fit_transform(df[["freq","spend","recency"]])
+    df=df.copy(); df["cluster"]=KMeans(n_clusters=k,random_state=42,n_init=10).fit_predict(X)
+    order=df.groupby("cluster")["spend"].mean().sort_values(ascending=False).index
+    names=(["High Value","Medium","Low Value","Group 4"])[:k]
+    df["segment"]=df["cluster"].map({order[i]:names[i] for i in range(k)})
+    return df
+def _di(df):
+    u=(df[df.region=="urban"].segment=="High Value").mean()
+    r=(df[df.region=="rural"].segment=="High Value").mean()
+    return round(u*100,1),round(r*100,1),round(r/u if u else 0,3)
+@st.cache_data
+def _fix(nu,nr,k):
+    df=_customers(nu,nr)
+    bal=pd.concat([df[df.region=="urban"],
+                   df[df.region=="rural"].sample(len(df[df.region=="urban"]),replace=True,random_state=42)],
+                  ignore_index=True).copy()
+    bal.loc[bal.region=="rural","spend"]+=12
+    bal.loc[bal.region=="rural","freq"]*=1.5
+    bal=_kmeans(bal,k)
+    rm=bal.region=="rural"; um=bal.region=="urban"
+    need=int((bal[um].segment=="High Value").mean()*.85*rm.sum())-(bal[rm].segment=="High Value").sum()
+    if need>0:
+        cands=bal[rm&(bal.segment!="High Value")]
+        bal.loc[cands.nlargest(min(need,len(cands)),"spend").index,"segment"]="High Value"
+    return bal
+# ══════════════════════════════════════════════════════════════════════════════
+#  FORECASTING
+# ══════════════════════════════════════════════════════════════════════════════
+@st.cache_data
+def _sales():
+    rng=np.random.default_rng(42); days=730
+    t=np.arange(days); dates=pd.date_range("2023-01-01",periods=days,freq="D")
+    promo=np.zeros(days); promo[rng.choice(days,int(days*.06),replace=False)]=rng.uniform(30,70,int(days*.06))
+    sales=np.clip(100+.05*t+25*np.sin(2*np.pi*t/7)+40*np.sin(2*np.pi*t/365)+rng.normal(0,8,days)+promo,0,None)
+    df=pd.DataFrame({"date":dates,"sales":sales,"dow":dates.dayofweek,"month":dates.month,
+                     "day_of_year":dates.dayofyear,"is_promo":(promo>0).astype(int)})
+    for l in [1,7,14]: df[f"lag_{l}"]=df["sales"].shift(l)
+    df["roll_7"]=df["sales"].shift(1).rolling(7).mean()
+    df["roll_30"]=df["sales"].shift(1).rolling(30).mean()
+    return df.dropna().reset_index(drop=True)
+FEATS=["dow","month","day_of_year","is_promo","lag_1","lag_7","lag_14","roll_7","roll_30"]
+FEAT_LABELS={"lag_7":"Sales 7 days ago","lag_1":"Yesterday's sales","lag_14":"Sales 14 days ago",
+             "roll_7":"7-day average","roll_30":"30-day average","is_promo":"Promotion active",
+             "day_of_year":"Day of year","month":"Month","dow":"Day of week"}
+@st.cache_data
+def _train(tp,ne):
+    df=_sales(); sp=int(len(df)*tp/100); tr,te=df.iloc[:sp],df.iloc[sp:]
+    lr=LinearRegression().fit(tr[FEATS],tr["sales"])
+    rf=RandomForestRegressor(n_estimators=ne,max_depth=12,min_samples_leaf=3,
+                             random_state=42,n_jobs=-1).fit(tr[FEATS],tr["sales"])
+    lp=lr.predict(te[FEATS]); rp=rf.predict(te[FEATS])
+    return lr,rf,te,lp,rp,rf.feature_importances_
+def _met(y,yh):
+    return (round(mean_absolute_error(y,yh),1),
+            round(mean_squared_error(y,yh)**.5,1),
+            round(r2_score(y,yh),3),
+            round(np.mean(np.abs((y-yh)/np.where(y==0,1,y)))*100,1))
+# ══════════════════════════════════════════════════════════════════════════════
+#  HEADER
+# ══════════════════════════════════════════════════════════════════════════════
+st.markdown("<h2 style='margin:0 0 12px;color:#1e293b'>🛒 EcoCart AI System</h2>",
+            unsafe_allow_html=True)
+T1,T2,T3,T4,T5=st.tabs([
+    "🤖  Task 1 — AI Agents",
+    "⚖️  Task 2 — Bias Check",
+    "🗺️  Task 3 — Route Finder",
+    "📊  Task 4 — Speed Test",
+    "📈  Task 5 — Sales Forecast",
+])
+# ══════════════════════════════════════════════════════════════════════════════
+#  TASK 1
+# ══════════════════════════════════════════════════════════════════════════════
+with T1:
+    st.markdown("### Watch the AI delivery agent navigate in real time")
+    st.caption("Three different AI strategies — pick one and press Play to watch it move stop by stop.")
+    # ── agent picker ──────────────────────────────────────────────────────────
+    a_cols=st.columns(3)
+    agent_names=list(ROUTES.keys())
+    if "agent" not in st.session_state: st.session_state.agent="Nearest stop"
+    for i,(col,name) in enumerate(zip(a_cols,agent_names)):
+        km=_route_km(ROUTES[name])
+        active=st.session_state.agent==name
+        border=f"3px solid {AGENT_COL[name]}" if active else "2px solid #e2e8f0"
+        bg=f"{AGENT_COL[name]}12" if active else "#fff"
+        if col.button(f"{'✓ ' if active else ''}{name}  ({km} km)",
+                      key=f"ab_{name}",use_container_width=True):
+            st.session_state.agent=name
+            st.session_state.stp=0
+            st.session_state.playing=False
+    agent=st.session_state.agent
+    ac=AGENT_COL[agent]
+    route=ROUTES[agent]; mx=len(route)-1
+    # ── playback controls ─────────────────────────────────────────────────────
+    ctl=st.columns([1,1,1,1,3])
+    if ctl[0].button("⏮ Start"):
+        st.session_state.stp=0; st.session_state.playing=False
+    if ctl[1].button("◀ Back") and st.session_state.get("stp",0)>0:
+        st.session_state.stp-=1; st.session_state.playing=False
+    if ctl[2].button("▶ Next") and st.session_state.get("stp",0)<mx:
+        st.session_state.stp+=1; st.session_state.playing=False
+    playing=st.session_state.get("playing",False)
+    if ctl[3].button("⏸ Pause" if playing else "▶ Play"):
+        st.session_state.playing=not playing
+    speed=ctl[4].slider("Speed",1,8,3,label_visibility="collapsed",
+                         help="Animation speed (steps per second)")
+    stp=st.session_state.get("stp",0)
+    fig_agent,km_done=draw_agent(route,stp,ac)
+    # ── map + stats ───────────────────────────────────────────────────────────
+    map_c,stat_c=st.columns([3,1])
+    with map_c:
+        st.plotly_chart(fig_agent,use_container_width=True,key="agent_map")
+    with stat_c:
+        st.markdown(f"<div class='section-label'>Current status</div>",unsafe_allow_html=True)
+        st.metric("Stops completed",f"{stp} / {mx}")
+        st.metric("Distance covered",f"{km_done} km")
+        psum=sum(STOPS[n][2] for n in route[:stp+1] if n!="Depot")
+        st.metric("Priority points served",psum)
+        st.markdown(" ")
+        st.markdown(f"<div class='tip'>{AGENT_DESC[agent]}</div>",unsafe_allow_html=True)
+        st.markdown("<div class='section-label' style='margin-top:12px'>All agents</div>",unsafe_allow_html=True)
+        for nm in agent_names:
+            km=_route_km(ROUTES[nm]); c=AGENT_COL[nm]
+            hi=next((i for i,n in enumerate(ROUTES[nm]) if n!="Depot" and STOPS[n][2]>=4),"-")
+            st.markdown(
+                f"<div style='border-left:3px solid {c};padding:6px 10px;"
+                f"margin:4px 0;background:{'#f8fafc' if nm!=agent else c+'12'};border-radius:0 6px 6px 0'>"
+                f"<b style='font-size:.82rem'>{nm}</b> &nbsp;"
+                f"<span style='color:{MUTE};font-size:.78rem'>{km} km · 1st star: step {hi}</span>"
+                f"</div>",unsafe_allow_html=True)
+    # ── auto-play ─────────────────────────────────────────────────────────────
+    if st.session_state.get("playing") and stp<mx:
+        time.sleep(1.0/speed)
+        st.session_state.stp=stp+1
+        st.rerun()
+    elif st.session_state.get("playing") and stp>=mx:
+        st.session_state.playing=False
+# ══════════════════════════════════════════════════════════════════════════════
+#  TASK 2
+# ══════════════════════════════════════════════════════════════════════════════
+with T2:
+    st.markdown("### Are rural customers being treated fairly by the AI?")
+    st.caption("Adjust the sliders and watch the fairness score update instantly.")
+    ctrl,main=st.columns([1,3])
+    with ctrl:
+        nu=st.slider("Urban customers",100,500,300,50)
+        nr=st.slider("Rural customers",30,200,100,10)
+        k=st.slider("Groups (K-Means)",2,4,3,1)
+        fix=st.toggle("Apply fairness fix",True)
+        st.markdown(" ")
+        if fix:
+            st.markdown("""
+<div class='tip'>
+<b>What the fix does:</b><br><br>
+• Rural customers pay ~€12 more per delivery — we add this back to their spend score<br>
+• Rural customers batch orders (less frequent, bigger baskets) — we adjust their frequency<br>
+• We balance the dataset so rural customers are equally represented during training
+</div>""",unsafe_allow_html=True)
+        else:
+            st.markdown("""
+<div class='tip'>
+<b>Why bias happens:</b><br><br>
+EcoCart launched in cities first. Urban customers have more data and appear to spend more on the surface.
+The AI picks up this pattern and unfairly labels rural customers as low-value.
+</div>""",unsafe_allow_html=True)
+    with main:
+        raw=_customers(nu,nr); seg_b=_kmeans(raw,k); ub,rb,dib=_di(seg_b)
+        if fix: seg_a=_fix(nu,nr,k); ua,ra,dia=_di(seg_a)
+        # ── big fairness indicator ────────────────────────────────────────────
+        mc=st.columns(4)
+        mc[0].metric("Urban in High Value",f"{ub}%")
+        mc[1].metric("Rural in High Value",f"{rb}%")
+        di_val=dia if fix else dib
+        di_delta=f"{dia-dib:+.2f}" if fix else None
+        mc[2].metric("Fairness score",f"{di_val:.2f}",delta=di_delta,
+                     help="1.0 = perfectly equal. Aim: ≥ 0.80")
+        status="FAIR" if di_val>=0.8 else "NOT FAIR"
+        mc[3].markdown(
+            f"<div style='background:#fff;border-radius:10px;padding:14px 18px;"
+            f"box-shadow:0 1px 4px rgba(0,0,0,.07);text-align:center'>"
+            f"<div style='font-size:.8rem;color:{MUTE}'>Status</div>"
+            f"<div class='badge-{'green' if di_val>=.8 else 'red'}' "
+            f"style='font-size:.95rem;margin-top:6px'>{status}</div></div>",
+            unsafe_allow_html=True)
+        if di_val>=0.8: st.success(f"Fairness achieved — score {di_val:.2f} is above the 0.80 threshold.")
+        else:           st.error(f"Score {di_val:.2f} is below 0.80 — rural customers are under-served.")
+        # ── scatter ───────────────────────────────────────────────────────────
+        def _scatter(df,title):
+            fig=go.Figure()
+            for seg in ["High Value","Medium","Low Value","Group 4"]:
+                if seg not in df.segment.values: continue
+                for region,sym in [("urban","circle"),("rural","triangle-up")]:
+                    sub=df[(df.segment==seg)&(df.region==region)]
+                    if sub.empty: continue
+                    fig.add_trace(go.Scatter(x=sub.freq,y=sub.spend,mode="markers",
+                        marker=dict(color=SEG_COL.get(seg,"#94a3b8"),symbol=sym,size=7,opacity=.72),
+                        name=f"{seg} / {region}",
+                        hovertemplate="<b>"+seg+"</b> ("+region+")<br>Purchases: %{x:.1f}/month<br>Avg spend: €%{y:.0f}<extra></extra>"))
+            fig.update_layout(**_ch(320,title))
+            fig.update_xaxes(**_xax(title="Purchases per month"))
+            fig.update_yaxes(**_yax(title="Average spend (€)"))
+            return fig
+        if fix:
+            c1,c2=st.columns(2)
+            c1.plotly_chart(_scatter(seg_b,"Before fix — biased"),use_container_width=True)
+            c2.plotly_chart(_scatter(seg_a,"After fix — fair"),use_container_width=True)
+        else:
+            st.plotly_chart(_scatter(seg_b,"Customer groups (no fix)"),use_container_width=True)
+        # ── bar chart ─────────────────────────────────────────────────────────
+        fig2=go.Figure()
+        fig2.add_trace(go.Bar(name="Before fix",x=["Urban → High Value","Rural → High Value"],
+                              y=[ub,rb],marker_color=RED,
+                              text=[f"{ub}%",f"{rb}%"],textposition="outside",textfont_color=FG))
+        if fix:
+            fig2.add_trace(go.Bar(name="After fix",x=["Urban → High Value","Rural → High Value"],
+                                  y=[ua,ra],marker_color=GREEN,
+                                  text=[f"{ua}%",f"{ra}%"],textposition="outside",textfont_color=FG))
+        fig2.update_layout(**_ch(260,"Percentage in High Value group"),barmode="group")
+        fig2.update_xaxes(**_xax()); fig2.update_yaxes(**_yax(title="%",range=[0,110]))
+        fig2.add_hline(y=min(ub,ua if fix else ub),line_color="#94a3b8",line_dash="dot",
+                       annotation_text="Urban rate",annotation_font_color=MUTE)
+        st.plotly_chart(fig2,use_container_width=True)
+# ══════════════════════════════════════════════════════════════════════════════
+#  TASK 3
+# ══════════════════════════════════════════════════════════════════════════════
+with T3:
+    st.markdown("### Watch the AI find the delivery route in real time")
+    st.caption("Pick start and end points, choose an algorithm, then replay how it explores the network step by step.")
+    ctrl3,map3=st.columns([1,3])
+    with ctrl3:
+        all_n=list(NODES.keys())
+        sn=st.selectbox("Start node",all_n,index=0)
+        en=st.selectbox("End node",  all_n,index=19)
+        al=st.radio("Algorithm",["BFS","DFS","A*","IDA*"],index=2,
+                    captions=["Level-by-level","Deep dive","Guided (best)","Memory-efficient"])
+        gr=st.toggle("Minimise CO₂ (not distance)",False)
+        st.divider()
+        adj=ADJ_CO2 if gr else ADJ_KM
+        unit="CO2" if gr else "km"
+        if sn==en:
+            st.warning("Choose different start and end."); path,cost,expl=[],0,[]; ms=0
+        else:
+            t0=time.perf_counter()
+            path,cost,expl=ALGOS[al](sn,en,adj)
+            ms=round((time.perf_counter()-t0)*1000,3)
+            if path:
+                st.metric("Route distance",f"{cost} {'km' if unit=='km' else 'kg CO₂'}")
+                st.metric("Nodes the AI checked",len(expl),help="The fewer the better — the AI was more efficient")
+                st.metric("Time taken",f"{ms} ms")
+                st.markdown(
+                    f"<div class='tip'><b>Route:</b> {' → '.join(path)}</div>",
+                    unsafe_allow_html=True)
+            else:
+                st.error("No route found."); path=[]; expl=[]
+    with map3:
+        # ── exploration replay slider ─────────────────────────────────────────
+        if expl:
+            replay=st.slider(
+                "🔍 Replay: drag to see how the AI explored the map",
+                0,len(expl),len(expl),
+                help="0 = no exploration shown, max = full path found")
+            explored_so_far=set(expl[:replay])
+            pct=int(replay/len(expl)*100) if expl else 100
+            st.markdown(
+                f"<div style='font-size:.82rem;color:{MUTE};margin-bottom:4px'>"
+                f"<span class='badge-blue'>{replay}/{len(expl)} nodes explored ({pct}%)</span>"
+                f"{'&nbsp;&nbsp;<span class=badge-green>Route found</span>' if replay==len(expl) and path else ''}"
+                f"</div>",unsafe_allow_html=True)
+        else:
+            explored_so_far=set()
+        fig_net=build_network(sn,en,path,explored_so_far,adj,unit,al)
+        st.plotly_chart(fig_net,use_container_width=True)
+        # colour legend
+        leg=st.columns(5)
+        leg[0].markdown(f"<div style='font-size:.78rem'>⬤ <span style='color:{RED}'>Urban node</span></div>",unsafe_allow_html=True)
+        leg[1].markdown(f"<div style='font-size:.78rem'>⬤ <span style='color:{GREEN}'>Rural node</span></div>",unsafe_allow_html=True)
+        leg[2].markdown(f"<div style='font-size:.78rem'>⬤ <span style='color:#bfdbfe'>Explored</span></div>",unsafe_allow_html=True)
+        leg[3].markdown(f"<div style='font-size:.78rem'>⬤ <span style='color:{AMBER}'>On path</span></div>",unsafe_allow_html=True)
+        leg[4].markdown(f"<div style='font-size:.78rem'>⬤ <span style='color:#fff;background:{FG};padding:1px 4px;border-radius:3px'>Start</span> / <span style='color:{FG};background:#facc15;padding:1px 4px;border-radius:3px'>End</span></div>",unsafe_allow_html=True)
+    # ── side-by-side comparison ───────────────────────────────────────────────
+    with st.expander("Compare all 4 algorithms on this route"):
+        if sn!=en:
+            rows=[]
+            for nm in ["BFS","DFS","A*","IDA*"]:
+                t0=time.perf_counter(); p,c,e=ALGOS[nm](sn,en,adj); ms2=(time.perf_counter()-t0)*1000
+                rows.append({"Algorithm":nm,
+                             f"Distance ({'km' if unit=='km' else 'CO₂'})":round(c,2) if p else "N/A",
+                             "Nodes checked":len(e),"Time (ms)":round(ms2,3),
+                             # BFS minimises hops, not weighted cost; A*/IDA* are optimal only with an admissible heuristic
+                             "Finds shortest?":nm in ["A*","IDA*"]})
+            df_c=pd.DataFrame(rows)
+            st.dataframe(df_c,use_container_width=True,hide_index=True)
+            fc=make_subplots(rows=1,cols=2,subplot_titles=["Nodes checked (fewer = smarter)","Time (ms)"])
+            pal=[BLUE,RED,GREEN,PURPLE]
+            for col,ci in [("Nodes checked",1),("Time (ms)",2)]:
+                fc.add_trace(go.Bar(x=df_c["Algorithm"],y=df_c[col],marker_color=pal,
+                                    text=df_c[col],textposition="outside",textfont_color=FG,
+                                    showlegend=False),row=1,col=ci)
+            fc.update_layout(paper_bgcolor=SURF,plot_bgcolor=BG,font_color=FG,height=280,
+                             margin=dict(l=40,r=20,t=50,b=30))
+            fc.update_xaxes(gridcolor=LINE); fc.update_yaxes(gridcolor=LINE)
+            st.plotly_chart(fc,use_container_width=True)
+# ══════════════════════════════════════════════════════════════════════════════
+#  TASK 4
+# ══════════════════════════════════════════════════════════════════════════════
+with T4:
+    st.markdown("### Head-to-head: A* vs IDA* on real delivery routes")
+    st.caption("We run both algorithms on 10 routes and measure speed and efficiency. Results appear as they complete.")
+    c1,c2=st.columns([1,3])
+    with c1:
+        nruns=st.slider("Timing runs per route",5,30,20,5)
+        go_btn=st.button("▶ Run the test",type="primary",use_container_width=True)
+        st.markdown("""
+<div class='tip'>
+<b>A*</b> keeps an open list in memory — very fast to find a path, but uses more RAM.<br><br>
+<b>IDA*</b> uses very little memory: it repeats a depth-first search, raising the cost limit on each pass. Slower here, but its memory grows only with route length.
+</div>""",unsafe_allow_html=True)
+    with c2:
+        OD_U=[("U1","U10"),("U7","U6"),("U2","U9"),("U1","U9"),("U3","U8")]
+        OD_R=[("R1","R9"),("R2","R8"),("R3","R10"),("R1","R6"),("R4","R9")]
+        if go_btn:
+            rows=[]; chart_ph=st.empty(); prog=st.progress(0); status_ph=st.empty()
+            total=(len(OD_U)+len(OD_R))*2; done=0
+            for zone,pairs in [("Urban",OD_U),("Rural",OD_R)]:
+                for s,g in pairs:
+                    for nm,fn in [("A*",astar),("IDA*",ida_star)]:
+                        times=[]
+                        p=c3=None; e=[]
+                        for _ in range(nruns):
+                            t0=time.perf_counter(); p,c3,e=fn(s,g,ADJ_KM)
+                            times.append((time.perf_counter()-t0)*1000)
+                        rows.append({"Zone":zone,"Route":f"{s}→{g}","Algorithm":nm,
+                                     "Distance (km)":c3,"Nodes checked":len(e),
+                                     "Avg time (ms)":round(sum(times)/len(times),3)})
+                        done+=1; prog.progress(done/total)
+                        status_ph.markdown(
+                            f"<span class='badge-blue'>Testing {s}→{g} with {nm}...</span>",
+                            unsafe_allow_html=True)
+                        # live chart update
+                        if len(rows)>=2:
+                            df_live=pd.DataFrame(rows)
+                            sm=df_live.groupby(["Zone","Algorithm"])[["Nodes checked","Avg time (ms)"]].mean().reset_index()
+                            fl=make_subplots(rows=1,cols=2,
+                                            subplot_titles=["Avg nodes checked","Avg time (ms)"])
+                            for anm,acl in [("A*",BLUE),("IDA*",PURPLE)]:
+                                sub=sm[sm.Algorithm==anm]
+                                if sub.empty: continue
+                                for key,ci in [("Nodes checked",1),("Avg time (ms)",2)]:
+                                    fl.add_trace(go.Bar(name=anm,x=sub["Zone"],y=sub[key].round(2),
+                                        marker_color=acl,showlegend=(ci==1),
+                                        text=sub[key].round(2),textposition="outside",
+                                        textfont_color=FG),row=1,col=ci)
+                            fl.update_layout(paper_bgcolor=SURF,plot_bgcolor=BG,font_color=FG,
+                                            barmode="group",height=320,
+                                            margin=dict(l=40,r=20,t=50,b=30),
+                                            legend=dict(bgcolor=SURF,bordercolor=LINE))
+                            fl.update_xaxes(gridcolor=LINE); fl.update_yaxes(gridcolor=LINE)
+                            chart_ph.plotly_chart(fl,use_container_width=True)
+            prog.empty(); status_ph.empty()
+            df_b=pd.DataFrame(rows)
+            st.dataframe(df_b,use_container_width=True,hide_index=True)
+            ae=df_b[df_b.Algorithm=="A*"]["Nodes checked"].mean()
+            ie=df_b[df_b.Algorithm=="IDA*"]["Nodes checked"].mean()
+            at=df_b[df_b.Algorithm=="A*"]["Avg time (ms)"].mean()
+            it=df_b[df_b.Algorithm=="IDA*"]["Avg time (ms)"].mean()
+            winner="A*" if at<it else "IDA*"
+            st.success(
+                f"**Result:** A* checked {ae:.0f} nodes on average vs IDA*'s {ie:.0f}. "
+                f"**{winner}** was faster on this map ({at:.3f} ms vs {it:.3f} ms). "
+                f"On a national road network with millions of junctions, IDA*'s much lower memory use can make it the more practical choice.")
+        else:
+            st.info("Click **▶ Run the test** — the chart will build live as results come in.")
+# ══════════════════════════════════════════════════════════════════════════════
+#  TASK 5
+# ══════════════════════════════════════════════════════════════════════════════
+with T5:
+    st.markdown("### Predicting EcoCart's daily sales with machine learning")
+    st.caption("Two models trained on 2 years of data. Adjust settings and the chart updates instantly.")
+    ctrl5,main5=st.columns([1,3])
+    with ctrl5:
+        tp=st.slider("Training data",60,90,80,5,format="%d%%")
+        ne=st.slider("Random Forest trees",50,300,200,50)
+        show5=st.radio("Show",["Both","Linear Regression","Random Forest"])
+        st.divider()
+        st.markdown("<div class='section-label'>Try your own prediction</div>",unsafe_allow_html=True)
+        st.markdown("<div class='tip'>Set values for any day and see what the model predicts.</div>",
+                    unsafe_allow_html=True)
+        wi_dow=st.selectbox("Day of week",["Mon","Tue","Wed","Thu","Fri","Sat","Sun"],index=4)
+        wi_month=st.selectbox("Month",["Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec"],index=0)
+        wi_promo=st.toggle("Promotion running today?",False)
+        wi_lag1=st.number_input("Yesterday's sales",min_value=50,max_value=300,value=120,step=5)
+        wi_lag7=st.number_input("Sales 7 days ago",  min_value=50,max_value=300,value=115,step=5)
+    with main5:
+        with st.spinner("Training models…"):
+            lr_o,rf_o,te_df,lp,rp,imps=_train(tp,ne)
+        y=te_df["sales"].values; dates=te_df["date"].values
+        lmae,lrmse,lr2,lmape=_met(y,lp)
+        rmae,rrmse,rr2,rmape=_met(y,rp)
+        mc=st.columns(4)
+        mc[0].metric("Linear Reg accuracy (R²)",lr2)
+        mc[1].metric("Linear Reg avg error",f"±{lmae} units")
+        mc[2].metric("Random Forest accuracy (R²)",rr2,delta=f"{rr2-lr2:+.3f}")
+        mc[3].metric("Random Forest avg error",f"±{rmae} units",delta=f"{rmae-lmae:+.1f}")
+        # ── what-if prediction ────────────────────────────────────────────────
+        dow_map={"Mon":0,"Tue":1,"Wed":2,"Thu":3,"Fri":4,"Sat":5,"Sun":6}
+        mon_map={"Jan":1,"Feb":2,"Mar":3,"Apr":4,"May":5,"Jun":6,
+                 "Jul":7,"Aug":8,"Sep":9,"Oct":10,"Nov":11,"Dec":12}
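+        # Approximate day of year as mid-month (about 30.4 days per month)
+        wi_doy=int((mon_map[wi_month]-1)*30.4+15)
+        # wi_r7 / wi_r30 (presumably the 7- and 30-day rolling means) are not
+        # entered by the user, so both are approximated from the two lag values
+        wi_r7=round((wi_lag1+wi_lag7)/2)
+        wi_r30=round((wi_lag1+wi_lag7)/2)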
+        wi_row=[[dow_map[wi_dow],mon_map[wi_month],wi_doy,int(wi_promo),
+                 wi_lag1,wi_lag7,wi_lag7,wi_r7,wi_r30]]
+        wi_pred_rf=round(rf_o.predict(wi_row)[0],0)
+        wi_pred_lr=round(lr_o.predict(wi_row)[0],0)
+        wc=st.columns(3)
+        wc[0].markdown(
+            f"<div style='background:#fff;border-radius:10px;padding:14px 18px;"
+            f"box-shadow:0 1px 4px rgba(0,0,0,.07);text-align:center'>"
+            f"<div style='font-size:.78rem;color:{MUTE}'>Your scenario prediction</div>"
+            f"<div style='font-size:1.6rem;font-weight:700;color:{GREEN}'>{int(wi_pred_rf)}</div>"
+            f"<div style='font-size:.78rem;color:{MUTE}'>units (Random Forest)</div></div>",
+            unsafe_allow_html=True)
+        wc[1].markdown(
+            f"<div style='background:#fff;border-radius:10px;padding:14px 18px;"
+            f"box-shadow:0 1px 4px rgba(0,0,0,.07);text-align:center'>"
+            f"<div style='font-size:.78rem;color:{MUTE}'>Linear Regression says</div>"
+            f"<div style='font-size:1.6rem;font-weight:700;color:{BLUE}'>{int(wi_pred_lr)}</div>"
+            f"<div style='font-size:.78rem;color:{MUTE}'>units</div></div>",
+            unsafe_allow_html=True)
+        wc[2].markdown(
+            f"<div style='background:#fff;border-radius:10px;padding:14px 18px;"
+            f"box-shadow:0 1px 4px rgba(0,0,0,.07);text-align:center'>"
+            f"<div style='font-size:.78rem;color:{MUTE}'>Promotion boost</div>"
+            f"<div style='font-size:1.6rem;font-weight:700;color:{AMBER}'>{'Yes +~40' if wi_promo else 'None'}</div>"
+            f"<div style='font-size:.78rem;color:{MUTE}'>estimated extra units</div></div>",
+            unsafe_allow_html=True)
+        st.markdown(" ")
+        # ── forecast chart with range selector ───────────────────────────────
+        fig5=go.Figure()
+        fig5.add_trace(go.Scatter(x=dates,y=y,name="Actual sales",
+                                  line=dict(color=FG,width=1.5),opacity=.85,
+                                  hovertemplate="<b>Actual</b><br>%{x|%d %b %Y}<br>%{y:.0f} units<extra></extra>"))
+        if show5 in ("Both","Linear Regression"):
+            fig5.add_trace(go.Scatter(x=dates,y=lp,name="Linear Regression",
+                                      line=dict(color=BLUE,width=1.5,dash="dot"),
+                                      hovertemplate="<b>LR Prediction</b><br>%{x|%d %b %Y}<br>%{y:.0f} units<extra></extra>"))
+        if show5 in ("Both","Random Forest"):
+            fig5.add_trace(go.Scatter(x=dates,y=rp,name="Random Forest",
+                                      line=dict(color=GREEN,width=1.5),
+                                      hovertemplate="<b>RF Prediction</b><br>%{x|%d %b %Y}<br>%{y:.0f} units<extra></extra>"))
+        fig5.update_layout(**_ch(360,f"Actual vs predicted — test set ({100-tp}% of data)"))
+        fig5.update_xaxes(**_xax(title="Date",
+                          rangeselector=dict(
+                              bgcolor=SURF,
+                              buttons=[dict(count=30,label="30d",step="day",stepmode="backward"),
+                                       dict(count=60,label="60d",step="day",stepmode="backward"),
+                                       dict(count=90,label="90d",step="day",stepmode="backward"),
+                                       dict(step="all",label="All")])))
+        fig5.update_yaxes(**_yax(title="Units sold"))
+        st.plotly_chart(fig5,use_container_width=True)
+        r_col,i_col=st.columns(2)
+        with r_col:
+            fig_r=go.Figure()
+            if show5 in ("Both","Linear Regression"):
+                fig_r.add_trace(go.Scatter(x=lp,y=y-lp,mode="markers",name="Linear Reg",
+                    marker=dict(color=BLUE,size=5,opacity=.5),
+                    hovertemplate="Predicted %{x:.0f}<br>Error %{y:.0f} units<extra></extra>"))
+            if show5 in ("Both","Random Forest"):
+                fig_r.add_trace(go.Scatter(x=rp,y=y-rp,mode="markers",name="Random Forest",
+                    marker=dict(color=GREEN,size=5,opacity=.5),
+                    hovertemplate="Predicted %{x:.0f}<br>Error %{y:.0f} units<extra></extra>"))
+            fig_r.add_hline(y=0,line_color="#94a3b8",line_width=1.5,line_dash="dash")
+            fig_r.update_layout(**_ch(280,"Prediction errors  (closer to 0 = better)"))
+            fig_r.update_xaxes(**_xax(title="Predicted units"))
+            fig_r.update_yaxes(**_yax(title="Error (actual − predicted)"))
+            st.plotly_chart(fig_r,use_container_width=True)
+        with i_col:
+            imp=pd.Series(imps,index=FEATS).sort_values()
+            fi=go.Figure(go.Bar(
+                x=imp.values,
+                y=[FEAT_LABELS.get(i,i) for i in imp.index],
+                orientation="h",
+                marker=dict(color=imp.values,colorscale=[[0,"#d1fae5"],[1,GREEN]],showscale=False),
+                text=[f"{v:.3f}" for v in imp.values],
+                textposition="outside",textfont_color=FG,
+                hovertemplate="%{y}<br>Importance: %{x:.3f}<extra></extra>"))
+            fi.update_layout(**_ch(280,"What does the model rely on most?"))
+            fi.update_xaxes(**_xax(title="Importance score"))
+            fi.update_yaxes(**_yax())
+            st.plotly_chart(fi,use_container_width=True)
+        winner="Random Forest" if rr2>=lr2 else "Linear Regression"
+        top_feat=imp.idxmax()  # highest Random Forest importance, as plotted above
+        why=(" The same weekday last week is a strong baseline for today's sales."
+             if top_feat=="lag_7" else "")
+        st.success(
+            f"**{winner}** is more accurate (R² = {max(lr2,rr2):.3f}). "
+            f"The Random Forest's top predictor is **{FEAT_LABELS.get(top_feat,top_feat)}**.{why}")
+    with st.expander("See raw prediction data"):
+        st.dataframe(pd.DataFrame({"Date":dates,"Actual":y.round(1),
+                                   "LR Prediction":lp.round(1),"RF Prediction":rp.round(1),
+                                   "LR Error":(y-lp).round(1),"RF Error":(y-rp).round(1)}),
+                     use_container_width=True)
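
The Task 3 comparison above runs BFS alongside A* and IDA* on a weighted network. BFS explores level by level, so it returns a route with the fewest hops, which is not necessarily the cheapest one. The following is a minimal, self-contained sketch of that difference. The four-node `GRAPH` and the helper names are hypothetical and are not taken from `app.py`; `dijkstra` here is a plain uniform-cost search standing in for a cost-aware algorithm.

```python
import heapq
from collections import deque

# Hypothetical graph: the direct A-D edge is a single hop but expensive.
GRAPH = {
    "A": [("B", 1.0), ("D", 10.0)],
    "B": [("A", 1.0), ("C", 1.0)],
    "C": [("B", 1.0), ("D", 1.0)],
    "D": [("A", 10.0), ("C", 1.0)],
}

def path_cost(path, graph):
    """Sum the edge weights along a path."""
    return sum(dict(graph[a])[b] for a, b in zip(path, path[1:]))

def bfs_fewest_hops(start, goal, graph):
    """Plain BFS: ignores weights and returns a path with the fewest edges."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nb, _ in graph[path[-1]]:
            if nb not in seen:
                seen.add(nb)
                queue.append(path + [nb])
    return None

def dijkstra(start, goal, graph):
    """Uniform-cost search: returns a minimum-weight path."""
    heap = [(0.0, [start])]
    best = {start: 0.0}
    while heap:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node == goal:
            return path
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for nb, w in graph[node]:
            new_cost = cost + w
            if new_cost < best.get(nb, float("inf")):
                best[nb] = new_cost
                heapq.heappush(heap, (new_cost, path + [nb]))
    return None

hops = bfs_fewest_hops("A", "D", GRAPH)
cheap = dijkstra("A", "D", GRAPH)
print(hops, path_cost(hops, GRAPH))    # one hop, total weight 10.0
print(cheap, path_cost(cheap, GRAPH))  # three hops, total weight 3.0
```

On this toy graph BFS picks the direct edge while the cost-aware search takes the longer but cheaper detour, which is why a "finds shortest" label for BFS only holds when all edges have equal weight.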

requirements.txt ADDED

	@@ -0,0 +1,5 @@

+streamlit>=1.57.0
+numpy>=2.0.0
+pandas>=2.0.0
+plotly>=5.0.0
+scikit-learn>=1.3.0

task2_segmentation.py ADDED

	@@ -0,0 +1,240 @@

+"""
+EcoCart Customer Segmentation — Bias Detection & Mitigation
+Task 2 — Demonstrates urban-rural bias in K-Means segmentation and
+          mitigates it with oversampling, feature adjustment, and
+          post-processing.
+NCI MSCAI | Fundamentals of AI TABA 2026
+Run:  python3 task2_segmentation.py
+Out:  output/bias_before_after.png, output/disparate_impact.png
+"""
+import numpy as np
+import pandas as pd
+import matplotlib.pyplot as plt
+from sklearn.cluster import KMeans
+from sklearn.preprocessing import StandardScaler
+RNG = np.random.default_rng(42)
+# ── 1. Generate biased customer data ────────────────────────
+# Urban customers have more data, higher frequency, higher spend — mimicking
+# a real scenario where the platform launched in cities first.
+def generate_biased_data(n_urban=300, n_rural=100):
+    # Urban: higher frequency and spend on average
+    urban = pd.DataFrame({
+        "freq":     RNG.normal(6.0, 2.0, n_urban).clip(0.5),
+        "spend":    RNG.normal(120, 40, n_urban).clip(10),
+        "recency":  RNG.exponential(10, n_urban).clip(1, 90),
+        "region":   "urban",
+    })
+    # Rural: lower frequency and spend (platform is newer there)
+    rural = pd.DataFrame({
+        "freq":     RNG.normal(3.0, 1.5, n_rural).clip(0.5),
+        "spend":    RNG.normal(65, 30, n_rural).clip(10),
+        "recency":  RNG.exponential(15, n_rural).clip(1, 90),
+        "region":   "rural",
+    })
+    df = pd.concat([urban, rural], ignore_index=True)
+    df["freq"] = df["freq"].round(1)
+    df["spend"] = df["spend"].round(0)
+    df["recency"] = df["recency"].round(0)
+    return df
+# ── 2. Segment with K-Means ────────────────────────────────
+def segment(df, features=("freq", "spend", "recency")):
+    scaler = StandardScaler()
+    X = scaler.fit_transform(df[list(features)])
+    km = KMeans(n_clusters=3, random_state=42, n_init=10)
+    df = df.copy()
+    df["cluster"] = km.fit_predict(X)
+    # Label clusters by mean spend (High/Medium/Low)
+    means = df.groupby("cluster")["spend"].mean().sort_values(ascending=False)
+    label_map = {means.index[0]: "High Value",
+                 means.index[1]: "Medium",
+                 means.index[2]: "Low Value"}
+    df["segment"] = df["cluster"].map(label_map)
+    return df
+# ── 3. Bias metrics ────────────────────────────────────────
+def compute_fairness(df):
+    urban = df[df.region == "urban"]
+    rural = df[df.region == "rural"]
+    u_high = (urban.segment == "High Value").mean()
+    r_high = (rural.segment == "High Value").mean()
+    di = r_high / u_high if u_high > 0 else 0
+    return {
+        "urban_high_pct": round(u_high * 100, 1),
+        "rural_high_pct": round(r_high * 100, 1),
+        "disparate_impact": round(di, 3),
+        "fair": di >= 0.8,
+    }
+# ── 4. Mitigation: balanced re-sample + adjusted features ──
+def mitigate(df):
+    """
+    Fix 1: Balance the dataset by oversampling rural customers.
+    Fix 2: Add an 'adj_spend' feature that adds a flat €12 delivery-cost
+           premium to rural spend (rural customers pay more for delivery,
+           so their raw spend understates their purchase intent).
+    Fix 3: Add an 'adj_freq' feature that scales rural frequency by 1.5
+           to account for order batching.
+    Fix 4: Post-processing: promote the highest-spending rural customers
+           outside "High Value" until the rural rate reaches about 85%
+           of the urban rate.
+    """
+    df = df.copy()
+    # Oversample rural to match urban count
+    rural = df[df.region == "rural"]
+    urban = df[df.region == "urban"]
+    rural_up = rural.sample(n=len(urban), replace=True, random_state=42)
+    balanced = pd.concat([urban, rural_up], ignore_index=True)
+    # Adjust spend: rural delivery costs ~€12 more on average
+    is_rural = balanced["region"] == "rural"
+    balanced["adj_spend"] = balanced["spend"] + np.where(is_rural, 12, 0)
+    # Adjust frequency: rural customers batch orders
+    balanced["adj_freq"] = balanced["freq"] * np.where(is_rural, 1.5, 1.0)
+    # Re-segment on adjusted features
+    scaler = StandardScaler()
+    X = scaler.fit_transform(balanced[["adj_freq", "adj_spend", "recency"]])
+    km = KMeans(n_clusters=3, random_state=42, n_init=10)
+    balanced["cluster"] = km.fit_predict(X)
+    means = balanced.groupby("cluster")["adj_spend"].mean().sort_values(ascending=False)
+    label_map = {means.index[0]: "High Value",
+                 means.index[1]: "Medium",
+                 means.index[2]: "Low Value"}
+    balanced["segment"] = balanced["cluster"].map(label_map)
+    # Post-processing: promote top rural "Medium" and "Low Value" customers
+    # to "High Value" until disparate impact reaches 0.85 (above 0.8 threshold)
+    rural_mask = balanced.region == "rural"
+    urban_mask = balanced.region == "urban"
+    urban_high_rate = (balanced[urban_mask].segment == "High Value").mean()
+    target_rate = urban_high_rate * 0.85
+    n_rural = rural_mask.sum()
+    target_rural_high = int(target_rate * n_rural)
+    current_rural_high = ((balanced[rural_mask].segment == "High Value")).sum()
+    need = target_rural_high - current_rural_high
+    if need > 0:
+        # Promote the rural customers with the highest adjusted spend,
+        # regardless of whether they are currently Medium or Low Value
+        candidates = balanced[rural_mask & (balanced.segment != "High Value")]
+        if len(candidates) > 0:
+            promote = candidates.nlargest(min(need, len(candidates)), "adj_spend").index
+            balanced.loc[promote, "segment"] = "High Value"
+    return balanced
+# ── 5. Plots ────────────────────────────────────────────────
+SEG_COLORS = {"High Value": "#10b981", "Medium": "#f59e0b", "Low Value": "#ef4444"}
+def plot_before_after(before_df, after_df, before_fair, after_fair):
+    fig, axes = plt.subplots(1, 2, figsize=(14, 5.5))
+    fig.patch.set_facecolor("#0d1117")
+    for ax, df, fair, title in [
+        (axes[0], before_df, before_fair, "BEFORE mitigation (biased)"),
+        (axes[1], after_df,  after_fair,  "AFTER mitigation (rebalanced + adjusted)"),
+    ]:
+        ax.set_facecolor("#0d1117")
+        for seg in ["High Value", "Medium", "Low Value"]:
+            mask = df.segment == seg
+            for region, marker in [("urban", "o"), ("rural", "^")]:
+                rmask = mask & (df.region == region)
+                ax.scatter(df.loc[rmask, "freq"], df.loc[rmask, "spend"],
+                           c=SEG_COLORS[seg], marker=marker, s=25, alpha=0.6,
+                           label=f"{seg} ({region})" if ax == axes[0] else None)
+        di = fair["disparate_impact"]
+        color = "#ef4444" if not fair["fair"] else "#10b981"
+        ax.set_title(f"{title}\nDI = {di:.3f} {'⚠ BIASED' if not fair['fair'] else '✓ FAIR'}",
+                     color="white", fontsize=11)
+        ax.set_xlabel("Purchase frequency / month", color="white")
+        ax.set_ylabel("Avg spend (€)", color="white")
+        ax.tick_params(colors="white")
+        ax.grid(True, alpha=0.1, color="white")
+    axes[0].legend(fontsize=7, facecolor="#0d1117", edgecolor="#334155",
+                   labelcolor="white", loc="upper right", ncol=2)
+    plt.tight_layout()
+    plt.savefig("output/bias_before_after.png", dpi=150,
+                bbox_inches="tight", facecolor="#0d1117")
+    plt.close()
+def plot_di(before_fair, after_fair):
+    fig, ax = plt.subplots(figsize=(8, 4))
+    fig.patch.set_facecolor("#0d1117")
+    ax.set_facecolor("#0d1117")
+    cats = ["Urban → High", "Rural → High", "Disparate Impact"]
+    before_vals = [before_fair["urban_high_pct"], before_fair["rural_high_pct"],
+                   before_fair["disparate_impact"] * 100]
+    after_vals  = [after_fair["urban_high_pct"],  after_fair["rural_high_pct"],
+                   after_fair["disparate_impact"] * 100]
+    x = range(len(cats))
+    w = 0.35
+    ax.bar([i - w/2 for i in x], before_vals, w, label="Before", color="#ef4444", alpha=0.85)
+    ax.bar([i + w/2 for i in x], after_vals,  w, label="After",  color="#10b981", alpha=0.85)
+    ax.axhline(80, color="#fbbf24", linewidth=1.5, linestyle="--", label="DI threshold (80%)")
+    ax.set_xticks(x)
+    ax.set_xticklabels(cats, color="white")
+    ax.set_ylabel("Percentage", color="white")
+    ax.set_title("Fairness metrics before vs after mitigation", color="white", fontsize=12)
+    ax.tick_params(colors="white")
+    ax.legend(fontsize=9, facecolor="#0d1117", edgecolor="#334155", labelcolor="white")
+    ax.grid(True, axis="y", alpha=0.15, color="white")
+    plt.tight_layout()
+    plt.savefig("output/disparate_impact.png", dpi=150,
+                bbox_inches="tight", facecolor="#0d1117")
+    plt.close()
+# ── 6. Main ─────────────────────────────────────────────────
+def main():
+    print("="*70)
+    print("EcoCart Customer Segmentation — Bias Detection & Mitigation")
+    print("="*70)
+    # Generate and segment (biased)
+    df = generate_biased_data()
+    df = segment(df)
+    before = compute_fairness(df)
+    print(f"\nBEFORE mitigation:")
+    print(f"  Urban -> High Value: {before['urban_high_pct']}%")
+    print(f"  Rural -> High Value: {before['rural_high_pct']}%")
+    print(f"  Disparate Impact:   {before['disparate_impact']}")
+    print(f"  Fair (DI >= 0.8)?   {before['fair']}")
+    print(f"\n  Segment counts:")
+    ct = df.groupby(["region", "segment"]).size().unstack(fill_value=0)
+    print(ct.to_string(index=True))
+    # Mitigate
+    fixed = mitigate(df)
+    after = compute_fairness(fixed)
+    print(f"\nAFTER mitigation:")
+    print(f"  Urban -> High Value: {after['urban_high_pct']}%")
+    print(f"  Rural -> High Value: {after['rural_high_pct']}%")
+    print(f"  Disparate Impact:   {after['disparate_impact']}")
+    print(f"  Fair (DI >= 0.8)?   {after['fair']}")
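+    # Plots (the plotting functions save into output/, so make sure it exists)
+    from pathlib import Path
+    Path("output").mkdir(exist_ok=True)
+    plot_before_after(df, fixed, before, after)
+    plot_di(before, after)
+    print("\nWrote: output/bias_before_after.png, output/disparate_impact.png")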
+if __name__ == "__main__":
+    main()
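
The fairness check in `compute_fairness` and the post-processing step in `mitigate` both come down to simple arithmetic on segment rates. The sketch below works through that arithmetic on made-up numbers. The function names `disparate_impact` and `promotions_needed` and all input values are illustrative assumptions, not output from the script; the 0.8 cut-off and the 0.85 target mirror the constants used above.

```python
def disparate_impact(protected_rate, reference_rate):
    """Ratio of favourable-outcome rates (protected group / reference group)."""
    if reference_rate <= 0:
        return 0.0
    return protected_rate / reference_rate

def promotions_needed(n_protected, current_high, reference_rate, target_di=0.85):
    """Protected-group members to promote so the rate ratio approaches target_di.

    The target count is rounded down, as in the script, so the resulting
    ratio can land slightly below target_di.
    """
    target_high = int(reference_rate * target_di * n_protected)
    return max(0, target_high - current_high)

# Illustrative numbers only: 50% of urban and 10% of rural customers are "High Value".
urban_rate, rural_rate = 0.5, 0.1
di = disparate_impact(rural_rate, urban_rate)
print(round(di, 3), di >= 0.8)  # ratio is well below the 0.8 cut-off

# With 300 rural customers, 30 of them already "High Value":
need = promotions_needed(300, 30, urban_rate)
print(need)
after_di = disparate_impact((30 + need) / 300, urban_rate)
print(round(after_di, 3), after_di >= 0.8)
```

Because the promotion count is derived directly from the target ratio, this kind of post-processing passes the 0.8 check by construction, so the "after" figure says more about the chosen target than about the clustering itself.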

task3_4_routing.py ADDED

	@@ -0,0 +1,333 @@

+"""
+EcoCart Route Optimisation Prototype
+Tasks 3 & 4 — BFS, DFS, A*, IDA* on a weighted delivery network
+              + Green Routing mode (CO2-weighted edges for sustainability)
+NCI MSCAI | Fundamentals of AI TABA 2026
+Run:  python3 task3_4_routing.py
+Out:  network_map.png, algo_comparison.png, green_vs_fast.png
+"""
+import heapq, math, time, tracemalloc, statistics
+from collections import deque
+import matplotlib.pyplot as plt
+import matplotlib.patches as mpatches
+import networkx as nx
+# ── 1. Network ──────────────────────────────────────────────
+NODES = {
+    # Urban cluster (dense, short edges)
+    "U1":(1.0,1.0,"urban"),"U2":(2.0,1.5,"urban"),"U3":(3.0,1.0,"urban"),
+    "U4":(1.5,2.5,"urban"),"U5":(2.5,3.0,"urban"),"U6":(3.5,2.0,"urban"),
+    "U7":(1.0,3.5,"urban"),"U8":(2.0,4.0,"urban"),"U9":(3.0,4.0,"urban"),
+    "U10":(4.0,3.5,"urban"),
+    # Rural cluster (sparse, long edges)
+    "R1":(6.0,1.0,"rural"),"R2":(8.0,2.0,"rural"),"R3":(10.0,1.5,"rural"),
+    "R4":(7.0,4.0,"rural"),"R5":(9.0,4.5,"rural"),"R6":(11.0,3.5,"rural"),
+    "R7":(6.5,6.0,"rural"),"R8":(9.0,7.0,"rural"),"R9":(11.0,6.0,"rural"),
+    "R10":(8.0,5.5,"rural"),
+}
+def _dist(a, b):
+    return math.hypot(NODES[a][0]-NODES[b][0], NODES[a][1]-NODES[b][1])
+_PAIRS = [
+    ("U1","U2"),("U2","U3"),("U1","U4"),("U2","U4"),("U2","U5"),
+    ("U3","U6"),("U4","U5"),("U5","U6"),("U4","U7"),("U5","U8"),
+    ("U6","U10"),("U7","U8"),("U8","U9"),("U9","U10"),("U5","U9"),
+    ("R1","R2"),("R2","R3"),("R1","R4"),("R2","R4"),("R3","R6"),
+    ("R4","R5"),("R5","R6"),("R4","R7"),("R5","R10"),("R7","R10"),
+    ("R7","R8"),("R8","R9"),("R6","R9"),("R8","R10"),("R5","R8"),
+    ("U3","R1"),("U10","R4"),("U6","R1"),("U9","R7"),
+]
+# Road distance ≈ 1.15× straight-line
+EDGES = [(a, b, round(_dist(a,b)*1.15, 2)) for a, b in _PAIRS]
+# CO2 cost per edge: urban roads have traffic → higher emissions per km
+# Urban-urban: 0.28 kg CO2/km;  urban-rural link: 0.18;  rural-rural: 0.10
+def _co2(a, b, km):
+    za, zb = NODES[a][2], NODES[b][2]
+    rate = 0.28 if za == "urban" and zb == "urban" else 0.18 if za != zb else 0.10
+    return round(km * rate, 3)
+CO2_EDGES = [(a, b, _co2(a, b, w)) for a, b, w in EDGES]
+ADJ_KM = {n: [] for n in NODES}
+ADJ_CO2 = {n: [] for n in NODES}
+for i, (a, b, w) in enumerate(EDGES):
+    ADJ_KM[a].append((b, w))
+    ADJ_KM[b].append((a, w))
+    co2 = CO2_EDGES[i][2]
+    ADJ_CO2[a].append((b, co2))
+    ADJ_CO2[b].append((a, co2))
+# ── 2. Algorithms ───────────────────────────────────────────
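+def heuristic(n, goal, scale=1.0):
+    # Straight-line distance. Admissible for ADJ_KM, where every edge is about
+    # 1.15x its straight-line length. CO2 edges cost as little as 0.10 kg/km, so
+    # with ADJ_CO2 it is only guaranteed admissible when scale <= ~0.115 (0.10 * 1.15).
+    return _dist(n, goal) * scale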
+def bfs(start, goal, adj=ADJ_KM):
+    expanded = 0
+    q = deque([(start, [start])])
+    seen = {start}
+    while q:
+        node, path = q.popleft()
+        expanded += 1
+        if node == goal:
+            cost = sum(_edge_w(path[i], path[i+1], adj) for i in range(len(path)-1))
+            return path, round(cost, 2), expanded
+        for nb, _ in adj[node]:
+            if nb not in seen:
+                seen.add(nb)
+                q.append((nb, path + [nb]))
+    return None, math.inf, expanded
+def dfs(start, goal, adj=ADJ_KM, depth_limit=50):
+    expanded = 0
+    stack = [(start, [start])]
+    seen = {start}
+    while stack:
+        node, path = stack.pop()
+        expanded += 1
+        if node == goal:
+            cost = sum(_edge_w(path[i], path[i+1], adj) for i in range(len(path)-1))
+            return path, round(cost, 2), expanded
+        if len(path) > depth_limit:
+            continue
+        for nb, _ in adj[node]:
+            if nb not in seen:
+                seen.add(nb)
+                stack.append((nb, path + [nb]))
+    return None, math.inf, expanded
+def astar(start, goal, adj=ADJ_KM, h_scale=1.0):
+    expanded, counter = 0, 0
+    heap = [(heuristic(start, goal, h_scale), 0.0, counter, start, [start])]
+    best = {start: 0.0}
+    while heap:
+        f, g, _, node, path = heapq.heappop(heap)
+        if node == goal:
+            return path, round(g, 2), expanded
+        if g > best.get(node, math.inf):
+            continue
+        expanded += 1
+        for nb, w in adj[node]:
+            ng = g + w
+            if ng < best.get(nb, math.inf):
+                best[nb] = ng
+                counter += 1
+                heapq.heappush(heap, (ng + heuristic(nb, goal, h_scale), ng, counter, nb, path + [nb]))
+    return None, math.inf, expanded
+def ida_star(start, goal, adj=ADJ_KM, h_scale=1.0):
+    expanded = [0]
+    def _dfs(node, g, bound, path, visited):
+        f = g + heuristic(node, goal, h_scale)
+        if f > bound:
+            return None, f
+        expanded[0] += 1
+        if node == goal:
+            return list(path), g
+        nxt = math.inf
+        for nb, w in adj[node]:
+            if nb in visited:
+                continue
+            visited.add(nb)
+            path.append(nb)
+            r, t = _dfs(nb, g + w, bound, path, visited)
+            if r is not None:
+                return r, t
+            if t < nxt:
+                nxt = t
+            path.pop()
+            visited.remove(nb)
+        return None, nxt
+    bound = heuristic(start, goal, h_scale)
+    while True:
+        r, t = _dfs(start, 0.0, bound, [start], {start})
+        if r is not None:
+            return r, round(t, 2), expanded[0]
+        if t == math.inf:
+            return None, math.inf, expanded[0]
+        bound = t
+def _edge_w(a, b, adj):
+    for nb, w in adj[a]:
+        if nb == b:
+            return w
+    return math.inf
+# ── 3. Benchmark ────────────────────────────────────────────
+def benchmark(algo, start, goal, adj=ADJ_KM, repeats=20):
+    times, mems = [], []
+    path = cost = expanded = None
+    for _ in range(repeats):
+        t0 = time.perf_counter()  # timed with tracing off: tracemalloc slows every allocation
+        path, cost, expanded = algo(start, goal, adj)
+        times.append((time.perf_counter() - t0) * 1000)
+        tracemalloc.start()       # separate, untimed run to record peak memory
+        algo(start, goal, adj)
+        _, peak = tracemalloc.get_traced_memory()
+        tracemalloc.stop()
+        mems.append(peak / 1024)
+    return {
+        "ms": round(statistics.mean(times), 3),
+        "kb": round(statistics.mean(mems), 2),
+        "expanded": expanded,
+        "cost": cost,
+        "path": path,
+    }
+OD_URBAN = [("U1","U10"),("U7","U6"),("U2","U9"),("U1","U9"),("U3","U8")]
+OD_RURAL = [("R1","R9"),("R2","R8"),("R3","R10"),("R1","R6"),("R4","R9")]
+# ── 4. Plots ────────────────────────────────────────────────
+def plot_network():
+    G = nx.Graph()
+    for n, (x, y, _) in NODES.items():
+        G.add_node(n, pos=(x, y))
+    for a, b, w in EDGES:
+        G.add_edge(a, b, weight=w)
+    pos = {n: (NODES[n][0], NODES[n][1]) for n in NODES}
+    colors = ["#ef4444" if NODES[n][2] == "urban" else "#10b981" for n in NODES]
+    fig, ax = plt.subplots(figsize=(13, 6))
+    ax.set_facecolor("#0d1117")
+    fig.patch.set_facecolor("#0d1117")
+    nx.draw(G, pos, ax=ax, with_labels=True, node_color=colors, node_size=500,
+            font_size=8, font_weight="bold", font_color="white",
+            edge_color="#334155", width=1.2)
+    labels = {(a, b): f"{w}" for a, b, w in EDGES}
+    nx.draw_networkx_edge_labels(G, pos, ax=ax, edge_labels=labels,
+                                 font_size=6, font_color="#94a3b8")
+    urban_patch = mpatches.Patch(color="#ef4444", label="Urban node")
+    rural_patch = mpatches.Patch(color="#10b981", label="Rural node")
+    ax.legend(handles=[urban_patch, rural_patch], loc="upper left",
+              fontsize=9, facecolor="#0d1117", edgecolor="#334155", labelcolor="white")
+    ax.set_title("EcoCart 20-node delivery network (edge labels = km)",
+                 color="white", fontsize=12, pad=12)
+    plt.tight_layout()
+    plt.savefig("output/network_map.png", dpi=150, bbox_inches="tight",
+                facecolor="#0d1117")
+    plt.close()
+def plot_comparison(results):
+    metrics = [("Runtime (ms)", "ms"), ("Nodes expanded", "expanded"), ("Peak memory (KB)", "kb")]
+    fig, axes = plt.subplots(1, 3, figsize=(15, 4.5))
+    fig.patch.set_facecolor("#0d1117")
+    for ax, (title, key) in zip(axes, metrics):
+        ax.set_facecolor("#0d1117")
+        u_a = statistics.mean(r["astar"][key] for r in results["urban"])
+        u_i = statistics.mean(r["ida"][key]   for r in results["urban"])
+        r_a = statistics.mean(r["astar"][key] for r in results["rural"])
+        r_i = statistics.mean(r["ida"][key]   for r in results["rural"])
+        x = [0, 1]
+        w = 0.32
+        ax.bar([xi - w/2 for xi in x], [u_a, r_a], w, label="A*",   color="#3b82f6")
+        ax.bar([xi + w/2 for xi in x], [u_i, r_i], w, label="IDA*", color="#8b5cf6")
+        ax.set_xticks(x)
+        ax.set_xticklabels(["Urban", "Rural"], color="white")
+        ax.set_title(title, color="white", fontsize=11)
+        ax.tick_params(colors="white")
+        ax.grid(True, axis="y", alpha=0.15, color="white")
+        ax.legend(fontsize=9, facecolor="#0d1117", edgecolor="#334155", labelcolor="white")
+    plt.suptitle("A* vs IDA* (mean over 5 O-D pairs × 20 runs)",
+                 color="white", fontsize=12)
+    plt.tight_layout()
+    plt.savefig("output/algo_comparison.png", dpi=150,
+                bbox_inches="tight", facecolor="#0d1117")
+    plt.close()
+def plot_green_vs_fast():
+    """Compare fastest route (A* on km) vs greenest route (A* on CO2)."""
+    pairs = [("U1", "R9"), ("U7", "R6"), ("R1", "U10")]
+    fig, axes = plt.subplots(1, 3, figsize=(15, 5))
+    fig.patch.set_facecolor("#0d1117")
+    G = nx.Graph()
+    for n, (x, y, _) in NODES.items():
+        G.add_node(n, pos=(x, y))
+    for a, b, _ in EDGES:  # edge weights are not needed for drawing
+        G.add_edge(a, b)
+    pos = {n: (NODES[n][0], NODES[n][1]) for n in NODES}
+    for ax, (s, g) in zip(axes, pairs):
+        ax.set_facecolor("#0d1117")
+        fast_path, fast_km, _ = astar(s, g, ADJ_KM)
+        green_path, green_co2, _ = astar(s, g, ADJ_CO2, h_scale=0.10)
+        # Compute cross-metrics
+        fast_co2 = sum(_edge_w(fast_path[i], fast_path[i+1], ADJ_CO2) for i in range(len(fast_path)-1))
+        green_km = sum(_edge_w(green_path[i], green_path[i+1], ADJ_KM) for i in range(len(green_path)-1))
+        colors = ["#ef4444" if NODES[n][2] == "urban" else "#10b981" for n in NODES]
+        nx.draw(G, pos, ax=ax, with_labels=True, node_color=colors,
+                node_size=300, font_size=7, font_weight="bold",
+                font_color="white", edge_color="#1e293b", width=0.8)
+        fast_edges = [(fast_path[i], fast_path[i+1]) for i in range(len(fast_path)-1)]
+        green_edges = [(green_path[i], green_path[i+1]) for i in range(len(green_path)-1)]
+        nx.draw_networkx_edges(G, pos, ax=ax, edgelist=fast_edges,
+                               edge_color="#f59e0b", width=3, alpha=0.8)
+        nx.draw_networkx_edges(G, pos, ax=ax, edgelist=green_edges,
+                               edge_color="#22c55e", width=3, style="dashed", alpha=0.8)
+        ax.set_title(f"{s} → {g}\nFast: {fast_km:.1f}km / {fast_co2:.2f}kg CO₂\n"
+                     f"Green: {green_km:.1f}km / {green_co2:.2f}kg CO₂",
+                     color="white", fontsize=9, linespacing=1.4)
+    fast_patch = mpatches.Patch(color="#f59e0b", label="Fastest (min km)")
+    green_patch = mpatches.Patch(color="#22c55e", label="Greenest (min CO₂)")
+    fig.legend(handles=[fast_patch, green_patch], loc="lower center",
+               ncol=2, fontsize=10, facecolor="#0d1117", edgecolor="#334155",
+               labelcolor="white")
+    plt.suptitle("Fast Route vs Green Route — same A*, different cost function",
+                 color="white", fontsize=12)
+    plt.tight_layout(rect=[0, 0.06, 1, 0.95])
+    plt.savefig("output/green_vs_fast.png", dpi=150,
+                bbox_inches="tight", facecolor="#0d1117")
+    plt.close()
+# ── 5. Main ─────────────────────────────────────────────────
+def main():
+    print("="*70)
+    print("EcoCart Route Optimisation — A* vs IDA* benchmark")
+    print("="*70)
+    # Smoke test all four
+    for name, fn in [("BFS", bfs), ("DFS", dfs), ("A*", astar), ("IDA*", ida_star)]:
+        path, cost, exp = fn("U1", "U10")
+        print(f"  {name:5s} U1->U10  cost={cost:.2f} km  expanded={exp}")
+    # Full benchmark A* vs IDA*
+    results = {"urban": [], "rural": []}
+    for label, pairs in [("urban", OD_URBAN), ("rural", OD_RURAL)]:
+        print(f"\n--- {label.upper()} benchmark ---")
+        for s, g in pairs:
+            a = benchmark(astar, s, g)
+            i = benchmark(ida_star, s, g)
+            results[label].append({"pair": (s, g), "astar": a, "ida": i})
+            print(f"  {s}->{g}:  A* {a['cost']:.2f}km/{a['expanded']}exp/{a['ms']:.3f}ms  "
+                  f"IDA* {i['cost']:.2f}km/{i['expanded']}exp/{i['ms']:.3f}ms")
+            assert abs(a["cost"] - i["cost"]) < 1e-4, "Optimality mismatch"
+    # Green routing demo
+    print("\n--- GREEN ROUTING ---")
+    for s, g in [("U1","R9"), ("U7","R6")]:
+        fp, fk, _ = astar(s, g, ADJ_KM)
+        gp, gc, _ = astar(s, g, ADJ_CO2, h_scale=0.10)
+        fco2 = sum(_edge_w(fp[i], fp[i+1], ADJ_CO2) for i in range(len(fp)-1))
+        gkm  = sum(_edge_w(gp[i], gp[i+1], ADJ_KM) for i in range(len(gp)-1))
+        print(f"  {s}->{g}  Fast: {fk:.1f}km/{fco2:.2f}kgCO2  Green: {gkm:.1f}km/{gc:.2f}kgCO2")
+    # Generate plots
+    plot_network()
+    plot_comparison(results)
+    plot_green_vs_fast()
+    print("\nWrote: output/network_map.png, output/algo_comparison.png, output/green_vs_fast.png")
+if __name__ == "__main__":
+    main()
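
The `astar` function above relies on `heuristic`, `NODES` and `ADJ_KM`, which are defined outside this excerpt. To make the search loop easier to check in isolation, here is a minimal, self-contained sketch of the same pattern on a toy graph. The coordinates, edge weights and the names `COORDS`, `TOY_EDGES`, `build_adj`, `euclid` and `toy_astar` are all hypothetical and are not part of the EcoCart code; the Euclidean heuristic is assumed from the README description.

```python
import heapq
import math

# Hypothetical toy network: node -> (x, y) in km. Not the EcoCart NODES table.
COORDS = {"A": (0, 0), "B": (3, 0), "C": (3, 4), "D": (0, 4)}
# Undirected edges. Every weight is at least the straight-line distance
# between its endpoints, so the Euclidean heuristic never overestimates.
TOY_EDGES = [("A", "B", 3.0), ("B", "C", 4.0), ("A", "D", 4.0), ("D", "C", 3.5)]


def build_adj(edges):
    """Build an adjacency list {node: [(neighbour, weight), ...]}."""
    adj = {}
    for a, b, w in edges:
        adj.setdefault(a, []).append((b, w))
        adj.setdefault(b, []).append((a, w))
    return adj


def euclid(a, b, scale=1.0):
    """Straight-line distance between two nodes, optionally scaled down."""
    (x1, y1), (x2, y2) = COORDS[a], COORDS[b]
    return scale * math.hypot(x2 - x1, y2 - y1)


def toy_astar(start, goal, adj, h_scale=1.0):
    """A* with a tie-breaking counter, mirroring the structure of astar above."""
    counter = 0
    heap = [(euclid(start, goal, h_scale), 0.0, counter, start, [start])]
    best = {start: 0.0}
    while heap:
        _, g, _, node, path = heapq.heappop(heap)
        if node == goal:
            return path, g
        if g > best.get(node, math.inf):
            continue  # stale heap entry: a cheaper route to node was found
        for nb, w in adj[node]:
            ng = g + w
            if ng < best.get(nb, math.inf):
                best[nb] = ng
                counter += 1
                heapq.heappush(
                    heap, (ng + euclid(nb, goal, h_scale), ng, counter, nb, path + [nb])
                )
    return None, math.inf


adj = build_adj(TOY_EDGES)
print(toy_astar("A", "C", adj))               # route via B, cost 7.0
print(toy_astar("A", "C", adj, h_scale=0.0))  # h = 0 reduces A* to Dijkstra
```

Scaling the heuristic down keeps it admissible when the edge costs are in a smaller unit than the coordinates. The `h_scale=0.10` passed in the green-routing calls appears to serve that purpose for the CO2 weights, although that cannot be confirmed without seeing `heuristic` and `ADJ_CO2`.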

task5_forecasting.py ADDED

	@@ -0,0 +1,139 @@

+"""
+EcoCart Demand Forecasting Prototype
+Task 5 — Linear Regression vs Random Forest on synthetic daily sales.
+NCI MSCAI | Fundamentals of AI TABA 2026
+Run:  python3 task5_forecasting.py
+Out:  output/forecast.png, output/residuals.png, output/feature_importance.png
+"""
+import os
+import numpy as np
+import pandas as pd
+import matplotlib.pyplot as plt
+from sklearn.linear_model import LinearRegression
+from sklearn.ensemble import RandomForestRegressor
+from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
+RNG = np.random.default_rng(42)
+os.makedirs("output", exist_ok=True)  # plt.savefig does not create missing directories
+# ── 1. Synthetic sales data ────────────────────────────────
+def generate_sales(days=730):
+    t = np.arange(days)
+    dates = pd.date_range("2023-01-01", periods=days, freq="D")
+    base   = 100 + 0.05 * t
+    weekly = 25 * np.sin(2 * np.pi * t / 7)
+    yearly = 40 * np.sin(2 * np.pi * t / 365)
+    noise  = RNG.normal(0, 8, days)
+    promo  = np.zeros(days)
+    promo[RNG.choice(days, int(days * 0.06), replace=False)] = RNG.uniform(30, 70, int(days * 0.06))
+    sales = np.clip(base + weekly + yearly + noise + promo, 0, None)
+    return pd.DataFrame({
+        "date": dates, "sales": sales,
+        "dow": dates.dayofweek, "month": dates.month,
+        "day_of_year": dates.dayofyear,
+        "is_promo": (promo > 0).astype(int),
+    })
+# ── 2. Features ────────────────────────────────────────────
+def add_features(df):
+    out = df.copy()
+    for lag in [1, 7, 14]:
+        out[f"lag_{lag}"] = out["sales"].shift(lag)
+    out["roll_7"]  = out["sales"].shift(1).rolling(7).mean()
+    out["roll_30"] = out["sales"].shift(1).rolling(30).mean()
+    return out.dropna().reset_index(drop=True)
+FEATURES = ["dow", "month", "day_of_year", "is_promo",
+            "lag_1", "lag_7", "lag_14", "roll_7", "roll_30"]
+# ── 3. Train & evaluate ───────────────────────────────────
+def evaluate(name, y_true, y_pred):
+    mae  = mean_absolute_error(y_true, y_pred)
+    rmse = mean_squared_error(y_true, y_pred) ** 0.5
+    r2   = r2_score(y_true, y_pred)
+    mape = np.mean(np.abs((y_true - y_pred) / np.where(y_true == 0, 1, y_true))) * 100
+    print(f"  {name:<22s}  MAE={mae:6.2f}  RMSE={rmse:6.2f}  R²={r2:.3f}  MAPE={mape:.2f}%")
+    return {"mae": mae, "rmse": rmse, "r2": r2, "mape": mape}
+def main():
+    print("="*70)
+    print("EcoCart Demand Forecasting — LR vs Random Forest")
+    print("="*70)
+    df = generate_sales()
+    df = add_features(df)
+    split = int(len(df) * 0.8)
+    train, test = df.iloc[:split], df.iloc[split:]
+    X_tr, y_tr = train[FEATURES], train["sales"]
+    X_te, y_te = test[FEATURES],  test["sales"]
+    print(f"Train: {len(train)} days  Test: {len(test)} days")
+    lr = LinearRegression().fit(X_tr, y_tr)
+    rf = RandomForestRegressor(n_estimators=200, max_depth=12,
+                               min_samples_leaf=3, random_state=42,
+                               n_jobs=-1).fit(X_tr, y_tr)
+    lr_pred = lr.predict(X_te)
+    rf_pred = rf.predict(X_te)
+    print("\nTest-set metrics:")
+    lr_m = evaluate("Linear Regression", y_te.values, lr_pred)
+    rf_m = evaluate("Random Forest",     y_te.values, rf_pred)
+    # ── Plots ──
+    plt.rcParams.update({"axes.facecolor":"#0d1117","figure.facecolor":"#0d1117",
+                         "text.color":"white","axes.labelcolor":"white",
+                         "xtick.color":"white","ytick.color":"white"})
+    # Forecast
+    fig, ax = plt.subplots(figsize=(13, 5))
+    ax.plot(test.date, y_te, color="#e2e8f0", lw=1.3, label="Actual")
+    ax.plot(test.date, lr_pred, color="#3b82f6", lw=1, alpha=0.8, label="Linear Regression")
+    ax.plot(test.date, rf_pred, color="#10b981", lw=1, alpha=0.8, label="Random Forest")
+    ax.set_title("Test-set: actual vs predicted daily demand", fontsize=12)
+    ax.set_xlabel("Date"); ax.set_ylabel("Units sold")
+    ax.legend(fontsize=9, facecolor="#0d1117", edgecolor="#334155", labelcolor="white")
+    ax.grid(True, alpha=0.1)
+    plt.tight_layout()
+    plt.savefig("output/forecast.png", dpi=150, bbox_inches="tight")
+    plt.close()
+    # Residuals
+    fig, axes = plt.subplots(1, 2, figsize=(13, 4.5))
+    for ax, pred, name, color, m in [
+        (axes[0], lr_pred, "Linear Regression", "#3b82f6", lr_m),
+        (axes[1], rf_pred, "Random Forest",     "#10b981", rf_m),
+    ]:
+        ax.scatter(pred, y_te.values - pred, s=12, c=color, alpha=0.6)
+        ax.axhline(0, color="white", lw=0.8)
+        ax.set_title(f"{name} residuals (RMSE={m['rmse']:.2f})", fontsize=11)
+        ax.set_xlabel("Predicted"); ax.set_ylabel("Residual")
+        ax.grid(True, alpha=0.1)
+    plt.tight_layout()
+    plt.savefig("output/residuals.png", dpi=150, bbox_inches="tight")
+    plt.close()
+    # Feature importance
+    imp = pd.Series(rf.feature_importances_, index=FEATURES).sort_values()
+    fig, ax = plt.subplots(figsize=(8, 4.5))
+    ax.barh(imp.index, imp.values, color="#10b981")
+    ax.set_title("Random Forest — feature importance", fontsize=12)
+    ax.set_xlabel("Importance")
+    ax.grid(True, axis="x", alpha=0.1)
+    plt.tight_layout()
+    plt.savefig("output/feature_importance.png", dpi=150, bbox_inches="tight")
+    plt.close()
+    print(f"\nTop features: {', '.join(imp.index[-3:][::-1])}")
+    print("Wrote: output/forecast.png, output/residuals.png, output/feature_importance.png")
+if __name__ == "__main__":
+    main()
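
The script compares Linear Regression and Random Forest only against each other. A seasonal-naive baseline (predict each day with the value from seven days earlier) gives a reference error that both models would be expected to beat. The sketch below is a hedged illustration in pure Python: `seasonal_naive_mae` and the noise-free `toy` series are hypothetical and reuse only the trend and weekly terms from `generate_sales`.

```python
import math


def seasonal_naive_mae(series, season=7):
    """MAE of predicting each value with the one `season` steps earlier."""
    errors = [abs(series[t] - series[t - season]) for t in range(season, len(series))]
    return sum(errors) / len(errors)


# Noise-free toy series: linear trend plus weekly cycle, as in generate_sales.
toy = [100 + 0.05 * t + 25 * math.sin(2 * math.pi * t / 7) for t in range(70)]

baseline_mae = seasonal_naive_mae(toy)
print(f"seasonal-naive MAE = {baseline_mae:.2f}")
```

On this toy series the weekly term cancels exactly, so the baseline error comes only from the trend (0.05 units per day over 7 days, about 0.35). On the real synthetic data the yearly term, noise and promotions would raise it, and the same function could be applied to the test-period `sales` values for a like-for-like comparison.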