Scribbler310 committed on
Commit
a985b94
·
0 Parent(s):

Production deployment with LFS models

This view is limited to 50 files because it contains too many changes.
Files changed (50)
  1. .dockerignore +9 -0
  2. .gitattributes +8 -0
  3. .gitignore +28 -0
  4. Dockerfile +21 -0
  5. README.md +87 -0
  6. backend/Dockerfile +19 -0
  7. backend/ingest_knowledge.py +65 -0
  8. backend/main.py +257 -0
  9. backend/requirements.txt +8 -0
  10. dataset.yaml +19 -0
  11. docker-compose.yml +23 -0
  12. frontend/.dockerignore +3 -0
  13. frontend/.gitignore +24 -0
  14. frontend/Assets/Gorilla_Chest_Thumping_Animation_Generated.mp4 +3 -0
  15. frontend/Assets/freesound_community-monkey-30631.mp3 +3 -0
  16. frontend/Assets/g.png +3 -0
  17. frontend/Dockerfile +20 -0
  18. frontend/README.md +16 -0
  19. frontend/eslint.config.js +29 -0
  20. frontend/index.html +16 -0
  21. frontend/package-lock.json +0 -0
  22. frontend/package.json +31 -0
  23. frontend/public/favicon.svg +1 -0
  24. frontend/public/icons.svg +24 -0
  25. frontend/src/App.css +184 -0
  26. frontend/src/App.jsx +68 -0
  27. frontend/src/apiConfig.js +3 -0
  28. frontend/src/assets/hero.png +3 -0
  29. frontend/src/assets/react.svg +1 -0
  30. frontend/src/assets/vite.svg +1 -0
  31. frontend/src/components/ChatBot.jsx +131 -0
  32. frontend/src/components/HistoricalAnalytics.jsx +150 -0
  33. frontend/src/components/KPICard.jsx +14 -0
  34. frontend/src/components/MaterialPredictor.jsx +73 -0
  35. frontend/src/index.css +490 -0
  36. frontend/src/main.jsx +10 -0
  37. frontend/vite.config.js +7 -0
  38. middleware/EDA_wafer_control_db.ipynb +604 -0
  39. middleware/__init__.py +0 -0
  40. middleware/best.pt +3 -0
  41. middleware/dashboard.py +369 -0
  42. middleware/database.py +0 -0
  43. middleware/material_model.pkl +3 -0
  44. middleware/material_predictor.py +229 -0
  45. middleware/robot_controller.py +278 -0
  46. middleware/wafer_control.db +3 -0
  47. notebooks/01_data_exploration.ipynb +399 -0
  48. requirements.txt +16 -0
  49. src/__init__.py +0 -0
  50. src/batch_inference.py +19 -0
.dockerignore ADDED
@@ -0,0 +1,9 @@
+venv/
+env/
+.env
+backend/chroma_db/
+backend/__pycache__/
+middleware/__pycache__/
+frontend/node_modules/
+frontend/dist/
+.git/
.gitattributes ADDED
@@ -0,0 +1,8 @@
+*.db filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.png filter=lfs diff=lfs merge=lfs -text
+*.jpg filter=lfs diff=lfs merge=lfs -text
+*.mp4 filter=lfs diff=lfs merge=lfs -text
+*.mp3 filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.cache filter=lfs diff=lfs merge=lfs -text
.gitignore ADDED
@@ -0,0 +1,28 @@
+# Virtual Environment
+venv/
+env/
+.env
+
+# Python Caches
+__pycache__/
+*.py[cod]
+*$py.class
+
+# Large data & model files
+data/
+runs/
+*.cache
+*.db-journal
+*.sqlite3-journal
+
+# Keep these for production
+!middleware/wafer_control.db
+!middleware/material_model.pkl
+!middleware/best.pt
+
+
+# OS generated files
+.DS_Store
+
+# Vector DB
+backend/chroma_db/
Dockerfile ADDED
@@ -0,0 +1,21 @@
+FROM python:3.12-slim
+
+WORKDIR /app
+
+# Install system dependencies
+RUN apt-get update && apt-get install -y build-essential && rm -rf /var/lib/apt/lists/*
+
+# Copy and install dependencies
+COPY backend/requirements.txt ./backend/
+RUN pip install --no-cache-dir -r backend/requirements.txt
+
+# Copy everything needed
+COPY backend/ ./backend/
+COPY middleware/ ./middleware/
+COPY runs/ ./runs/
+
+# Hugging Face requires port 7860
+EXPOSE 7860
+
+# Start the server
+CMD ["uvicorn", "backend.main:app", "--host", "0.0.0.0", "--port", "7860"]
README.md ADDED
@@ -0,0 +1,87 @@
+# Semiconductor Wafer Defect Detection: End-to-End AI Pipeline
+
+## Project Overview
+This project is a complete, end-to-end Applied AI pipeline designed for the semiconductor manufacturing industry. It takes raw numeric array data representing defective semiconductor wafers, converts it into a computer vision dataset, trains a custom YOLOv8 object detection model, and feeds the results into a predictive material waste model and real-time dashboard.
+
+**Final YOLOv8 Model Performance:** `0.962 mAP@50` (mean average precision at an IoU threshold of 0.5, measured on unseen validation data).
+**Predictive Waste Model Performance:** `R² = 0.9637` (highly accurate material waste prediction).
+
+## Business Value
+In semiconductor fabrication, identifying microscopic defects early in the manufacturing process saves millions in scrapped materials. This project automates quality control by transitioning from manual coordinate analysis to real-time, AI-driven visual defect detection, while simultaneously forecasting future material waste to optimize supply chain planning.
+
+## The Technical Pipeline
+
+### Phase 1: Data Engineering (`src/data_prep.py`)
+* **The Challenge:** The original dataset consisted of raw `.txt` files containing numeric 2D arrays (0=background, 1=good chip, 2=defect). YOLOv8 cannot read text arrays; it requires image files and normalized bounding box coordinates.
+* **The Solution:** Built a custom Python pipeline using `NumPy` and `OpenCV` to parse over 25,000 text files.
+* **The Math:** Programmatically identified the spatial extremes (`xmin`, `ymin`, `xmax`, `ymax`) of the `2` values, normalized them to YOLO's `0.0 - 1.0` format, and rendered high-contrast `.jpg` images alongside corresponding `.txt` label files.
+
+### Phase 2: Dataset Architecture (`src/split_data.py`)
+* Used `scikit-learn` to perform an 80/20 train/validation split.
+* Programmatically generated the directory layout required by YOLO, moving over 50,000 individual files into structured `train` and `val` directories.
+
+### Phase 3: Model Training (`src/model_train.py`)
+* Initialized a pre-trained **YOLOv8 Nano** (`yolov8n.pt`) model for lightweight, high-speed inference.
+* Trained on 20,415 wafer images for 10 epochs.
+* Mapped 8 specific manufacturing defect classes (Center, Donut, Edge-Loc, Edge-Ring, Loc, Random, Scratch, Near-full).
+
+### Phase 4: Batch Inference & Evaluation (`src/batch_inference.py` & `src/model_eval.py`)
+* Deployed the custom-trained `best.pt` weights to run batch inference on unseen validation images.
+* The model drew bounding boxes and assigned confidence scores without manual intervention.
+
+### Phase 5: Production Middleware, Predictive Modeling & Dashboard
+* **Robotic Scanner Simulation (`middleware/robot_controller.py`):** Operates on a hybrid dataset of **823,953 wafers** (Mixed-type + WM-811K datasets) with a realistic 95.5% pass rate. It automatically routes passed wafers and runs YOLOv8 inference on defective ones, logging everything into a centralized SQLite database (`wafer_control.db`).
+* **Material Waste Predictor (`middleware/material_predictor.py`):** A Random Forest Regressor trained on the historical scan database. It predicts the average percentage of material wasted within defective wafers, allowing fabs to estimate future material needs.
+* **Real-time Dashboard (`middleware/dashboard.py`):** A **Plotly Dash** web application that visualizes historical defect rates, defect distributions, and routing actions, and provides interactive material forecasting inputs.
+
+## Upcoming Feature: LLM Troubleshooting Assistant (Planned)
+**Goal:** Integrate an intelligent Large Language Model (LLM) bot to assist fab engineers directly on the factory floor.
+* **Functionality:** When the dashboard flags a sudden spike in a specific defect type (e.g., "Edge-Ring" defects), the engineer can consult the LLM bot.
+* **Use Case:** The bot will analyze the defect trends, cross-reference historical manufacturing guidelines, and suggest potential root causes (such as misaligned etching tools or incorrect gas pressure), reducing troubleshooting time and downtime.
+*(Note: This feature is currently in the design phase and not yet implemented).*
+
+## Performance Metrics
+The YOLOv8 model achieved strong results on the held-out validation set:
+
+| Metric | Score | Note |
+| :--- | :--- | :--- |
+| **mAP50 (All Classes)** | **96.2%** | Mean average precision across all classes at an IoU threshold of 0.5. |
+| **Recall** | **93.1%** | The model located 93.1% of the labeled defects. |
+| **Edge-Ring (mAP50)** | **99.4%** | Near-flawless detection of Edge-Ring anomalies. |
+
+The Random Forest Material Waste Predictor achieved:
+| Metric | Score | Note |
+| :--- | :--- | :--- |
+| **R² Score** | **0.9637** | The model explains about 96% of the variance in the target. |
+| **MAE** | **0.09%** | Average absolute prediction error is under one-tenth of a percentage point. |
+
+## Tech Stack
+* **Languages:** Python
+* **Computer Vision:** Ultralytics (YOLOv8), OpenCV (`cv2`)
+* **Machine Learning & Data:** Pandas, NumPy, Scikit-learn, SQLite
+* **Web UI & Visualization:** Plotly, Dash
+
+## Deployment (Docker)
+
+This application is fully containerized for easy deployment.
+
+1. **Clone the repository:**
+```bash
+git clone https://github.com/Udayan2001/Semiconductor_defect_detection.git
+cd Semiconductor_defect_detection
+```
+2. **Add API Key:**
+Create a `.env` file in the `backend/` directory and add your Google Gemini API key:
+```
+GEMINI_API_KEY=your_api_key_here
+```
+3. **Start the Application:**
+Run the following command from the root directory to build and start both the backend and frontend servers:
+```bash
+docker compose up --build
+```
+4. **Access the Dashboard:**
+Open your browser and navigate to `http://localhost:5173`.
+
+---
+*Designed and engineered by Udayan Shashank Shukla.*
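Note on Phase 1 of the README: `src/data_prep.py` is not part of the visible diff, so the following is only a minimal sketch of the bounding-box step it describes. It assumes each array cell maps to one pixel, that a single box is drawn around all cells with value `2`, and it uses plain lists instead of NumPy; the function name `yolo_box_from_grid` is hypothetical.

```python
from typing import List, Optional, Tuple


def yolo_box_from_grid(
    grid: List[List[int]], defect_value: int = 2
) -> Optional[Tuple[float, float, float, float]]:
    """Return (x_center, y_center, width, height), each normalized to 0.0-1.0.

    Hypothetical helper: one box around every cell equal to defect_value.
    Returns None when the grid contains no defect cells.
    """
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    cells = [
        (r, c)
        for r in range(rows)
        for c in range(cols)
        if grid[r][c] == defect_value
    ]
    if not cells:
        return None

    # Spatial extremes; the +1 makes the max edge exclusive so a single
    # defect cell still yields a box that is one cell wide and tall.
    ymin = min(r for r, _ in cells)
    ymax = max(r for r, _ in cells) + 1
    xmin = min(c for _, c in cells)
    xmax = max(c for _, c in cells) + 1

    return (
        (xmin + xmax) / 2 / cols,  # x_center
        (ymin + ymax) / 2 / rows,  # y_center
        (xmax - xmin) / cols,      # width
        (ymax - ymin) / rows,      # height
    )


wafer = [
    [0, 1, 1, 0],
    [1, 2, 2, 1],
    [1, 1, 2, 1],
    [0, 1, 1, 0],
]
box = yolo_box_from_grid(wafer)
print(box)  # (0.5, 0.5, 0.5, 0.5)
```

A YOLO label line would then be the class index followed by these four values, separated by spaces.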
backend/Dockerfile ADDED
@@ -0,0 +1,19 @@
+FROM python:3.12-slim
+
+WORKDIR /app
+
+# Install essential system dependencies
+RUN apt-get update && apt-get install -y build-essential && rm -rf /var/lib/apt/lists/*
+
+# Copy and install python dependencies
+COPY backend/requirements.txt ./backend/
+RUN pip install --no-cache-dir -r backend/requirements.txt
+
+# Copy backend and middleware code
+COPY backend/ ./backend/
+COPY middleware/ ./middleware/
+
+EXPOSE 8000
+
+# Run ingestion first to ensure vector DB is seeded, then start the server
+CMD ["sh", "-c", "python backend/ingest_knowledge.py && uvicorn backend.main:app --host 0.0.0.0 --port 8000"]
backend/ingest_knowledge.py ADDED
@@ -0,0 +1,65 @@
+import os
+import chromadb
+from chromadb.config import Settings
+
+# Engineering knowledge base regarding semiconductor wafer defects
+KNOWLEDGE_BASE = [
+    {
+        "id": "defect_edge_ring",
+        "title": "Edge-Ring Defect Troubleshooting",
+        "content": "Edge-Ring defects typically appear as a continuous ring of failing dies around the outer edge of the wafer. Common Root Causes: 1. Uneven gas distribution in the etching chamber. 2. Non-uniform chuck temperature during deposition or etching. 3. Edge-bead removal issues during photolithography. Recommended Action: Inspect gas flow regulators and recalibrate chuck temperature sensors. Schedule maintenance for edge-bead removal module."
+    },
+    {
+        "id": "defect_center",
+        "title": "Center Defect Troubleshooting",
+        "content": "Center defects are concentrated in the middle of the wafer. Common Root Causes: 1. Poor spin-coating uniformity (photoresist pooling in the center). 2. Center-heavy deposition profile. 3. Excessive center heating on the electrostatic chuck. Recommended Action: Verify spin speed and acceleration in the coating track. Check gas showerhead for clogging in the center region."
+    },
+    {
+        "id": "defect_scratch",
+        "title": "Scratch Defect Troubleshooting",
+        "content": "Scratch defects manifest as linear patterns of failing dies, often crossing the wafer. Common Root Causes: 1. Mechanical handling damage by robotic arms or end-effectors. 2. Particulate contamination causing dragging during CMP (Chemical Mechanical Polishing). 3. Cassette or FOUP abrasion. Recommended Action: Check robot alignment and end-effector cleanliness. Inspect CMP pad conditioning and slurry filtration system."
+    },
+    {
+        "id": "defect_donut",
+        "title": "Donut Defect Troubleshooting",
+        "content": "Donut defects appear as a ring, but not at the very edge (like Edge-Ring), leaving the center and extreme edge relatively clean. Common Root Causes: 1. Radially dependent temperature non-uniformity during rapid thermal processing (RTP). 2. Specific gas flow dynamics creating standing waves or depletion zones in the chamber. Recommended Action: Recalibrate RTP lamp zones. Inspect gas showerhead and exhaust pumping symmetry."
+    },
+    {
+        "id": "general_forecast_strategy",
+        "title": "Material Forecast & Yield Strategy",
+        "content": "When the predicted material waste percentage rises above 5%, the factory must proactively increase raw material orders (wafers, photoresist, precursor gases) for the next quarter to compensate for the lower yield. High fail rates typically necessitate a temporary slow-down of production throughput to allow for deep tool maintenance and recalibration."
+    }
+]
+
+def ingest_data():
+    print("Initializing ChromaDB Persistent Client...")
+    db_path = os.path.join(os.path.dirname(__file__), "chroma_db")
+    client = chromadb.PersistentClient(path=db_path)
+
+    # Create or get collection
+    collection = client.get_or_create_collection(
+        name="semiconductor_knowledge",
+        metadata={"hnsw:space": "cosine"}
+    )
+
+    # Clear existing data if any (for idempotency)
+    existing_ids = collection.get()['ids']
+    if existing_ids:
+        collection.delete(ids=existing_ids)
+
+    # Prepare data for insertion
+    ids = [item['id'] for item in KNOWLEDGE_BASE]
+    documents = [item['content'] for item in KNOWLEDGE_BASE]
+    metadatas = [{"title": item['title']} for item in KNOWLEDGE_BASE]
+
+    print(f"Adding {len(documents)} documents to the knowledge base...")
+    collection.add(
+        documents=documents,
+        metadatas=metadatas,
+        ids=ids
+    )
+
+    print("Ingestion complete. ChromaDB is ready.")
+
+if __name__ == "__main__":
+    ingest_data()
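Note on the `"hnsw:space": "cosine"` setting and the `n_results=2` query used later in `backend/main.py`: the sketch below illustrates what ranking by cosine similarity and keeping the top two hits means. It is an assumption-laden stand-in, not how ChromaDB works internally: ChromaDB embeds text with a model, whereas this toy uses word counts, and the names `bag_of_words`, `cosine_similarity`, and `top_k` are hypothetical. The document texts are shortened from the knowledge base above.

```python
import math
import re
from collections import Counter
from typing import Dict, List


def bag_of_words(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (a stand-in for a real model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, docs: Dict[str, str], k: int = 2) -> List[str]:
    """Return the ids of the k documents most similar to the query."""
    query_vec = bag_of_words(query)
    ranked = sorted(
        docs,
        key=lambda doc_id: cosine_similarity(query_vec, bag_of_words(docs[doc_id])),
        reverse=True,
    )
    return ranked[:k]


docs = {
    "defect_edge_ring": "Edge-Ring defects appear as a continuous ring of "
                        "failing dies around the outer edge of the wafer.",
    "defect_scratch": "Scratch defects manifest as linear patterns of "
                      "failing dies, often crossing the wafer.",
    "defect_center": "Center defects are concentrated in the middle of the wafer.",
}
hits = top_k("why do I see a ring at the wafer edge", docs, k=2)
print(hits)
```

The two returned ids play the role of the two documents that the chat endpoint joins into its retrieved context.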
backend/main.py ADDED
@@ -0,0 +1,257 @@
+import sys
+import os
+import pickle
+import sqlite3
+import pandas as pd
+from fastapi import FastAPI
+from fastapi.middleware.cors import CORSMiddleware
+from pydantic import BaseModel
+from dotenv import load_dotenv
+from google import genai
+import chromadb
+from typing import List, Dict
+
+env_path = os.path.join(os.path.dirname(__file__), '.env')
+load_dotenv(env_path)
+
+# Add parent dir to path so we can import from middleware
+sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))
+from middleware.material_predictor import predict_material_needs
+
+app = FastAPI(title="Wafer Defect API")
+
+app.add_middleware(
+    CORSMiddleware,
+    allow_origins=["*"],
+    allow_credentials=True,
+    allow_methods=["*"],
+    allow_headers=["*"],
+)
+
+# Robust paths for Docker/Hosting
+BASE_DIR = os.path.dirname(os.path.abspath(__file__))
+DB_PATH = os.path.join(BASE_DIR, '..', 'middleware', 'wafer_control.db')
+MODEL_PATH = os.path.join(BASE_DIR, '..', 'middleware', 'material_model.pkl')
+CHROMA_PATH = os.path.join(BASE_DIR, 'chroma_db')
+
+# Ensure directories exist
+os.makedirs(CHROMA_PATH, exist_ok=True)
+
+DEFECT_COLORS = {
+    'Center': '#ef4444', 'Donut': '#f59e0b', 'Edge-Loc': '#10b981',
+    'Edge-Ring': '#3b82f6', 'Loc': '#8b5cf6', 'Random': '#ec4899',
+    'Scratch': '#06b6d4', 'Near-full': '#f97316', 'None': '#6b7280',
+    'Undetected': '#374151',
+}
+
+# Globally load data so we don't block requests
+df = pd.DataFrame()
+if os.path.exists(DB_PATH):
+    print(f"Loading DB from {DB_PATH}...")
+    conn = sqlite3.connect(DB_PATH)
+    df = pd.read_sql_query("SELECT * FROM wafer_logs", conn)
+    conn.close()
+    df['scan_time'] = pd.to_datetime(df['scan_time'])
+    df['scan_date'] = df['scan_time'].dt.date
+else:
+    print(f"Warning: DB not found at {DB_PATH}. Dashboard will be empty.")
+
+# Setup Vector DB and LLM
+print(f"Connecting to ChromaDB at {CHROMA_PATH}...")
+try:
+    chroma_client = chromadb.PersistentClient(path=CHROMA_PATH)
+    collection = chroma_client.get_or_create_collection(name="semiconductor_knowledge")
+except Exception as e:
+    print(f"Warning: Could not connect to ChromaDB collection. Error: {e}")
+    collection = None
+
+print("Initializing Gemini API...")
+gemini_client = None
+if os.getenv("GEMINI_API_KEY"):
+    gemini_client = genai.Client(api_key=os.getenv("GEMINI_API_KEY"))
+else:
+    print("Warning: GEMINI_API_KEY not found in environment.")
+
+print("Loading ML model...")
+model_pkg = None
+if os.path.exists(MODEL_PATH):
+    with open(MODEL_PATH, 'rb') as f:
+        model_pkg = pickle.load(f)
+
+@app.get("/api/kpi")
+def get_kpis():
+    total_scans = len(df)
+    fail_df = df[df['status'] == 'FAIL']
+    fail_count = len(fail_df)
+    pass_count = len(df[df['status'] == 'PASS'])
+    pass_rate = round((pass_count / total_scans) * 100, 1) if total_scans else 0
+    scrap_count = len(df[df['action'] == 'ROUTE_TO_SCRAP'])
+    avg_waste = round(fail_df['material_wasted_pct'].mean(), 2) if fail_count else 0
+    avg_confidence = round(fail_df['confidence'].mean(), 2) if fail_count else 0
+
+    return {
+        "total_scans": total_scans,
+        "pass_count": pass_count,
+        "pass_rate": pass_rate,
+        "fail_count": fail_count,
+        "fail_rate": round(100 - pass_rate, 1),
+        "scrap_count": scrap_count,
+        "avg_waste": avg_waste,
+        "avg_confidence": avg_confidence
+    }
+
+@app.get("/api/charts/defects")
+def get_defects():
+    fail_df = df[df['status'] == 'FAIL']
+    defect_counts = fail_df['defect_type'].value_counts().reset_index()
+    defect_counts.columns = ['defect_type', 'count']
+
+    gt_counts = fail_df['ground_truth'].value_counts().reset_index()
+    gt_counts.columns = ['ground_truth', 'count']
+
+    return {
+        "predictions": defect_counts.to_dict(orient="records"),
+        "ground_truth": gt_counts.head(15).to_dict(orient="records")
+    }
+
+@app.get("/api/charts/waste")
+def get_waste():
+    fail_df = df[df['status'] == 'FAIL']
+
+    waste_by_type = fail_df.groupby('defect_type').agg(
+        total_waste=('material_wasted_pct', lambda x: x.sum() / 100.0)
+    ).reset_index().sort_values('total_waste', ascending=True)
+
+    action_counts = df['action'].value_counts().reset_index()
+    action_counts.columns = ['action', 'count']
+
+    return {
+        "waste_by_type": waste_by_type.to_dict(orient="records"),
+        "actions": action_counts.to_dict(orient="records")
+    }
+
+@app.get("/api/charts/trends")
+def get_trends():
+    daily = df.groupby('scan_date').agg(
+        scans=('id', 'count'),
+        fails=('status', lambda x: (x == 'FAIL').sum()),
+        waste=('material_wasted_pct', lambda x: x.sum() / 100.0)
+    ).reset_index()
+    daily['fail_rate'] = round((daily['fails'] / daily['scans']) * 100, 1)
+
+    return {
+        "dates": daily['scan_date'].astype(str).tolist(),
+        "fail_rate": daily['fail_rate'].tolist(),
+        "waste": daily['waste'].tolist()
+    }
+
+@app.get("/api/model/status")
+def model_status():
+    if not model_pkg:
+        return {"loaded": False}
+
+    m = model_pkg['metrics']
+    imp = model_pkg['metrics']['importances']
+    imp_df = pd.DataFrame({'feature': list(imp.keys()), 'importance': list(imp.values())})
+    imp_df = imp_df.sort_values('importance', ascending=True).tail(10)
+
+    return {
+        "loaded": True,
+        "metrics": {"r2": round(m['r2'], 4), "mae": round(m['mae'], 2)},
+        "importance": imp_df.to_dict(orient="records")
+    }
+
+class PredictionRequest(BaseModel):
+    scans: int
+    fail_rate: float
+
+@app.post("/api/predict")
+def predict_waste(req: PredictionRequest):
+    if not model_pkg:
+        return {"error": "No model loaded"}
+
+    fail_df = df[df['status'] == 'FAIL']
+    dist = fail_df['defect_type'].value_counts(normalize=True).to_dict()
+
+    pred = predict_material_needs(model_pkg['model'], model_pkg['feature_cols'], req.scans, req.fail_rate / 100.0, dist)
+    pred['fail_rate'] = req.fail_rate
+    return pred
+
+
+class ChatMessage(BaseModel):
+    role: str
+    content: str
+
+class ChatRequest(BaseModel):
+    messages: List[ChatMessage]
+
+@app.post("/api/chat")
+def chat_with_bot(req: ChatRequest):
+    if not gemini_client:
+        return {"error": "Gemini API key not configured"}
+
+    user_message = req.messages[-1].content if req.messages else ""
+
+    # 1. RAG Retrieval from ChromaDB
+    context_docs = ""
+    if collection and user_message:
+        try:
+            results = collection.query(query_texts=[user_message], n_results=2)
+            if results and results['documents'] and results['documents'][0]:
+                context_docs = "\n".join(results['documents'][0])
+        except Exception as e:
+            print(f"ChromaDB Query Error: {e}")
+
+    # 2. Get Live Dashboard Context
+    total_scans = len(df)
+    fail_df = df[df['status'] == 'FAIL']
+    fail_count = len(fail_df)
+    pass_rate = round(((total_scans - fail_count) / total_scans) * 100, 1) if total_scans else 0
+    top_defects = fail_df['defect_type'].value_counts().head(3).to_dict()
+
+    live_kpis = f"""
+    Current Dashboard State:
+    - Total Wafers Scanned: {total_scans}
+    - Current Pass Rate: {pass_rate}%
+    - Total Defective Wafers: {fail_count}
+    - Top Defect Types Right Now: {top_defects}
+    """
+
+    # 3. Construct System Prompt
+    system_instruction = f"""
+    You are the 'Gorilla Semiconductors Engineering Assistant', an expert semiconductor manufacturing assistant.
+    You help engineers understand dashboard data and troubleshoot wafer defects.
+    Maintain a strictly professional, analytical, and authoritative engineering tone.
+
+    Here is the LIVE DATA from the dashboard:
+    {live_kpis}
+
+    Here is retrieved technical context from our engineering database based on the user's query:
+    {context_docs if context_docs else "No specific engineering docs retrieved."}
+
+    Use the live data to answer questions about 'current status' or 'dashboard'.
+    Use the engineering docs to answer questions about 'why' a defect happens.
+    """
+
+    try:
+        # Convert messages to format expected by google-genai
+        contents = []
+        for msg in req.messages:
+            role = "user" if msg.role == "user" else "model"
+            contents.append(
+                genai.types.Content(role=role, parts=[genai.types.Part.from_text(text=msg.content)])
+            )
+
+        response = gemini_client.models.generate_content(
+            model='gemini-2.5-flash-lite',
+            contents=contents,
+            config=genai.types.GenerateContentConfig(
+                system_instruction=system_instruction,
+                temperature=0.3
+            )
+        )
+        return {"response": response.text}
+    except Exception as e:
+        print(f"Gemini API Error: {e}")
+        return {"error": str(e)}
+
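Note on the KPI arithmetic in `get_kpis`: when the database file is missing, `df` is an empty DataFrame without a `status` column, so the column lookups would likely raise, and `round(100 - pass_rate, 1)` would report a 100% fail rate for zero scans. The sketch below shows the same arithmetic with explicit empty-input guards. It works on plain dictionaries rather than pandas, and the function name `compute_kpis` and the sample rows are hypothetical.

```python
from typing import Dict, List


def compute_kpis(rows: List[Dict[str, object]]) -> Dict[str, float]:
    """Pass/fail KPIs over scan rows, returning zeros for an empty input."""
    total = len(rows)
    fails = [row for row in rows if row.get("status") == "FAIL"]
    pass_count = sum(1 for row in rows if row.get("status") == "PASS")

    pass_rate = round(pass_count / total * 100, 1) if total else 0.0
    # Guarded so that an empty table does not report a 100% fail rate.
    fail_rate = round(100 - pass_rate, 1) if total else 0.0
    avg_waste = (
        round(sum(float(row["material_wasted_pct"]) for row in fails) / len(fails), 2)
        if fails
        else 0.0
    )
    return {
        "total_scans": total,
        "pass_count": pass_count,
        "fail_count": len(fails),
        "pass_rate": pass_rate,
        "fail_rate": fail_rate,
        "avg_waste": avg_waste,
    }


sample_rows = [
    {"status": "PASS", "material_wasted_pct": 0.0},
    {"status": "PASS", "material_wasted_pct": 0.0},
    {"status": "PASS", "material_wasted_pct": 0.0},
    {"status": "FAIL", "material_wasted_pct": 12.5},
]
kpis = compute_kpis(sample_rows)
print(kpis["pass_rate"], kpis["fail_rate"], kpis["avg_waste"])  # 75.0 25.0 12.5
```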
backend/requirements.txt ADDED
@@ -0,0 +1,8 @@
+fastapi>=0.100.0
+uvicorn>=0.22.0
+pandas>=2.0.0
+scikit-learn>=1.3.0
+pydantic>=2.0.0
+google-genai>=0.3.0
+chromadb>=0.4.24
+python-dotenv>=1.0.0
dataset.yaml ADDED
@@ -0,0 +1,19 @@
+# YOLOv8 Dataset Configuration File
+
+# The base path to your dataset folder
+path: data/yolo_dataset
+
+# The subfolders for training and validation images
+train: images/train
+val: images/val
+
+# The 8 defect classes we mapped earlier
+names:
+  0: Center
+  1: Donut
+  2: Edge-Loc
+  3: Edge-Ring
+  4: Loc
+  5: Random
+  6: Scratch
+  7: Near-full
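Note on how the `names` mapping is used: each YOLO label file holds lines of the form `class x_center y_center width height`, with the four box values normalized to the image size. The sketch below decodes one such line back into a class name and a pixel box. The class table is copied from `dataset.yaml`; the function name `decode_label`, the sample line, and the 64x64 image size are assumptions for illustration.

```python
from typing import Dict, Tuple

# Class indices copied from the dataset.yaml above.
CLASS_NAMES: Dict[int, str] = {
    0: "Center", 1: "Donut", 2: "Edge-Loc", 3: "Edge-Ring",
    4: "Loc", 5: "Random", 6: "Scratch", 7: "Near-full",
}


def decode_label(
    line: str, image_width: int, image_height: int
) -> Tuple[str, Tuple[float, float, float, float]]:
    """Turn one YOLO label line into (class_name, (xmin, ymin, xmax, ymax)) in pixels."""
    class_id, *values = line.split()
    x_center, y_center, width, height = (float(v) for v in values)

    xmin = (x_center - width / 2) * image_width
    ymin = (y_center - height / 2) * image_height
    xmax = (x_center + width / 2) * image_width
    ymax = (y_center + height / 2) * image_height
    return CLASS_NAMES[int(class_id)], (xmin, ymin, xmax, ymax)


# Hypothetical label line for a 64x64 wafer image.
name, pixel_box = decode_label("3 0.5 0.5 0.5 0.25", 64, 64)
print(name, pixel_box)  # Edge-Ring (16.0, 24.0, 48.0, 40.0)
```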
docker-compose.yml ADDED
@@ -0,0 +1,23 @@
+services:
+  backend:
+    build:
+      context: .
+      dockerfile: backend/Dockerfile
+    ports:
+      - "8000:8000"
+    # Ensure the container has access to the environment variables
+    env_file:
+      - ./backend/.env
+    # Optional: Mount the SQLite DB so changes persist
+    volumes:
+      - ./middleware/wafer_control.db:/app/middleware/wafer_control.db
+
+  frontend:
+    build:
+      context: ./frontend
+      dockerfile: Dockerfile
+    ports:
+      # Map the Nginx internal port 80 to 5173 to match the dev environment
+      - "5173:80"
+    depends_on:
+      - backend
frontend/.dockerignore ADDED
@@ -0,0 +1,3 @@
+node_modules/
+dist/
+.env
frontend/.gitignore ADDED
@@ -0,0 +1,24 @@
+# Logs
+logs
+*.log
+npm-debug.log*
+yarn-debug.log*
+yarn-error.log*
+pnpm-debug.log*
+lerna-debug.log*
+
+node_modules
+dist
+dist-ssr
+*.local
+
+# Editor directories and files
+.vscode/*
+!.vscode/extensions.json
+.idea
+.DS_Store
+*.suo
+*.ntvs*
+*.njsproj
+*.sln
+*.sw?
frontend/Assets/Gorilla_Chest_Thumping_Animation_Generated.mp4 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:95a1c3d740e86a5ec1c9ef8f26063000280eb1219d173ecbc2c4ca6c3d5ccbe8
+size 945018
frontend/Assets/freesound_community-monkey-30631.mp3 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4960fd96e5623ecc3f90c262e78e1f63ba7bb01e619cc52104e59a80fbc49c16
+size 429600
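Note on the Git LFS pointer files: because `.gitattributes` routes `*.mp4`, `*.mp3`, and similar files through LFS, the repository stores only a three-line text pointer (spec version, object id, size in bytes) rather than the binary itself. The sketch below reads such a pointer into a dictionary. The function name `parse_lfs_pointer` is hypothetical, and the parser assumes exactly the three `key value` lines shown in this diff.

```python
from typing import Dict


def parse_lfs_pointer(text: str) -> Dict[str, object]:
    """Split a Git LFS pointer into its version, hash algorithm, digest, and size."""
    # Each pointer line is "key value"; split only on the first space.
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file, in bytes
    }


pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:4960fd96e5623ecc3f90c262e78e1f63ba7bb01e619cc52104e59a80fbc49c16
size 429600
"""
info = parse_lfs_pointer(pointer)
print(info["hash_algorithm"], info["size"])  # sha256 429600
```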
frontend/Assets/g.png ADDED

Git LFS Details

  • SHA256: 16967011f43355d0888a71975a277f0d555f0a7ce17b9e473283c8fb9553db72
  • Pointer size: 132 Bytes
  • Size of remote file: 4.83 MB
frontend/Dockerfile ADDED
@@ -0,0 +1,20 @@
+# Build stage
+FROM node:20-alpine AS builder
+
+WORKDIR /app
+
+# Install dependencies
+COPY package*.json ./
+RUN npm install
+
+# Copy source code and build
+COPY . .
+RUN npm run build
+
+# Production stage
+FROM nginx:alpine
+# Copy built assets to Nginx
+COPY --from=builder /app/dist /usr/share/nginx/html
+
+EXPOSE 80
+CMD ["nginx", "-g", "daemon off;"]
frontend/README.md ADDED
@@ -0,0 +1,16 @@
+# React + Vite
+
+This template provides a minimal setup to get React working in Vite with HMR and some ESLint rules.
+
+Currently, two official plugins are available:
+
+- [@vitejs/plugin-react](https://github.com/vitejs/vite-plugin-react/blob/main/packages/plugin-react) uses [Oxc](https://oxc.rs)
+- [@vitejs/plugin-react-swc](https://github.com/vitejs/vite-plugin-react/blob/main/packages/plugin-react-swc) uses [SWC](https://swc.rs/)
+
+## React Compiler
+
+The React Compiler is not enabled on this template because of its impact on dev and build performance. To add it, see [this documentation](https://react.dev/learn/react-compiler/installation).
+
+## Expanding the ESLint configuration
+
+If you are developing a production application, we recommend using TypeScript with type-aware lint rules enabled. Check out the [TS template](https://github.com/vitejs/vite/tree/main/packages/create-vite/template-react-ts) for information on how to integrate TypeScript and [`typescript-eslint`](https://typescript-eslint.io) in your project.
frontend/eslint.config.js ADDED
@@ -0,0 +1,29 @@
+import js from '@eslint/js'
+import globals from 'globals'
+import reactHooks from 'eslint-plugin-react-hooks'
+import reactRefresh from 'eslint-plugin-react-refresh'
+import { defineConfig, globalIgnores } from 'eslint/config'
+
+export default defineConfig([
+  globalIgnores(['dist']),
+  {
+    files: ['**/*.{js,jsx}'],
+    extends: [
+      js.configs.recommended,
+      reactHooks.configs.flat.recommended,
+      reactRefresh.configs.vite,
+    ],
+    languageOptions: {
+      ecmaVersion: 2020,
+      globals: globals.browser,
+      parserOptions: {
+        ecmaVersion: 'latest',
+        ecmaFeatures: { jsx: true },
+        sourceType: 'module',
+      },
+    },
+    rules: {
+      'no-unused-vars': ['error', { varsIgnorePattern: '^[A-Z_]' }],
+    },
+  },
+])
frontend/index.html ADDED
@@ -0,0 +1,16 @@
+<!doctype html>
+<html lang="en">
+  <head>
+    <meta charset="UTF-8" />
+    <link rel="icon" type="image/svg+xml" href="/favicon.svg" />
+    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
+    <link rel="preconnect" href="https://fonts.googleapis.com">
+    <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
+    <link href="https://fonts.googleapis.com/css2?family=Fredoka:wght@400;500;600;700&display=swap" rel="stylesheet">
+    <title>Gorilla Semiconductors</title>
+  </head>
+  <body>
+    <div id="root"></div>
+    <script type="module" src="/src/main.jsx"></script>
+  </body>
+</html>
frontend/package-lock.json ADDED
The diff for this file is too large to render. See raw diff
 
frontend/package.json ADDED
@@ -0,0 +1,31 @@
+{
+  "name": "frontend",
+  "private": true,
+  "version": "0.0.0",
+  "type": "module",
+  "scripts": {
+    "dev": "vite",
+    "build": "vite build",
+    "lint": "eslint .",
+    "preview": "vite preview"
+  },
+  "dependencies": {
+    "axios": "^1.14.0",
+    "chart.js": "^4.5.1",
+    "lucide-react": "^1.7.0",
+    "react": "^19.2.4",
+    "react-chartjs-2": "^5.3.1",
+    "react-dom": "^19.2.4"
+  },
+  "devDependencies": {
+    "@eslint/js": "^9.39.4",
+    "@types/react": "^19.2.14",
+    "@types/react-dom": "^19.2.3",
+    "@vitejs/plugin-react": "^6.0.1",
+    "eslint": "^9.39.4",
+    "eslint-plugin-react-hooks": "^7.0.1",
+    "eslint-plugin-react-refresh": "^0.5.2",
+    "globals": "^17.4.0",
+    "vite": "^8.0.1"
+  }
+}
frontend/public/favicon.svg ADDED
frontend/public/icons.svg ADDED
frontend/src/App.css ADDED
@@ -0,0 +1,184 @@
1
+ .counter {
2
+ font-size: 16px;
3
+ padding: 5px 10px;
4
+ border-radius: 5px;
5
+ color: var(--accent);
6
+ background: var(--accent-bg);
7
+ border: 2px solid transparent;
8
+ transition: border-color 0.3s;
9
+ margin-bottom: 24px;
10
+
11
+ &:hover {
12
+ border-color: var(--accent-border);
13
+ }
14
+ &:focus-visible {
15
+ outline: 2px solid var(--accent);
16
+ outline-offset: 2px;
17
+ }
18
+ }
19
+
20
+ .hero {
21
+ position: relative;
22
+
23
+ .base,
24
+ .framework,
25
+ .vite {
26
+ inset-inline: 0;
27
+ margin: 0 auto;
28
+ }
29
+
30
+ .base {
31
+ width: 170px;
32
+ position: relative;
33
+ z-index: 0;
34
+ }
35
+
36
+ .framework,
37
+ .vite {
38
+ position: absolute;
39
+ }
40
+
41
+ .framework {
42
+ z-index: 1;
43
+ top: 34px;
44
+ height: 28px;
45
+ transform: perspective(2000px) rotateZ(300deg) rotateX(44deg) rotateY(39deg)
46
+ scale(1.4);
47
+ }
48
+
49
+ .vite {
50
+ z-index: 0;
51
+ top: 107px;
52
+ height: 26px;
53
+ width: auto;
54
+ transform: perspective(2000px) rotateZ(300deg) rotateX(40deg) rotateY(39deg)
55
+ scale(0.8);
56
+ }
57
+ }
58
+
59
+ #center {
60
+ display: flex;
61
+ flex-direction: column;
62
+ gap: 25px;
63
+ place-content: center;
64
+ place-items: center;
65
+ flex-grow: 1;
66
+
67
+ @media (max-width: 1024px) {
68
+ padding: 32px 20px 24px;
69
+ gap: 18px;
70
+ }
71
+ }
72
+
73
+ #next-steps {
74
+ display: flex;
75
+ border-top: 1px solid var(--border);
76
+ text-align: left;
77
+
78
+ & > div {
79
+ flex: 1 1 0;
80
+ padding: 32px;
81
+ @media (max-width: 1024px) {
82
+ padding: 24px 20px;
83
+ }
84
+ }
85
+
86
+ .icon {
87
+ margin-bottom: 16px;
88
+ width: 22px;
89
+ height: 22px;
90
+ }
91
+
92
+ @media (max-width: 1024px) {
93
+ flex-direction: column;
94
+ text-align: center;
95
+ }
96
+ }
97
+
98
+ #docs {
99
+ border-right: 1px solid var(--border);
100
+
101
+ @media (max-width: 1024px) {
102
+ border-right: none;
103
+ border-bottom: 1px solid var(--border);
104
+ }
105
+ }
106
+
107
+ #next-steps ul {
108
+ list-style: none;
109
+ padding: 0;
110
+ display: flex;
111
+ gap: 8px;
112
+ margin: 32px 0 0;
113
+
114
+ .logo {
115
+ height: 18px;
116
+ }
117
+
118
+ a {
119
+ color: var(--text-h);
120
+ font-size: 16px;
121
+ border-radius: 6px;
122
+ background: var(--social-bg);
123
+ display: flex;
124
+ padding: 6px 12px;
125
+ align-items: center;
126
+ gap: 8px;
127
+ text-decoration: none;
128
+ transition: box-shadow 0.3s;
129
+
130
+ &:hover {
131
+ box-shadow: var(--shadow);
132
+ }
133
+ .button-icon {
134
+ height: 18px;
135
+ width: 18px;
136
+ }
137
+ }
138
+
139
+ @media (max-width: 1024px) {
140
+ margin-top: 20px;
141
+ flex-wrap: wrap;
142
+ justify-content: center;
143
+
144
+ li {
145
+ flex: 1 1 calc(50% - 8px);
146
+ }
147
+
148
+ a {
149
+ width: 100%;
150
+ justify-content: center;
151
+ box-sizing: border-box;
152
+ }
153
+ }
154
+ }
155
+
156
+ #spacer {
157
+ height: 88px;
158
+ border-top: 1px solid var(--border);
159
+ @media (max-width: 1024px) {
160
+ height: 48px;
161
+ }
162
+ }
163
+
164
+ .ticks {
165
+ position: relative;
166
+ width: 100%;
167
+
168
+ &::before,
169
+ &::after {
170
+ content: '';
171
+ position: absolute;
172
+ top: -4.5px;
173
+ border: 5px solid transparent;
174
+ }
175
+
176
+ &::before {
177
+ left: 0;
178
+ border-left-color: var(--border);
179
+ }
180
+ &::after {
181
+ right: 0;
182
+ border-right-color: var(--border);
183
+ }
184
+ }
frontend/src/App.jsx ADDED
@@ -0,0 +1,68 @@
1
+ import React, { useState } from 'react';
2
+ import './index.css';
3
+ import { HistoricalAnalytics } from './components/HistoricalAnalytics.jsx';
4
+ import { MaterialPredictor } from './components/MaterialPredictor.jsx';
5
+ import { ChatBot } from './components/ChatBot.jsx';
6
+ import logo from '../Assets/g.png';
7
+ import thumpVideo from '../Assets/Gorilla_Chest_Thumping_Animation_Generated.mp4';
8
+
9
+ function App() {
10
+ const [activeTab, setActiveTab] = useState('waste');
11
+ const [showChat, setShowChat] = useState(false);
12
+ const [showVideo, setShowVideo] = useState(false);
13
+
14
+ const handleGorillaClick = () => {
15
+ setShowVideo(true);
16
+ setShowChat(false);
17
+ };
18
+
19
+ return (
20
+ <>
21
+ <div className="dashboard-header">
22
+ <div className="header-title-container">
23
+ <img
24
+ src={logo}
25
+ alt="Gorilla Semiconductors Logo"
26
+ className="gorilla-logo"
27
+ onClick={handleGorillaClick}
28
+ />
29
+ <h1 className="header-title">Gorilla Semiconductors</h1>
30
+ </div>
31
+ </div>
32
+
33
+ <div className="tabs-container">
34
+ <button
35
+ className={`tab-btn ${activeTab === 'waste' ? 'active' : ''}`}
36
+ onClick={() => setActiveTab('waste')}
37
+ >
38
+ Historical Waste Analysis
39
+ </button>
40
+ <button
41
+ className={`tab-btn ${activeTab === 'predict' ? 'active' : ''}`}
42
+ onClick={() => setActiveTab('predict')}
43
+ >
44
+ Material Prediction
45
+ </button>
46
+ </div>
47
+
48
+ <div style={{ flex: 1, minHeight: 0, display: 'flex', flexDirection: 'column' }}>
49
+ {activeTab === 'waste' ? <HistoricalAnalytics /> : <MaterialPredictor />}
50
+ </div>
51
+
52
+ <ChatBot isOpen={showChat} onClose={() => setShowChat(false)} />
53
+
54
+ {showVideo && (
55
+ <div className="thump-video-overlay" onClick={() => { setShowVideo(false); setShowChat(true); }}>
56
+ <video
57
+ src={thumpVideo}
58
+ autoPlay
59
+ className="thump-video"
60
+ onEnded={() => { setShowVideo(false); setShowChat(true); }}
61
+ />
62
+ </div>
63
+ )}
64
+ </>
65
+ );
66
+ }
67
+
68
+ export default App;
frontend/src/apiConfig.js ADDED
@@ -0,0 +1,3 @@
1
+ const API_BASE_URL = import.meta.env.VITE_API_URL || 'http://localhost:8000';
2
+
3
+ export default API_BASE_URL;
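Note on `apiConfig.js`: Vite exposes `VITE_`-prefixed variables on `import.meta.env` and replaces them statically at build time, so the `|| 'http://localhost:8000'` fallback applies whenever `VITE_API_URL` is unset during the build. The following is a minimal sketch of the same resolution logic as a plain function that can run outside Vite. The helper names `resolveApiBaseUrl` and `apiUrl`, the trailing-slash normalization, and the `example.invalid` host are assumptions for illustration and are not part of this commit.

```javascript
// Hypothetical helper: mirrors the `VITE_API_URL || fallback` logic in apiConfig.js,
// but takes the env object as a parameter so it can be exercised without Vite.
function resolveApiBaseUrl(env, fallback = 'http://localhost:8000') {
  const raw = (env && env.VITE_API_URL) || fallback;
  // Drop trailing slashes so `${base}/api/kpi` never contains a double slash.
  return raw.replace(/\/+$/, '');
}

// Hypothetical helper: joins the base URL with an endpoint path.
function apiUrl(base, path) {
  return `${base}/${path.replace(/^\/+/, '')}`;
}

const configured = resolveApiBaseUrl({ VITE_API_URL: 'https://example.invalid/' });
console.log(apiUrl(configured, '/api/kpi'));

// With no variable set, the localhost fallback is used.
console.log(apiUrl(resolveApiBaseUrl({}), 'api/chat'));
```

Normalizing the slash in one place would let components keep writing `${API_BASE_URL}/api/...` regardless of how the deployment URL is configured.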
frontend/src/assets/hero.png ADDED

Git LFS Details

  • SHA256: 72a860570eddf1dd9988f26c7106c67be286bc9f2fd3303c465ce87edb1ae6cd
  • Pointer size: 130 Bytes
  • Size of remote file: 44.9 kB
frontend/src/assets/react.svg ADDED
frontend/src/assets/vite.svg ADDED
frontend/src/components/ChatBot.jsx ADDED
@@ -0,0 +1,131 @@
1
+ import React, { useState, useRef, useEffect } from 'react';
2
+ import axios from 'axios';
3
+
4
+ import API_BASE_URL from '../apiConfig';
5
+
6
+ export const ChatBot = ({ isOpen, onClose }) => {
7
+ const [messages, setMessages] = useState([]);
8
+ const [input, setInput] = useState('');
9
+ const [isLoading, setIsLoading] = useState(false);
10
+ const messagesEndRef = useRef(null);
11
+
12
+ // Dragging state
13
+ const [position, setPosition] = useState({
14
+ x: typeof window !== 'undefined' ? window.innerWidth - 400 : 0,
15
+ y: typeof window !== 'undefined' ? window.innerHeight - 650 : 0
16
+ });
17
+ const [isDragging, setIsDragging] = useState(false);
18
+ const dragRef = useRef({ startX: 0, startY: 0, initialX: 0, initialY: 0 });
19
+
20
+ const scrollToBottom = () => {
21
+ messagesEndRef.current?.scrollIntoView({ behavior: 'smooth' });
22
+ };
23
+
24
+ useEffect(() => {
25
+ scrollToBottom();
26
+ }, [messages]);
27
+
28
+ // Handle Dragging
29
+ const handlePointerDown = (e) => {
30
+ // Don't drag if clicking the close button
31
+ if (e.target.tagName.toLowerCase() === 'button') return;
32
+ setIsDragging(true);
33
+ dragRef.current = {
34
+ startX: e.clientX,
35
+ startY: e.clientY,
36
+ initialX: position.x,
37
+ initialY: position.y
38
+ };
39
+ e.currentTarget.setPointerCapture(e.pointerId);
40
+ };
41
+
42
+ const handlePointerMove = (e) => {
43
+ if (!isDragging) return;
44
+ const dx = e.clientX - dragRef.current.startX;
45
+ const dy = e.clientY - dragRef.current.startY;
46
+ setPosition({
47
+ x: dragRef.current.initialX + dx,
48
+ y: dragRef.current.initialY + dy
49
+ });
50
+ };
51
+
52
+ const handlePointerUp = (e) => {
53
+ setIsDragging(false);
54
+ e.currentTarget.releasePointerCapture(e.pointerId);
55
+ };
56
+
57
+ const handleSend = async (e) => {
58
+ e.preventDefault();
59
+ if (!input.trim()) return;
60
+
61
+ const userMessage = { role: 'user', content: input };
62
+ setMessages(prev => [...prev, userMessage]);
63
+ setInput('');
64
+ setIsLoading(true);
65
+
66
+ try {
67
+ const response = await axios.post(`${API_BASE_URL}/api/chat`, {
68
+ messages: [...messages, userMessage].map(m => ({ role: m.role, content: m.content }))
69
+ });
70
+
71
+ if (response.data.response) {
72
+ setMessages(prev => [...prev, { role: 'model', content: response.data.response }]);
73
+ } else if (response.data.error) {
74
+ setMessages(prev => [...prev, { role: 'model', content: `Error: ${response.data.error}` }]);
75
+ }
76
+ } catch (error) {
77
+ console.error('Chat error:', error);
78
+ setMessages(prev => [...prev, { role: 'model', content: "GRRR... I couldn't reach the server. Is it running?" }]);
79
+ } finally {
80
+ setIsLoading(false);
81
+ }
82
+ };
83
+
84
+ if (!isOpen) return null;
85
+
86
+ return (
87
+ <div
88
+ className="chat-widget-container"
89
+ style={{ left: position.x, top: position.y, margin: 0 }}
90
+ >
91
+ <div
92
+ className="chat-header"
93
+ onPointerDown={handlePointerDown}
94
+ onPointerMove={handlePointerMove}
95
+ onPointerUp={handlePointerUp}
96
+ >
97
+ <h3>🦍 Gorilla Bot</h3>
98
+ <button className="chat-close-btn" onClick={onClose}>×</button>
99
+ </div>
100
+ <div className="chat-messages">
101
+ {messages.map((msg, idx) => (
102
+ <div key={idx} className={`chat-message ${msg.role}`}>
103
+ <div className="chat-bubble">
104
+ {msg.content}
105
+ </div>
106
+ </div>
107
+ ))}
108
+ {isLoading && (
109
+ <div className="chat-message model">
110
+ <div className="chat-bubble loading">
111
+ Thinking... 🍌
112
+ </div>
113
+ </div>
114
+ )}
115
+ <div ref={messagesEndRef} />
116
+ </div>
117
+ <form className="chat-input-area" onSubmit={handleSend}>
118
+ <input
119
+ type="text"
120
+ value={input}
121
+ onChange={(e) => setInput(e.target.value)}
122
+ placeholder="Ask about defects, KPIs, or forecasts..."
123
+ className="chat-input"
124
+ />
125
+ <button type="submit" className="chat-send-btn" disabled={isLoading}>
126
+ Send
127
+ </button>
128
+ </form>
129
+ </div>
130
+ );
131
+ };
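Note on the drag handling in `ChatBot.jsx`: the component adds the pointer delta to the position recorded at pointer-down and applies the result directly, so the widget can be dragged partly or fully outside the viewport. The sketch below shows the same delta arithmetic as a pure function with optional clamping. The function name `nextDragPosition`, the `viewport` and `widget` parameters, and the clamping itself are assumptions for illustration; the committed component does not clamp.

```javascript
// Hypothetical pure helper. `drag` has the same shape as dragRef.current in
// ChatBot.jsx: { startX, startY, initialX, initialY }.
function nextDragPosition(drag, pointer, viewport, widget) {
  // Same arithmetic as handlePointerMove: initial position plus pointer delta.
  const x = drag.initialX + (pointer.clientX - drag.startX);
  const y = drag.initialY + (pointer.clientY - drag.startY);

  // Assumed addition: keep the widget fully inside the viewport.
  const maxX = Math.max(0, viewport.width - widget.width);
  const maxY = Math.max(0, viewport.height - widget.height);
  return {
    x: Math.min(Math.max(x, 0), maxX),
    y: Math.min(Math.max(y, 0), maxY),
  };
}

const drag = { startX: 100, startY: 100, initialX: 50, initialY: 60 };
const viewport = { width: 1000, height: 800 };
const widget = { width: 360, height: 600 }; // default size from index.css

console.log(nextDragPosition(drag, { clientX: 130, clientY: 90 }, viewport, widget));
console.log(nextDragPosition(drag, { clientX: 2000, clientY: -500 }, viewport, widget));
```

Keeping the arithmetic in a pure function would also make it testable without rendering the component.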
frontend/src/components/HistoricalAnalytics.jsx ADDED
@@ -0,0 +1,150 @@
1
+ import React, { useState, useEffect } from 'react';
2
+ import axios from 'axios';
3
+ import { Chart as ChartJS, ArcElement, Tooltip, Legend, CategoryScale, LinearScale, PointElement, LineElement, BarElement } from 'chart.js';
4
+ import { Pie, Bar, Line } from 'react-chartjs-2';
5
+ import KPICard from './KPICard';
6
+
7
+ import API_BASE_URL from '../apiConfig';
8
+
9
+ ChartJS.register(ArcElement, Tooltip, Legend, CategoryScale, LinearScale, PointElement, LineElement, BarElement);
10
+
11
+ const COLORS = {
12
+ accent: '#f472b6',
13
+ accent2: '#38bdf8',
14
+ accent3: '#4ade80',
15
+ danger: '#fb7185',
16
+ warning: '#fbbf24',
17
+ text: '#000000',
18
+ textMuted: '#3f3f46'
19
+ };
20
+
21
+ const DEFECT_COLORS = {
22
+ 'Center': '#ef4444', 'Donut': '#f59e0b', 'Edge-Loc': '#10b981',
23
+ 'Edge-Ring': '#3b82f6', 'Loc': '#8b5cf6', 'Random': '#ec4899',
24
+ 'Scratch': '#06b6d4', 'Near-full': '#f97316', 'None': '#6b7280',
25
+ 'Undetected': '#374151'
26
+ };
27
+
28
+ const chartOptions = {
29
+ color: COLORS.text,
30
+ plugins: {
31
+ legend: {
32
+ labels: { color: COLORS.textMuted }
33
+ }
34
+ },
35
+ scales: {
36
+ x: { ticks: { color: COLORS.textMuted }, grid: { color: 'rgba(0,0,0,0.1)' } },
37
+ y: { ticks: { color: COLORS.textMuted }, grid: { color: 'rgba(0,0,0,0.1)' } }
38
+ }
39
+ };
40
+ const pieOptions = {
41
+ color: COLORS.text,
42
+ plugins: { legend: { labels: { color: COLORS.textMuted } } }
43
+ };
44
+
45
+ export const HistoricalAnalytics = () => {
46
+ const [kpis, setKpis] = useState(null);
47
+ const [defects, setDefects] = useState(null);
48
+ const [waste, setWaste] = useState(null);
49
+ const [trends, setTrends] = useState(null);
50
+
51
+ useEffect(() => {
52
+ const fetchData = async () => {
53
+ const [kRes, dRes, wRes, tRes] = await Promise.all([
54
+ axios.get(`${API_BASE_URL}/api/kpi`),
55
+ axios.get(`${API_BASE_URL}/api/charts/defects`),
56
+ axios.get(`${API_BASE_URL}/api/charts/waste`),
57
+ axios.get(`${API_BASE_URL}/api/charts/trends`)
58
+ ]);
59
+ setKpis(kRes.data);
60
+ setDefects(dRes.data);
61
+ setWaste(wRes.data);
62
+ setTrends(tRes.data);
63
+ };
64
+ fetchData().catch((err) => console.error('Failed to load analytics:', err));
65
+ }, []);
66
+
67
+ if (!kpis || !defects || !waste || !trends) return <div>Loading Analytics...</div>;
68
+
69
+ const pieData = {
70
+ labels: defects.predictions.map(d => d.defect_type),
71
+ datasets: [{
72
+ data: defects.predictions.map(d => d.count),
73
+ backgroundColor: defects.predictions.map(d => DEFECT_COLORS[d.defect_type] || COLORS.textMuted),
74
+ borderColor: 'transparent'
75
+ }]
76
+ };
77
+
78
+ const barData = {
79
+ labels: waste.waste_by_type.map(w => w.defect_type),
80
+ datasets: [{
81
+ label: 'Total Material Waste (Wafers)',
82
+ data: waste.waste_by_type.map(w => w.total_waste),
83
+ backgroundColor: waste.waste_by_type.map(w => DEFECT_COLORS[w.defect_type] || COLORS.textMuted)
84
+ }]
85
+ };
86
+
87
+ const trendData = {
88
+ labels: trends.dates,
89
+ datasets: [{
90
+ label: 'Fail Rate %',
91
+ data: trends.fail_rate,
92
+ borderColor: COLORS.danger,
93
+ backgroundColor: 'rgba(251, 113, 133, 0.4)',
94
+ fill: true,
95
+ yAxisID: 'y'
96
+ }]
97
+ };
98
+
99
+ const wasteTrendData = {
100
+ labels: trends.dates,
101
+ datasets: [{
102
+ label: 'Total Lost Wafers',
103
+ data: trends.waste,
104
+ borderColor: COLORS.warning,
105
+ backgroundColor: 'rgba(251, 191, 36, 0.4)',
106
+ fill: true,
107
+ yAxisID: 'y'
108
+ }]
109
+ };
110
+
111
+ return (
112
+ <div style={{ display: 'flex', flexDirection: 'column', height: '100%' }}>
113
+ <div className="kpi-container">
114
+ <KPICard title="Total Scans" value={kpis.total_scans.toLocaleString()} subtitle="wafers inspected" color={COLORS.accent} />
115
+ <KPICard title="Pass Rate" value={`${kpis.pass_rate}%`} subtitle={`${kpis.pass_count.toLocaleString()} passed`} color={COLORS.accent3} />
116
+ <KPICard title="Fail Rate" value={`${kpis.fail_rate}%`} subtitle={`${kpis.fail_count.toLocaleString()} defective`} color={COLORS.danger} />
117
+ <KPICard title="Scrapped" value={kpis.scrap_count.toLocaleString()} subtitle="routed to scrap" color={COLORS.warning} />
118
+ <KPICard title="Avg Waste/Wafer" value={`${kpis.avg_waste}%`} subtitle="per defective wafer" color={COLORS.danger} />
119
+ <KPICard title="Avg Confidence" value={kpis.avg_confidence} subtitle="model certainty" color={COLORS.accent3} />
120
+ </div>
121
+
122
+ <div className="charts-master-grid">
123
+ <div className="glass-card chart-card">
124
+ <h3 style={{marginTop:0, marginBottom: '8px', fontSize: '14px'}}>YOLOv8 Predicted Distributions</h3>
125
+ <div className="canvas-container">
126
+ <Pie data={pieData} options={{...pieOptions, maintainAspectRatio: false}} />
127
+ </div>
128
+ </div>
129
+ <div className="glass-card chart-card">
130
+ <h3 style={{marginTop:0, marginBottom: '8px', fontSize: '14px'}}>Total Material Waste by Predicted Defect</h3>
131
+ <div className="canvas-container">
132
+ <Bar data={barData} options={{...chartOptions, maintainAspectRatio: false}} />
133
+ </div>
134
+ </div>
135
+ <div className="glass-card chart-card">
136
+ <h3 style={{marginTop:0, marginBottom: '8px', fontSize: '14px'}}>Daily Defect Rate Over Time</h3>
137
+ <div className="canvas-container">
138
+ <Line data={trendData} options={{...chartOptions, maintainAspectRatio: false}} />
139
+ </div>
140
+ </div>
141
+ <div className="glass-card chart-card">
142
+ <h3 style={{marginTop:0, marginBottom: '8px', fontSize: '14px'}}>Daily Material Waste Over Time</h3>
143
+ <div className="canvas-container">
144
+ <Line data={wasteTrendData} options={{...chartOptions, maintainAspectRatio: false}} />
145
+ </div>
146
+ </div>
147
+ </div>
148
+ </div>
149
+ );
150
+ };
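Note on the chart data in `HistoricalAnalytics.jsx`: `pieData` and `barData` repeat the same three `map` calls over rows keyed by `defect_type`. The sketch below factors that mapping into one function. The name `toChartData`, its options object, and the sample rows are assumptions for illustration. The row shapes (`{ defect_type, count }` and `{ defect_type, total_waste }`) are inferred from the property accesses in the component, not from backend documentation.

```javascript
// Subset of the DEFECT_COLORS map from HistoricalAnalytics.jsx.
const DEFECT_COLORS = { Center: '#ef4444', Donut: '#f59e0b', Scratch: '#06b6d4' };

// Hypothetical helper: builds a Chart.js `data` object from rows keyed by defect_type.
function toChartData(rows, valueKey, { label, fallbackColor = '#3f3f46' } = {}) {
  const dataset = {
    data: rows.map((r) => r[valueKey]),
    // Unknown defect types fall back to a neutral color, as in the component.
    backgroundColor: rows.map((r) => DEFECT_COLORS[r.defect_type] || fallbackColor),
  };
  if (label) dataset.label = label;
  return { labels: rows.map((r) => r.defect_type), datasets: [dataset] };
}

// Made-up sample rows, shaped like the inferred API responses.
const predictions = [
  { defect_type: 'Center', count: 3 },
  { defect_type: 'Unlisted', count: 1 },
];
const wasteByType = [{ defect_type: 'Scratch', total_waste: 12.5 }];

console.log(JSON.stringify(toChartData(predictions, 'count')));
console.log(JSON.stringify(toChartData(wasteByType, 'total_waste', {
  label: 'Total Material Waste (Wafers)',
})));
```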
frontend/src/components/KPICard.jsx ADDED
@@ -0,0 +1,14 @@
1
+ import React from 'react';
2
+ import '../index.css';
3
+
4
+ const KPICard = ({ title, value, subtitle, color }) => {
5
+ return (
6
+ <div className="kpi-card">
7
+ <p className="kpi-title">{title}</p>
8
+ <h2 className="kpi-value" style={{ color: color }}>{value}</h2>
9
+ {subtitle && <p className="kpi-subtitle">{subtitle}</p>}
10
+ </div>
11
+ );
12
+ };
13
+
14
+ export default KPICard;
frontend/src/components/MaterialPredictor.jsx ADDED
@@ -0,0 +1,73 @@
1
+ import React, { useState, useEffect } from 'react';
2
+ import axios from 'axios';
3
+ import KPICard from './KPICard';
4
+
5
+ import API_BASE_URL from '../apiConfig';
6
+
7
+ export const MaterialPredictor = () => {
8
+ const [scans, setScans] = useState(1300);
9
+ const [failRate, setFailRate] = useState(97);
10
+ const [prediction, setPrediction] = useState(null);
11
+ const [modelStatus, setModelStatus] = useState(null);
12
+
13
+ useEffect(() => {
14
+ axios.get(`${API_BASE_URL}/api/model/status`).then(res => {
15
+ setModelStatus(res.data);
16
+ }).catch(() => setModelStatus({loaded: false}));
17
+ }, []);
18
+
19
+ const handlePredict = async () => {
20
+ try {
21
+ const res = await axios.post(`${API_BASE_URL}/api/predict`, { scans, fail_rate: failRate });
22
+ setPrediction(res.data);
23
+ } catch(e) {
24
+ console.error(e);
25
+ }
26
+ };
27
+
28
+ if (!modelStatus) return <div>Loading...</div>;
29
+
30
+ return (
31
+ <div>
32
+ <div className="glass-card predictor-header">
33
+ <div>
34
+ <h3 className="predictor-title">Prediction Model</h3>
35
+ <p style={{margin:0, color: modelStatus.loaded ? 'var(--accent3)' : 'var(--danger)'}}>
36
+ {modelStatus.loaded ? 'Model loaded' : 'No model found'}
37
+ </p>
38
+ </div>
39
+ {modelStatus.loaded && (
40
+ <p style={{color: 'var(--text-muted)', fontFamily:'monospace'}}>
41
+ R² = {modelStatus.metrics.r2} | MAE = {modelStatus.metrics.mae}%
42
+ </p>
43
+ )}
44
+ </div>
45
+
46
+ <div className="glass-card">
47
+ <h3 className="predictor-title" style={{marginBottom: '20px'}}>Forecast Parameters</h3>
48
+ <div className="predictor-form">
49
+ <div className="slider-container">
50
+ <label>Expected Daily Production ({scans} wafers)</label>
51
+ <input type="range" min="100" max="2000" step="50" value={scans} onChange={(e) => setScans(parseInt(e.target.value, 10))} />
52
+ </div>
53
+ <div className="slider-container">
54
+ <label>Expected Defect Rate ({failRate}%)</label>
55
+ <input type="range" min="0" max="100" step="5" value={failRate} onChange={(e) => setFailRate(parseInt(e.target.value, 10))} />
56
+ </div>
57
+ </div>
58
+ <button className="btn-primary" onClick={handlePredict}>Predict Material Needs</button>
59
+ </div>
60
+
61
+ {prediction && (
62
+ <div className="glass-card prediction-result">
63
+ <div className="kpi-container" style={{marginBottom: 0}}>
64
+ <KPICard title="Daily Production" value={prediction.total_scans} subtitle="wafers" color="var(--accent)" />
65
+ <KPICard title="Expected Defect Rate" value={`${prediction.fail_rate}%`} subtitle={`~${Math.round((prediction.total_scans*prediction.fail_rate)/100)} defective`} color="var(--danger)" />
66
+ <KPICard title="Avg Waste/Wafer" value={`${prediction.avg_waste_per_wafer}%`} subtitle="per defective wafer loss" color="var(--warning)" />
67
+ <KPICard title="Total Daily Waste" value={`${prediction.total_daily_waste} wafers`} subtitle="total estimated loss" color="var(--danger)" />
68
+ </div>
69
+ </div>
70
+ )}
71
+ </div>
72
+ );
73
+ };
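Note on the "Expected Defect Rate" card in `MaterialPredictor.jsx`: the subtitle estimates the defective count as `Math.round((total_scans * fail_rate) / 100)`. The sketch below works that arithmetic through for the component's default slider values (1300 wafers, 97%). The helper name `estimateDefective` and its input check are assumptions for illustration and are not part of this commit.

```javascript
// Hypothetical helper mirroring the subtitle arithmetic in MaterialPredictor.jsx.
function estimateDefective(totalScans, failRatePct) {
  if (!Number.isFinite(totalScans) || !Number.isFinite(failRatePct)) {
    throw new TypeError('totalScans and failRatePct must be finite numbers');
  }
  return Math.round((totalScans * failRatePct) / 100);
}

// Range inputs report their value as a string; parse with an explicit radix.
const scans = parseInt('1300', 10);
const failRate = parseInt('97', 10);

// 1300 * 97 / 100 = 1261
console.log(estimateDefective(scans, failRate)); // 1261
```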
frontend/src/index.css ADDED
@@ -0,0 +1,490 @@
1
+ :root {
2
+ --bg: #fbbf24; /* Banana Yellow */
3
+ --card: #ffffff;
4
+ --card-border: #000000;
5
+ --accent: #f472b6; /* Bubblegum Pink */
6
+ --accent2: #38bdf8; /* Sky Blue */
7
+ --accent3: #4ade80; /* Jungle Green */
8
+ --danger: #fb7185;
9
+ --warning: #fbcfe8;
10
+ --text: #000000;
11
+ --text-muted: #3f3f46;
12
+ --font-family: 'Fredoka', sans-serif;
13
+ }
14
+
15
+ body {
16
+ margin: 0;
17
+ padding: 0;
18
+ background-color: var(--bg);
19
+ color: var(--text);
20
+ font-family: var(--font-family);
21
+ -webkit-font-smoothing: antialiased;
22
+ -moz-osx-font-smoothing: grayscale;
23
+ }
24
+
25
+ #root {
26
+ height: 100vh;
27
+ padding: 12px 16px;
28
+ box-sizing: border-box;
29
+ display: flex;
30
+ flex-direction: column;
31
+ overflow: hidden;
32
+ }
33
+
34
+ .dashboard-header {
35
+ display: flex;
36
+ flex-direction: column;
37
+ margin-bottom: 12px;
38
+ border-bottom: 4px solid var(--card-border);
39
+ padding-bottom: 8px;
40
+ flex-shrink: 0;
41
+ background-color: var(--card);
42
+ border-radius: 12px;
43
+ padding: 8px 16px;
44
+ box-shadow: 4px 4px 0px #000000;
45
+ border: 3px solid #000000;
46
+ }
47
+
48
+ .header-title-container {
49
+ display: flex;
50
+ align-items: center;
51
+ gap: 12px;
52
+ }
53
+
54
+ .header-dot {
55
+ width: 12px;
56
+ height: 12px;
57
+ border-radius: 50%;
58
+ background-color: var(--accent3);
59
+ box-shadow: 0 0 8px var(--accent3);
60
+ animation: pulse 2s infinite;
61
+ }
62
+
63
+ @keyframes pulse {
64
+ 0% { box-shadow: 0 0 8px var(--accent3); }
65
+ 50% { box-shadow: 0 0 16px var(--accent3); }
66
+ 100% { box-shadow: 0 0 8px var(--accent3); }
67
+ }
68
+
69
+ .header-title {
70
+ margin: 0;
71
+ font-size: 26px;
72
+ font-weight: 700;
73
+ color: var(--text);
74
+ letter-spacing: 1px;
75
+ }
76
+
77
+
78
+ .gorilla-logo {
79
+ height: 48px;
80
+ width: auto;
81
+ border-radius: 8px;
82
+ object-fit: contain;
83
+ cursor: pointer;
84
+ transform-origin: center bottom;
85
+ transition: transform 0.2s cubic-bezier(0.34, 1.56, 0.64, 1);
86
+ }
87
+
88
+ .gorilla-logo:hover {
89
+ transform: scale(1.15);
90
+ }
91
+
92
+ @keyframes fadeInOverlay {
93
+ 0% { opacity: 0; }
94
+ 100% { opacity: 1; }
95
+ }
96
+
97
+ @keyframes fadeInScale {
98
+ 0% { opacity: 0; transform: scale(0.5); }
99
+ 100% { opacity: 1; transform: scale(1); }
100
+ }
101
+
102
+ .thump-video-overlay {
103
+ position: fixed;
104
+ top: 0;
105
+ left: 0;
106
+ width: 100vw;
107
+ height: 100vh;
108
+ background: rgba(0, 0, 0, 0.5);
109
+ display: flex;
110
+ justify-content: center;
111
+ align-items: center;
112
+ z-index: 9999;
113
+ animation: fadeInOverlay 0.4s ease-out forwards;
114
+ }
115
+
116
+ .thump-video {
117
+ width: 250px;
118
+ height: auto;
119
+ border-radius: 20px;
120
+ border: 6px solid #000000;
121
+ box-shadow: 8px 8px 0px #000000;
122
+ animation: fadeInScale 0.4s cubic-bezier(0.34, 1.56, 0.64, 1) forwards;
123
+ }
124
+
125
+ .header-subtitle {
126
+ color: var(--text-muted);
127
+ margin-top: 4px;
128
+ font-size: 13px;
129
+ }
130
+
131
+ /* Tabs */
132
+ .tabs-container {
133
+ display: flex;
134
+ gap: 12px;
135
+ margin-bottom: 16px;
136
+ flex-shrink: 0;
137
+ }
138
+
139
+ .tab-btn {
140
+ background-color: var(--card);
141
+ color: var(--text);
142
+ border: 3px solid var(--card-border);
143
+ border-bottom: none;
144
+ border-radius: 16px 16px 0 0;
145
+ padding: 12px 24px;
146
+ font-weight: 600;
147
+ font-size: 16px;
148
+ cursor: pointer;
149
+ transition: all 0.2s cubic-bezier(0.34, 1.56, 0.64, 1);
150
+ outline: none;
151
+ font-family: var(--font-family);
152
+ box-shadow: inset 0 -4px 0 rgba(0,0,0,0.1);
153
+ }
154
+
155
+ .tab-btn:hover {
156
+ background-color: var(--accent);
157
+ transform: translateY(-2px);
158
+ }
159
+
160
+ .tab-btn.active {
161
+ background-color: var(--accent2);
162
+ color: #000000;
163
+ border-color: #000000;
164
+ }
165
+
166
+ /* KPI Container */
167
+ .kpi-container {
168
+ display: flex;
169
+ gap: 8px;
170
+ flex-wrap: nowrap;
171
+ margin-bottom: 16px;
172
+ flex-shrink: 0;
173
+ }
174
+
175
+ /* KPI Card */
176
+ .kpi-card {
177
+ background: var(--card);
178
+ border: 3px solid var(--card-border);
179
+ border-radius: 16px;
180
+ padding: 10px;
181
+ text-align: center;
182
+ flex: 1;
183
+ min-width: 0;
184
+ transition: all 0.2s cubic-bezier(0.34, 1.56, 0.64, 1);
185
+ box-shadow: 4px 4px 0px #000000;
186
+ }
187
+
188
+ .kpi-card:hover {
189
+ transform: scale(1.05) rotate(-2deg);
190
+ box-shadow: 8px 8px 0px #000000;
191
+ background-color: var(--warning);
192
+ }
193
+
194
+ .kpi-title {
195
+ color: var(--text-muted);
196
+ font-size: 11px;
197
+ margin-bottom: 2px;
198
+ font-weight: 500;
199
+ text-transform: uppercase;
200
+ letter-spacing: 1px;
201
+ }
202
+
203
+ .kpi-value {
204
+ font-size: 20px;
205
+ font-weight: 700;
206
+ margin: 4px 0;
207
+ }
208
+
209
+ .kpi-subtitle {
210
+ color: var(--text-muted);
211
+ font-size: 10px;
212
+ margin-top: 2px;
213
+ }
214
+
215
+ /* Base Card */
216
+ .glass-card {
217
+ background: var(--card);
218
+ border: 3px solid var(--card-border);
219
+ border-radius: 16px;
220
+ padding: 12px;
221
+ margin-bottom: 0;
222
+ box-shadow: 6px 6px 0px #000000;
223
+ transition: transform 0.2s cubic-bezier(0.34, 1.56, 0.64, 1);
224
+ }
225
+ .glass-card h3 {
226
+ font-weight: 700;
227
+ letter-spacing: 0.5px;
228
+ }
229
+
230
+ /* Charts Grid */
231
+ .charts-master-grid {
232
+ display: grid;
233
+ grid-template-columns: 1fr 1fr;
234
+ grid-template-rows: minmax(0, 1fr) minmax(0, 1fr);
235
+ gap: 12px;
236
+ flex: 1;
237
+ min-height: 0;
238
+ }
239
+
240
+ .chart-card {
241
+ display: flex;
242
+ flex-direction: column;
243
+ }
244
+
245
+ .canvas-container {
246
+ flex: 1;
247
+ min-height: 0;
248
+ position: relative;
249
+ }
250
+
251
+ @media (max-width: 900px) {
252
+ .charts-master-grid {
253
+ grid-template-columns: 1fr;
254
+ grid-template-rows: auto;
255
+ overflow-y: auto;
256
+ }
257
+ #root {
258
+ height: auto;
259
+ overflow: visible;
260
+ }
261
+ }
262
+
263
+ /* Material Predictor */
264
+ .predictor-header {
265
+ display: flex;
266
+ justify-content: space-between;
267
+ align-items: center;
268
+ }
269
+
270
+ .predictor-title {
271
+ margin: 0 0 4px 0;
272
+ font-size: 18px;
273
+ }
274
+
275
+ .predictor-form {
276
+ display: grid;
277
+ grid-template-columns: 1fr 1fr;
278
+ gap: 24px;
279
+ margin-bottom: 20px;
280
+ }
281
+
282
+ .slider-container label {
283
+ display: block;
284
+ color: var(--text-muted);
285
+ font-size: 13px;
286
+ margin-bottom: 8px;
287
+ }
288
+
289
+ .slider-container input[type="range"] {
290
+ width: 100%;
291
+ accent-color: var(--accent2);
292
+ }
293
+
294
+ .btn-primary {
295
+ background-color: var(--accent);
296
+ color: #000000;
297
+ border: 3px solid #000000;
298
+ border-radius: 16px;
299
+ padding: 12px 32px;
300
+ font-size: 18px;
301
+ font-weight: 700;
302
+ cursor: pointer;
303
+ width: 100%;
304
+ transition: all 0.2s cubic-bezier(0.34, 1.56, 0.64, 1);
305
+ box-shadow: 4px 4px 0px #000000;
306
+ font-family: var(--font-family);
307
+ }
308
+
309
+ .btn-primary:hover {
310
+ background-color: var(--accent2);
311
+ transform: translate(-2px, -2px);
312
+ box-shadow: 6px 6px 0px #000000;
313
+ }
314
+ .btn-primary:active {
315
+ transform: translate(2px, 2px);
316
+ box-shadow: 0px 0px 0px #000000;
317
+ }
318
+
319
+ .prediction-result {
320
+ background: var(--accent3);
321
+ }
322
+
323
+ /* ChatBot Styles */
324
+ .chat-modal-overlay {
325
+ position: fixed;
326
+ top: 0;
327
+ left: 0;
328
+ width: 100vw;
329
+ height: 100vh;
330
+ background: rgba(0, 0, 0, 0.4);
331
+ display: flex;
332
+ justify-content: center;
333
+ align-items: center;
334
+ z-index: 10000;
335
+ animation: fadeInOverlay 0.2s ease-out forwards;
336
+ }
337
+
338
+ .chat-widget-container {
339
+ position: fixed;
340
+ width: 360px;
341
+ height: 600px;
342
+ min-width: 250px;
343
+ min-height: 300px;
344
+ background: var(--card);
345
+ border: 4px solid #000000;
346
+ border-radius: 16px;
347
+ box-shadow: 12px 12px 0px #000000;
348
+ display: flex;
349
+ flex-direction: column;
350
+ overflow: hidden;
351
+ z-index: 10000;
352
+ resize: both;
353
+ }
354
+
355
+ .chat-header {
356
+ background: var(--accent2);
357
+ padding: 16px;
358
+ border-bottom: 4px solid #000000;
359
+ display: flex;
360
+   justify-content: space-between;
+   align-items: center;
+   cursor: move;
+   user-select: none;
+ }
+
+ .chat-header h3 {
+   margin: 0;
+   font-size: 20px;
+   font-weight: 800;
+ }
+
+ .chat-close-btn {
+   background: var(--card);
+   border: 3px solid #000000;
+   border-radius: 50%;
+   width: 32px;
+   height: 32px;
+   font-size: 20px;
+   font-weight: bold;
+   cursor: pointer;
+   display: flex;
+   justify-content: center;
+   align-items: center;
+   transition: transform 0.2s;
+ }
+
+ .chat-close-btn:hover {
+   background: var(--danger);
+   transform: scale(1.1) rotate(90deg);
+ }
+
+ .chat-messages {
+   flex: 1;
+   padding: 16px;
+   overflow-y: auto;
+   display: flex;
+   flex-direction: column;
+   gap: 12px;
+   background: #e0f2fe;
+ }
+
+ .chat-message {
+   display: flex;
+   width: 100%;
+ }
+
+ .chat-message.user {
+   justify-content: flex-end;
+ }
+
+ .chat-message.model {
+   justify-content: flex-start;
+ }
+
+ .chat-bubble {
+   max-width: 80%;
+   padding: 12px 16px;
+   border: 3px solid #000000;
+   border-radius: 12px;
+   font-size: 15px;
+   line-height: 1.4;
+   box-shadow: 4px 4px 0px #000000;
+   white-space: pre-wrap;
+ }
+
+ .chat-message.user .chat-bubble {
+   background: var(--accent);
+   border-bottom-right-radius: 0;
+ }
+
+ .chat-message.model .chat-bubble {
+   background: var(--card);
+   border-bottom-left-radius: 0;
+ }
+
+ .chat-bubble.loading {
+   font-style: italic;
+   color: var(--text-muted);
+ }
+
+ .chat-input-area {
+   display: flex;
+   padding: 12px;
+   background: var(--card);
+   border-top: 4px solid #000000;
+   gap: 8px;
+ }
+
+ .chat-input {
+   flex: 1;
+   padding: 12px;
+   border: 3px solid #000000;
+   border-radius: 8px;
+   font-size: 16px;
+   font-family: var(--font-family);
+   outline: none;
+ }
+
+ .chat-input:focus {
+   border-color: var(--accent2);
+ }
+
+ .chat-send-btn {
+   background: var(--accent3);
+   color: #000000;
+   border: 3px solid #000000;
+   border-radius: 8px;
+   padding: 0 24px;
+   font-weight: 700;
+   font-size: 16px;
+   cursor: pointer;
+   font-family: var(--font-family);
+   box-shadow: 4px 4px 0px #000000;
+   transition: transform 0.1s, box-shadow 0.1s;
+ }
+
+ .chat-send-btn:hover:not(:disabled) {
+   transform: translate(-2px, -2px);
+   box-shadow: 6px 6px 0px #000000;
+ }
+
+ .chat-send-btn:active:not(:disabled) {
+   transform: translate(2px, 2px);
+   box-shadow: 0px 0px 0px #000000;
+ }
+
+ .chat-send-btn:disabled {
+   background: #ccc;
+   cursor: not-allowed;
+ }
frontend/src/main.jsx ADDED
@@ -0,0 +1,10 @@
+ import { StrictMode } from 'react'
+ import { createRoot } from 'react-dom/client'
+ import './index.css'
+ import App from './App.jsx'
+
+ createRoot(document.getElementById('root')).render(
+   <StrictMode>
+     <App />
+   </StrictMode>,
+ )
frontend/vite.config.js ADDED
@@ -0,0 +1,7 @@
+ import { defineConfig } from 'vite'
+ import react from '@vitejs/plugin-react'
+
+ // https://vite.dev/config/
+ export default defineConfig({
+   plugins: [react()],
+ })
middleware/EDA_wafer_control_db.ipynb ADDED
@@ -0,0 +1,604 @@
1
+ {
2
+ "cells": [
3
+ {
4
+ "cell_type": "code",
5
+ "execution_count": 1,
6
+ "id": "2d67457f",
7
+ "metadata": {},
8
+ "outputs": [
9
+ {
10
+ "name": "stdout",
11
+ "output_type": "stream",
12
+ "text": [
13
+ " id wafer_id batch_id scan_time status \\\n",
14
+ "0 1 wafer_0 BATCH_20260317_201406 2026-02-15 20:14:17 FAIL \n",
15
+ "1 2 wafer_1 BATCH_20260317_201406 2026-02-15 20:14:57 FAIL \n",
16
+ "2 3 wafer_2 BATCH_20260317_201406 2026-02-15 20:14:59 FAIL \n",
17
+ "3 4 wafer_3 BATCH_20260317_201406 2026-02-15 20:14:21 FAIL \n",
18
+ "4 5 wafer_4 BATCH_20260317_201406 2026-02-15 20:14:36 FAIL \n",
19
+ "\n",
20
+ " ground_truth defect_type action confidence \\\n",
21
+ "0 Center+Edge-Loc+Random Edge-Ring MOVE_TO_MICRO_STAGE 0.90 \n",
22
+ "1 Center+Edge-Loc+Random Center ROUTE_TO_SCRAP 0.96 \n",
23
+ "2 Center+Edge-Loc+Random Center ROUTE_TO_SCRAP 0.88 \n",
24
+ "3 Center+Edge-Loc+Random Edge-Ring MOVE_TO_MICRO_STAGE 0.91 \n",
25
+ "4 Center+Edge-Loc+Random Center ROUTE_TO_SCRAP 0.85 \n",
26
+ "\n",
27
+ " roi_coordinates defect_area_px material_wasted_pct \n",
28
+ "0 [1, 0, 51, 50] 2500 92.46 \n",
29
+ "1 [1, 0, 51, 50] 2500 92.46 \n",
30
+ "2 [1, 0, 51, 50] 2500 92.46 \n",
31
+ "3 [1, 0, 51, 50] 2500 92.46 \n",
32
+ "4 [1, 0, 52, 51] 2601 96.19 \n"
33
+ ]
34
+ }
35
+ ],
36
+ "source": [
37
+ "import pandas as pd\n",
38
+ "import sqlite3\n",
39
+ "\n",
40
+ "# 1. Connect to the database file\n",
41
+ "conn = sqlite3.connect('/Users/udayan/CHITS_PR_1/middleware/wafer_control.db')\n",
42
+ "\n",
43
+ "# 2. Write a query to select the data you want\n",
44
+ "query = \"SELECT * FROM wafer_logs\"\n",
45
+ "\n",
46
+ "# 3. Load the data into a DataFrame\n",
47
+ "df = pd.read_sql_query(query, conn)\n",
48
+ "\n",
49
+ "# 4. Close the connection\n",
50
+ "conn.close()\n",
51
+ "\n",
52
+ "# View the first few rows\n",
53
+ "print(df.head())"
54
+ ]
55
+ },
56
+ {
57
+ "cell_type": "code",
58
+ "execution_count": 3,
59
+ "id": "b287cad1",
60
+ "metadata": {},
61
+ "outputs": [
62
+ {
63
+ "data": {
64
+ "text/html": [
65
+ "<div>\n",
66
+ "<style scoped>\n",
67
+ " .dataframe tbody tr th:only-of-type {\n",
68
+ " vertical-align: middle;\n",
69
+ " }\n",
70
+ "\n",
71
+ " .dataframe tbody tr th {\n",
72
+ " vertical-align: top;\n",
73
+ " }\n",
74
+ "\n",
75
+ " .dataframe thead th {\n",
76
+ " text-align: right;\n",
77
+ " }\n",
78
+ "</style>\n",
79
+ "<table border=\"1\" class=\"dataframe\">\n",
80
+ " <thead>\n",
81
+ " <tr style=\"text-align: right;\">\n",
82
+ " <th></th>\n",
83
+ " <th>id</th>\n",
84
+ " <th>wafer_id</th>\n",
85
+ " <th>batch_id</th>\n",
86
+ " <th>scan_time</th>\n",
87
+ " <th>status</th>\n",
88
+ " <th>defect_type</th>\n",
89
+ " <th>action</th>\n",
90
+ " <th>confidence</th>\n",
91
+ " <th>roi_coordinates</th>\n",
92
+ " <th>defect_area_px</th>\n",
93
+ " <th>material_wasted_pct</th>\n",
94
+ " </tr>\n",
95
+ " </thead>\n",
96
+ " <tbody>\n",
97
+ " <tr>\n",
98
+ " <th>0</th>\n",
99
+ " <td>1</td>\n",
100
+ " <td>wafer_100574</td>\n",
101
+ " <td>BATCH_20260317_021451</td>\n",
102
+ " <td>2026-02-15 02:15:48</td>\n",
103
+ " <td>FAIL</td>\n",
104
+ " <td>Edge-Ring</td>\n",
105
+ " <td>MOVE_TO_MICRO_STAGE</td>\n",
106
+ " <td>0.95</td>\n",
107
+ " <td>[1, 0, 115, 136]</td>\n",
108
+ " <td>15504</td>\n",
109
+ " <td>97.56</td>\n",
110
+ " </tr>\n",
111
+ " <tr>\n",
112
+ " <th>1</th>\n",
113
+ " <td>2</td>\n",
114
+ " <td>wafer_101787</td>\n",
115
+ " <td>BATCH_20260317_021451</td>\n",
116
+ " <td>2026-02-15 02:15:09</td>\n",
117
+ " <td>FAIL</td>\n",
118
+ " <td>Center</td>\n",
119
+ " <td>ROUTE_TO_SCRAP</td>\n",
120
+ " <td>0.96</td>\n",
121
+ " <td>[0, 0, 43, 43]</td>\n",
122
+ " <td>1849</td>\n",
123
+ " <td>95.51</td>\n",
124
+ " </tr>\n",
125
+ " <tr>\n",
126
+ " <th>2</th>\n",
127
+ " <td>3</td>\n",
128
+ " <td>wafer_103333</td>\n",
129
+ " <td>BATCH_20260317_021451</td>\n",
130
+ " <td>2026-02-15 02:14:59</td>\n",
131
+ " <td>FAIL</td>\n",
132
+ " <td>Edge-Ring</td>\n",
133
+ " <td>MOVE_TO_MICRO_STAGE</td>\n",
134
+ " <td>0.98</td>\n",
135
+ " <td>[0, 0, 43, 43]</td>\n",
136
+ " <td>1849</td>\n",
137
+ " <td>95.51</td>\n",
138
+ " </tr>\n",
139
+ " <tr>\n",
140
+ " <th>3</th>\n",
141
+ " <td>4</td>\n",
142
+ " <td>wafer_106281</td>\n",
143
+ " <td>BATCH_20260317_021451</td>\n",
144
+ " <td>2026-02-15 02:15:19</td>\n",
145
+ " <td>FAIL</td>\n",
146
+ " <td>Loc</td>\n",
147
+ " <td>MOVE_TO_MICRO_STAGE</td>\n",
148
+ " <td>0.60</td>\n",
149
+ " <td>[0, 0, 30, 34]</td>\n",
150
+ " <td>1020</td>\n",
151
+ " <td>94.01</td>\n",
152
+ " </tr>\n",
153
+ " <tr>\n",
154
+ " <th>4</th>\n",
155
+ " <td>5</td>\n",
156
+ " <td>wafer_106301</td>\n",
157
+ " <td>BATCH_20260317_021451</td>\n",
158
+ " <td>2026-02-15 02:15:19</td>\n",
159
+ " <td>FAIL</td>\n",
160
+ " <td>Loc</td>\n",
161
+ " <td>MOVE_TO_MICRO_STAGE</td>\n",
162
+ " <td>0.96</td>\n",
163
+ " <td>[0, 0, 29, 33]</td>\n",
164
+ " <td>957</td>\n",
165
+ " <td>88.20</td>\n",
166
+ " </tr>\n",
167
+ " <tr>\n",
168
+ " <th>...</th>\n",
169
+ " <td>...</td>\n",
170
+ " <td>...</td>\n",
171
+ " <td>...</td>\n",
172
+ " <td>...</td>\n",
173
+ " <td>...</td>\n",
174
+ " <td>...</td>\n",
175
+ " <td>...</td>\n",
176
+ " <td>...</td>\n",
177
+ " <td>...</td>\n",
178
+ " <td>...</td>\n",
179
+ " <td>...</td>\n",
180
+ " </tr>\n",
181
+ " <tr>\n",
182
+ " <th>5099</th>\n",
183
+ " <td>5100</td>\n",
184
+ " <td>wafer_95994</td>\n",
185
+ " <td>BATCH_20260317_021451</td>\n",
186
+ " <td>2026-03-16 02:15:15</td>\n",
187
+ " <td>FAIL</td>\n",
188
+ " <td>Center</td>\n",
189
+ " <td>ROUTE_TO_SCRAP</td>\n",
190
+ " <td>0.96</td>\n",
191
+ " <td>[0, 0, 60, 40]</td>\n",
192
+ " <td>2400</td>\n",
193
+ " <td>93.68</td>\n",
194
+ " </tr>\n",
195
+ " <tr>\n",
196
+ " <th>5100</th>\n",
197
+ " <td>5101</td>\n",
198
+ " <td>wafer_96083</td>\n",
199
+ " <td>BATCH_20260317_021451</td>\n",
200
+ " <td>2026-03-17 02:15:17</td>\n",
201
+ " <td>FAIL</td>\n",
202
+ " <td>Center</td>\n",
203
+ " <td>ROUTE_TO_SCRAP</td>\n",
204
+ " <td>0.96</td>\n",
205
+ " <td>[0, 0, 60, 40]</td>\n",
206
+ " <td>2400</td>\n",
207
+ " <td>93.68</td>\n",
208
+ " </tr>\n",
209
+ " <tr>\n",
210
+ " <th>5101</th>\n",
211
+ " <td>5102</td>\n",
212
+ " <td>wafer_9637</td>\n",
213
+ " <td>BATCH_20260317_021451</td>\n",
214
+ " <td>2026-03-17 02:14:52</td>\n",
215
+ " <td>FAIL</td>\n",
216
+ " <td>Edge-Loc</td>\n",
217
+ " <td>MOVE_TO_MICRO_STAGE</td>\n",
218
+ " <td>0.98</td>\n",
219
+ " <td>[0, 0, 28, 31]</td>\n",
220
+ " <td>868</td>\n",
221
+ " <td>90.70</td>\n",
222
+ " </tr>\n",
223
+ " <tr>\n",
224
+ " <th>5102</th>\n",
225
+ " <td>5103</td>\n",
226
+ " <td>wafer_96594</td>\n",
227
+ " <td>BATCH_20260317_021451</td>\n",
228
+ " <td>2026-03-17 02:15:22</td>\n",
229
+ " <td>FAIL</td>\n",
230
+ " <td>Loc</td>\n",
231
+ " <td>MOVE_TO_MICRO_STAGE</td>\n",
232
+ " <td>0.81</td>\n",
233
+ " <td>[0, 0, 30, 29]</td>\n",
234
+ " <td>870</td>\n",
235
+ " <td>90.53</td>\n",
236
+ " </tr>\n",
237
+ " <tr>\n",
238
+ " <th>5103</th>\n",
239
+ " <td>5104</td>\n",
240
+ " <td>wafer_983</td>\n",
241
+ " <td>BATCH_20260317_021451</td>\n",
242
+ " <td>2026-03-17 02:15:00</td>\n",
243
+ " <td>FAIL</td>\n",
244
+ " <td>Edge-Loc</td>\n",
245
+ " <td>MOVE_TO_MICRO_STAGE</td>\n",
246
+ " <td>0.69</td>\n",
247
+ " <td>[0, 0, 25, 25]</td>\n",
248
+ " <td>625</td>\n",
249
+ " <td>92.46</td>\n",
250
+ " </tr>\n",
251
+ " </tbody>\n",
252
+ "</table>\n",
253
+ "<p>5104 rows × 11 columns</p>\n",
254
+ "</div>"
255
+ ],
256
+ "text/plain": [
257
+ " id wafer_id batch_id scan_time status \\\n",
258
+ "0 1 wafer_100574 BATCH_20260317_021451 2026-02-15 02:15:48 FAIL \n",
259
+ "1 2 wafer_101787 BATCH_20260317_021451 2026-02-15 02:15:09 FAIL \n",
260
+ "2 3 wafer_103333 BATCH_20260317_021451 2026-02-15 02:14:59 FAIL \n",
261
+ "3 4 wafer_106281 BATCH_20260317_021451 2026-02-15 02:15:19 FAIL \n",
262
+ "4 5 wafer_106301 BATCH_20260317_021451 2026-02-15 02:15:19 FAIL \n",
263
+ "... ... ... ... ... ... \n",
264
+ "5099 5100 wafer_95994 BATCH_20260317_021451 2026-03-16 02:15:15 FAIL \n",
265
+ "5100 5101 wafer_96083 BATCH_20260317_021451 2026-03-17 02:15:17 FAIL \n",
266
+ "5101 5102 wafer_9637 BATCH_20260317_021451 2026-03-17 02:14:52 FAIL \n",
267
+ "5102 5103 wafer_96594 BATCH_20260317_021451 2026-03-17 02:15:22 FAIL \n",
268
+ "5103 5104 wafer_983 BATCH_20260317_021451 2026-03-17 02:15:00 FAIL \n",
269
+ "\n",
270
+ " defect_type action confidence roi_coordinates \\\n",
271
+ "0 Edge-Ring MOVE_TO_MICRO_STAGE 0.95 [1, 0, 115, 136] \n",
272
+ "1 Center ROUTE_TO_SCRAP 0.96 [0, 0, 43, 43] \n",
273
+ "2 Edge-Ring MOVE_TO_MICRO_STAGE 0.98 [0, 0, 43, 43] \n",
274
+ "3 Loc MOVE_TO_MICRO_STAGE 0.60 [0, 0, 30, 34] \n",
275
+ "4 Loc MOVE_TO_MICRO_STAGE 0.96 [0, 0, 29, 33] \n",
276
+ "... ... ... ... ... \n",
277
+ "5099 Center ROUTE_TO_SCRAP 0.96 [0, 0, 60, 40] \n",
278
+ "5100 Center ROUTE_TO_SCRAP 0.96 [0, 0, 60, 40] \n",
279
+ "5101 Edge-Loc MOVE_TO_MICRO_STAGE 0.98 [0, 0, 28, 31] \n",
280
+ "5102 Loc MOVE_TO_MICRO_STAGE 0.81 [0, 0, 30, 29] \n",
281
+ "5103 Edge-Loc MOVE_TO_MICRO_STAGE 0.69 [0, 0, 25, 25] \n",
282
+ "\n",
283
+ " defect_area_px material_wasted_pct \n",
284
+ "0 15504 97.56 \n",
285
+ "1 1849 95.51 \n",
286
+ "2 1849 95.51 \n",
287
+ "3 1020 94.01 \n",
288
+ "4 957 88.20 \n",
289
+ "... ... ... \n",
290
+ "5099 2400 93.68 \n",
291
+ "5100 2400 93.68 \n",
292
+ "5101 868 90.70 \n",
293
+ "5102 870 90.53 \n",
294
+ "5103 625 92.46 \n",
295
+ "\n",
296
+ "[5104 rows x 11 columns]"
297
+ ]
298
+ },
299
+ "execution_count": 3,
300
+ "metadata": {},
301
+ "output_type": "execute_result"
302
+ }
303
+ ],
304
+ "source": [
305
+ "df"
306
+ ]
307
+ },
308
+ {
309
+ "cell_type": "code",
310
+ "execution_count": 2,
311
+ "id": "299c2be8",
312
+ "metadata": {},
313
+ "outputs": [],
314
+ "source": [
315
+ "pass_wafers = df[df['status'] == 'PASS']"
316
+ ]
317
+ },
318
+ {
319
+ "cell_type": "code",
320
+ "execution_count": 3,
321
+ "id": "46fbb2d3",
322
+ "metadata": {},
323
+ "outputs": [
324
+ {
325
+ "data": {
326
+ "text/html": [
327
+ "<div>\n",
328
+ "<style scoped>\n",
329
+ " .dataframe tbody tr th:only-of-type {\n",
330
+ " vertical-align: middle;\n",
331
+ " }\n",
332
+ "\n",
333
+ " .dataframe tbody tr th {\n",
334
+ " vertical-align: top;\n",
335
+ " }\n",
336
+ "\n",
337
+ " .dataframe thead th {\n",
338
+ " text-align: right;\n",
339
+ " }\n",
340
+ "</style>\n",
341
+ "<table border=\"1\" class=\"dataframe\">\n",
342
+ " <thead>\n",
343
+ " <tr style=\"text-align: right;\">\n",
344
+ " <th></th>\n",
345
+ " <th>id</th>\n",
346
+ " <th>wafer_id</th>\n",
347
+ " <th>batch_id</th>\n",
348
+ " <th>scan_time</th>\n",
349
+ " <th>status</th>\n",
350
+ " <th>ground_truth</th>\n",
351
+ " <th>defect_type</th>\n",
352
+ " <th>action</th>\n",
353
+ " <th>confidence</th>\n",
354
+ " <th>roi_coordinates</th>\n",
355
+ " <th>defect_area_px</th>\n",
356
+ " <th>material_wasted_pct</th>\n",
357
+ " </tr>\n",
358
+ " </thead>\n",
359
+ " <tbody>\n",
360
+ " <tr>\n",
361
+ " <th>33866</th>\n",
362
+ " <td>33867</td>\n",
363
+ " <td>wafer_33866</td>\n",
364
+ " <td>BATCH_20260317_201406</td>\n",
365
+ " <td>2026-03-13 20:14:29</td>\n",
366
+ " <td>PASS</td>\n",
367
+ " <td>Normal</td>\n",
368
+ " <td>None</td>\n",
369
+ " <td>ROUTE_TO_ASSEMBLY</td>\n",
370
+ " <td>1.0</td>\n",
371
+ " <td>[]</td>\n",
372
+ " <td>0</td>\n",
373
+ " <td>0.0</td>\n",
374
+ " </tr>\n",
375
+ " <tr>\n",
376
+ " <th>33867</th>\n",
377
+ " <td>33868</td>\n",
378
+ " <td>wafer_33867</td>\n",
379
+ " <td>BATCH_20260317_201406</td>\n",
380
+ " <td>2026-03-13 20:14:08</td>\n",
381
+ " <td>PASS</td>\n",
382
+ " <td>Normal</td>\n",
383
+ " <td>None</td>\n",
384
+ " <td>ROUTE_TO_ASSEMBLY</td>\n",
385
+ " <td>1.0</td>\n",
386
+ " <td>[]</td>\n",
387
+ " <td>0</td>\n",
388
+ " <td>0.0</td>\n",
389
+ " </tr>\n",
390
+ " <tr>\n",
391
+ " <th>33868</th>\n",
392
+ " <td>33869</td>\n",
393
+ " <td>wafer_33868</td>\n",
394
+ " <td>BATCH_20260317_201406</td>\n",
395
+ " <td>2026-03-13 20:14:08</td>\n",
396
+ " <td>PASS</td>\n",
397
+ " <td>Normal</td>\n",
398
+ " <td>None</td>\n",
399
+ " <td>ROUTE_TO_ASSEMBLY</td>\n",
400
+ " <td>1.0</td>\n",
401
+ " <td>[]</td>\n",
402
+ " <td>0</td>\n",
403
+ " <td>0.0</td>\n",
404
+ " </tr>\n",
405
+ " <tr>\n",
406
+ " <th>33869</th>\n",
407
+ " <td>33870</td>\n",
408
+ " <td>wafer_33869</td>\n",
409
+ " <td>BATCH_20260317_201406</td>\n",
410
+ " <td>2026-03-13 20:14:29</td>\n",
411
+ " <td>PASS</td>\n",
412
+ " <td>Normal</td>\n",
413
+ " <td>None</td>\n",
414
+ " <td>ROUTE_TO_ASSEMBLY</td>\n",
415
+ " <td>1.0</td>\n",
416
+ " <td>[]</td>\n",
417
+ " <td>0</td>\n",
418
+ " <td>0.0</td>\n",
419
+ " </tr>\n",
420
+ " <tr>\n",
421
+ " <th>33870</th>\n",
422
+ " <td>33871</td>\n",
423
+ " <td>wafer_33870</td>\n",
424
+ " <td>BATCH_20260317_201406</td>\n",
425
+ " <td>2026-03-13 20:14:31</td>\n",
426
+ " <td>PASS</td>\n",
427
+ " <td>Normal</td>\n",
428
+ " <td>None</td>\n",
429
+ " <td>ROUTE_TO_ASSEMBLY</td>\n",
430
+ " <td>1.0</td>\n",
431
+ " <td>[]</td>\n",
432
+ " <td>0</td>\n",
433
+ " <td>0.0</td>\n",
434
+ " </tr>\n",
435
+ " <tr>\n",
436
+ " <th>...</th>\n",
437
+ " <td>...</td>\n",
438
+ " <td>...</td>\n",
439
+ " <td>...</td>\n",
440
+ " <td>...</td>\n",
441
+ " <td>...</td>\n",
442
+ " <td>...</td>\n",
443
+ " <td>...</td>\n",
444
+ " <td>...</td>\n",
445
+ " <td>...</td>\n",
446
+ " <td>...</td>\n",
447
+ " <td>...</td>\n",
448
+ " <td>...</td>\n",
449
+ " </tr>\n",
450
+ " <tr>\n",
451
+ " <th>823948</th>\n",
452
+ " <td>823949</td>\n",
453
+ " <td>wm811k_811442</td>\n",
454
+ " <td>BATCH_20260317_201406</td>\n",
455
+ " <td>2026-03-16 20:14:06</td>\n",
456
+ " <td>PASS</td>\n",
457
+ " <td>Normal</td>\n",
458
+ " <td>None</td>\n",
459
+ " <td>ROUTE_TO_ASSEMBLY</td>\n",
460
+ " <td>1.0</td>\n",
461
+ " <td>[]</td>\n",
462
+ " <td>0</td>\n",
463
+ " <td>0.0</td>\n",
464
+ " </tr>\n",
465
+ " <tr>\n",
466
+ " <th>823949</th>\n",
467
+ " <td>823950</td>\n",
468
+ " <td>wm811k_811445</td>\n",
469
+ " <td>BATCH_20260317_201406</td>\n",
470
+ " <td>2026-03-16 20:14:07</td>\n",
471
+ " <td>PASS</td>\n",
472
+ " <td>Normal</td>\n",
473
+ " <td>None</td>\n",
474
+ " <td>ROUTE_TO_ASSEMBLY</td>\n",
475
+ " <td>1.0</td>\n",
476
+ " <td>[]</td>\n",
477
+ " <td>0</td>\n",
478
+ " <td>0.0</td>\n",
479
+ " </tr>\n",
480
+ " <tr>\n",
481
+ " <th>823950</th>\n",
482
+ " <td>823951</td>\n",
483
+ " <td>wm811k_811449</td>\n",
484
+ " <td>BATCH_20260317_201406</td>\n",
485
+ " <td>2026-03-16 20:14:07</td>\n",
486
+ " <td>PASS</td>\n",
487
+ " <td>Normal</td>\n",
488
+ " <td>None</td>\n",
489
+ " <td>ROUTE_TO_ASSEMBLY</td>\n",
490
+ " <td>1.0</td>\n",
491
+ " <td>[]</td>\n",
492
+ " <td>0</td>\n",
493
+ " <td>0.0</td>\n",
494
+ " </tr>\n",
495
+ " <tr>\n",
496
+ " <th>823951</th>\n",
497
+ " <td>823952</td>\n",
498
+ " <td>wm811k_811455</td>\n",
499
+ " <td>BATCH_20260317_201406</td>\n",
500
+ " <td>2026-03-16 20:14:07</td>\n",
501
+ " <td>PASS</td>\n",
502
+ " <td>Normal</td>\n",
503
+ " <td>None</td>\n",
504
+ " <td>ROUTE_TO_ASSEMBLY</td>\n",
505
+ " <td>1.0</td>\n",
506
+ " <td>[]</td>\n",
507
+ " <td>0</td>\n",
508
+ " <td>0.0</td>\n",
509
+ " </tr>\n",
510
+ " <tr>\n",
511
+ " <th>823952</th>\n",
512
+ " <td>823953</td>\n",
513
+ " <td>wm811k_811456</td>\n",
514
+ " <td>BATCH_20260317_201406</td>\n",
515
+ " <td>2026-03-16 20:14:09</td>\n",
516
+ " <td>PASS</td>\n",
517
+ " <td>Normal</td>\n",
518
+ " <td>None</td>\n",
519
+ " <td>ROUTE_TO_ASSEMBLY</td>\n",
520
+ " <td>1.0</td>\n",
521
+ " <td>[]</td>\n",
522
+ " <td>0</td>\n",
523
+ " <td>0.0</td>\n",
524
+ " </tr>\n",
525
+ " </tbody>\n",
526
+ "</table>\n",
527
+ "<p>786938 rows × 12 columns</p>\n",
528
+ "</div>"
529
+ ],
530
+ "text/plain": [
531
+ " id wafer_id batch_id scan_time \\\n",
532
+ "33866 33867 wafer_33866 BATCH_20260317_201406 2026-03-13 20:14:29 \n",
533
+ "33867 33868 wafer_33867 BATCH_20260317_201406 2026-03-13 20:14:08 \n",
534
+ "33868 33869 wafer_33868 BATCH_20260317_201406 2026-03-13 20:14:08 \n",
535
+ "33869 33870 wafer_33869 BATCH_20260317_201406 2026-03-13 20:14:29 \n",
536
+ "33870 33871 wafer_33870 BATCH_20260317_201406 2026-03-13 20:14:31 \n",
537
+ "... ... ... ... ... \n",
538
+ "823948 823949 wm811k_811442 BATCH_20260317_201406 2026-03-16 20:14:06 \n",
539
+ "823949 823950 wm811k_811445 BATCH_20260317_201406 2026-03-16 20:14:07 \n",
540
+ "823950 823951 wm811k_811449 BATCH_20260317_201406 2026-03-16 20:14:07 \n",
541
+ "823951 823952 wm811k_811455 BATCH_20260317_201406 2026-03-16 20:14:07 \n",
542
+ "823952 823953 wm811k_811456 BATCH_20260317_201406 2026-03-16 20:14:09 \n",
543
+ "\n",
544
+ " status ground_truth defect_type action confidence \\\n",
545
+ "33866 PASS Normal None ROUTE_TO_ASSEMBLY 1.0 \n",
546
+ "33867 PASS Normal None ROUTE_TO_ASSEMBLY 1.0 \n",
547
+ "33868 PASS Normal None ROUTE_TO_ASSEMBLY 1.0 \n",
548
+ "33869 PASS Normal None ROUTE_TO_ASSEMBLY 1.0 \n",
549
+ "33870 PASS Normal None ROUTE_TO_ASSEMBLY 1.0 \n",
550
+ "... ... ... ... ... ... \n",
551
+ "823948 PASS Normal None ROUTE_TO_ASSEMBLY 1.0 \n",
552
+ "823949 PASS Normal None ROUTE_TO_ASSEMBLY 1.0 \n",
553
+ "823950 PASS Normal None ROUTE_TO_ASSEMBLY 1.0 \n",
554
+ "823951 PASS Normal None ROUTE_TO_ASSEMBLY 1.0 \n",
555
+ "823952 PASS Normal None ROUTE_TO_ASSEMBLY 1.0 \n",
556
+ "\n",
557
+ " roi_coordinates defect_area_px material_wasted_pct \n",
558
+ "33866 [] 0 0.0 \n",
559
+ "33867 [] 0 0.0 \n",
560
+ "33868 [] 0 0.0 \n",
561
+ "33869 [] 0 0.0 \n",
562
+ "33870 [] 0 0.0 \n",
563
+ "... ... ... ... \n",
564
+ "823948 [] 0 0.0 \n",
565
+ "823949 [] 0 0.0 \n",
566
+ "823950 [] 0 0.0 \n",
567
+ "823951 [] 0 0.0 \n",
568
+ "823952 [] 0 0.0 \n",
569
+ "\n",
570
+ "[786938 rows x 12 columns]"
571
+ ]
572
+ },
573
+ "execution_count": 3,
574
+ "metadata": {},
575
+ "output_type": "execute_result"
576
+ }
577
+ ],
578
+ "source": [
579
+ "pass_wafers"
580
+ ]
581
+ }
582
+ ],
583
+ "metadata": {
584
+ "kernelspec": {
585
+ "display_name": "venv",
586
+ "language": "python",
587
+ "name": "python3"
588
+ },
589
+ "language_info": {
590
+ "codemirror_mode": {
591
+ "name": "ipython",
592
+ "version": 3
593
+ },
594
+ "file_extension": ".py",
595
+ "mimetype": "text/x-python",
596
+ "name": "python",
597
+ "nbconvert_exporter": "python",
598
+ "pygments_lexer": "ipython3",
599
+ "version": "3.13.5"
600
+ }
601
+ },
602
+ "nbformat": 4,
603
+ "nbformat_minor": 5
604
+ }
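The notebook above loads every row of `wafer_logs` into pandas and then filters on `status`. As a lighter-weight alternative, the sketch below pushes the filter and the count into SQLite itself. It is only an illustration under stated assumptions: the helper names `status_counts` and `fetch_by_status` are hypothetical, and the only schema details relied on are the `wafer_logs` table and its `id` and `status` columns, as they appear in the notebook output.

```python
import sqlite3
from contextlib import closing


def status_counts(db_path):
    """Return {status: row_count} for the wafer_logs table (hypothetical helper)."""
    # closing() guarantees the connection is closed even if the query raises
    with closing(sqlite3.connect(db_path)) as conn:
        rows = conn.execute(
            "SELECT status, COUNT(*) FROM wafer_logs GROUP BY status"
        ).fetchall()
    return dict(rows)


def fetch_by_status(db_path, status, limit=5):
    """Return up to `limit` rows with the given status as plain dicts."""
    with closing(sqlite3.connect(db_path)) as conn:
        conn.row_factory = sqlite3.Row  # lets each row be converted to a dict
        cur = conn.execute(
            # Parameter binding keeps the status value out of the SQL string
            "SELECT * FROM wafer_logs WHERE status = ? ORDER BY id LIMIT ?",
            (status, limit),
        )
        return [dict(row) for row in cur.fetchall()]
```

Calling `status_counts("wafer_control.db")` from the `middleware/` directory would give the PASS/FAIL split without materializing the full table; the relative path is an assumption about where the notebook kernel is started.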
middleware/__init__.py ADDED
File without changes
middleware/best.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f3df8caca758c1959cc0f4dfc3190c4887a8d9ad053052a685b0bd2d5fb8f502
+ size 6195306
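The three lines committed for `middleware/best.pt` are a Git LFS pointer, not the model weights themselves. If a deployment clones the repository without fetching LFS objects, code that tries to load the weights will be handed this small text file instead. The following is a minimal sketch of a guard against that; `parse_lfs_pointer` and `is_lfs_pointer` are hypothetical helper names and are not part of this commit.

```python
from pathlib import Path

LFS_VERSION = "https://git-lfs.github.com/spec/v1"


def parse_lfs_pointer(text):
    """Return {'oid': ..., 'size': ...} if text is a Git LFS pointer, else None."""
    fields = {}
    for line in text.splitlines():
        # Each pointer line has the form "<key> <value>"
        key, sep, value = line.partition(" ")
        if sep:
            fields[key] = value.strip()
    if fields.get("version") != LFS_VERSION:
        return None
    oid = fields.get("oid", "")
    size = fields.get("size", "")
    if not oid.startswith("sha256:") or not size.isdigit():
        return None
    return {"oid": oid[len("sha256:"):], "size": int(size)}


def is_lfs_pointer(path):
    """True if the file at `path` is still an LFS pointer, not the real payload."""
    p = Path(path)
    # Pointer files are tiny; avoid reading a multi-megabyte binary
    if p.stat().st_size > 1024:
        return False
    try:
        return parse_lfs_pointer(p.read_text(encoding="utf-8")) is not None
    except UnicodeDecodeError:
        return False
```

A startup check such as `if is_lfs_pointer("middleware/best.pt"): raise RuntimeError("run git lfs pull")` would fail fast with a clear message rather than with an opaque deserialization error.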
middleware/dashboard.py ADDED
@@ -0,0 +1,369 @@
+ """
+ Wafer Defect Analytics Dashboard — Phase 5
+ Plotly Dash web application for semiconductor material waste analysis
+ and predictive material requirement estimation.
+ Now using the Mixed-type Wafer Defect Dataset (38,015 wafers).
+ """
+
+ import os
+ import pickle
+ import sqlite3
+ import numpy as np
+ import pandas as pd
+ import plotly.express as px
+ import plotly.graph_objects as go
+ from dash import Dash, html, dcc, callback, Input, Output, State
+
+ # --- CONFIGURATION ---
+ DB_PATH = os.path.join(os.path.dirname(__file__), 'wafer_control.db')
+ MODEL_PATH = os.path.join(os.path.dirname(__file__), 'material_model.pkl')
+
+ # Semiconductor-themed color palette
+ COLORS = {
+     'bg': '#0a0e17',
+     'card': '#131a2e',
+     'card_border': '#1e2d4a',
+     'accent': '#00d4ff',
+     'accent2': '#7c3aed',
+     'accent3': '#10b981',
+     'danger': '#ef4444',
+     'warning': '#f59e0b',
+     'text': '#e2e8f0',
+     'text_muted': '#94a3b8',
+ }
+
+ DEFECT_COLORS = {
+     'Center': '#ef4444', 'Donut': '#f59e0b', 'Edge-Loc': '#10b981',
+     'Edge-Ring': '#3b82f6', 'Loc': '#8b5cf6', 'Random': '#ec4899',
+     'Scratch': '#06b6d4', 'Near-full': '#f97316', 'None': '#6b7280',
+     'Undetected': '#374151',
+ }
+
+
+ def load_data():
+     conn = sqlite3.connect(DB_PATH)
+     df = pd.read_sql_query("SELECT * FROM wafer_logs", conn)
+     conn.close()
+     df['scan_time'] = pd.to_datetime(df['scan_time'])
+     df['scan_date'] = df['scan_time'].dt.date
+     return df
+
+
+ def load_model():
+     if os.path.exists(MODEL_PATH):
+         with open(MODEL_PATH, 'rb') as f:
+             return pickle.load(f)
+     return None
+
+
+ # --- LOAD DATA ---
+ df = load_data()
+ model_pkg = load_model()
+
+ # --- PRE-COMPUTE STATS ---
+ total_scans = len(df)
+ fail_count = len(df[df['status'] == 'FAIL'])
+ pass_count = len(df[df['status'] == 'PASS'])
+ # Guard against an empty wafer_logs table to avoid ZeroDivisionError
+ pass_rate = round((pass_count / total_scans) * 100, 1) if total_scans else 0.0
+ scrap_count = len(df[df['action'] == 'ROUTE_TO_SCRAP'])
+ total_waste = round(df['material_wasted_pct'].sum(), 1)
+ avg_waste = round(df[df['status'] == 'FAIL']['material_wasted_pct'].mean(), 2)
+ avg_confidence = round(df[df['status'] == 'FAIL']['confidence'].mean(), 2)
+
+ # Daily aggregations
+ daily = df.groupby('scan_date').agg(
+     scans=('id', 'count'),
+     fails=('status', lambda x: (x == 'FAIL').sum()),
+     waste=('material_wasted_pct', lambda x: x.sum() / 100.0),
+     avg_waste=('material_wasted_pct', 'mean'),
+ ).reset_index()
+ daily['fail_rate'] = round((daily['fails'] / daily['scans']) * 100, 1)
+ daily['scan_date'] = pd.to_datetime(daily['scan_date'])
+
+ # ============================================================
+ # DASH APP
+ # ============================================================
+ app = Dash(__name__, suppress_callback_exceptions=True)
+ app.title = "Wafer Defect Analytics — Semiconductor QC Dashboard"
+
+ card_style = {
+     'backgroundColor': COLORS['card'], 'border': f"1px solid {COLORS['card_border']}",
+     'borderRadius': '12px', 'padding': '24px', 'marginBottom': '16px',
+ }
+
+ kpi_style = {
+     'backgroundColor': COLORS['card'], 'border': f"1px solid {COLORS['card_border']}",
+     'borderRadius': '12px', 'padding': '20px', 'textAlign': 'center', 'flex': '1', 'minWidth': '160px',
+ }
+
+
+ def make_kpi(title, value, subtitle="", color=COLORS['accent']):
+     return html.Div(style=kpi_style, children=[
+         html.P(title, style={'color': COLORS['text_muted'], 'fontSize': '12px', 'marginBottom': '4px', 'fontWeight': '500', 'textTransform': 'uppercase', 'letterSpacing': '1px'}),
+         html.H2(str(value), style={'color': color, 'fontSize': '28px', 'fontWeight': '700', 'margin': '4px 0'}),
+         html.P(subtitle, style={'color': COLORS['text_muted'], 'fontSize': '11px', 'marginTop': '4px'}),
+     ])
+
+
+ def chart_layout(title):
+     return dict(
+         template='plotly_dark', paper_bgcolor='rgba(0,0,0,0)', plot_bgcolor='rgba(0,0,0,0)',
+         title=dict(text=title, font=dict(size=16, color=COLORS['text'])),
+         font=dict(color=COLORS['text_muted'], size=12),
+         margin=dict(l=40, r=20, t=50, b=40),
+         legend=dict(bgcolor='rgba(0,0,0,0)'),
+     )
+
+
+ # ============================================================
+ # FIGURES
+ # ============================================================
+
+ # 1. Defect type distribution (pie) — YOLO predictions
+ defect_counts = df[df['status'] == 'FAIL']['defect_type'].value_counts().reset_index()
+ defect_counts.columns = ['defect_type', 'count']
+ fig_pie = px.pie(defect_counts, names='defect_type', values='count', color='defect_type',
+                  color_discrete_map=DEFECT_COLORS, hole=0.45)
+ fig_pie.update_layout(**chart_layout('YOLOv8 Predicted Defect Distribution'))
+ fig_pie.update_traces(textinfo='label+percent', textfont_size=11)
+
+ # 2. Ground truth distribution (pie) — actual labels
+ gt_counts = df[df['status'] == 'FAIL']['ground_truth'].value_counts().reset_index()
+ gt_counts.columns = ['ground_truth', 'count']
+ fig_gt_pie = px.pie(gt_counts.head(15), names='ground_truth', values='count', hole=0.45)
+ fig_gt_pie.update_layout(**chart_layout('Ground Truth Label Distribution (Top 15)'))
+ fig_gt_pie.update_traces(textinfo='label+percent', textfont_size=10)
+
+ # 3. Material waste by defect type (bar)
+ waste_by_type = df[df['status'] == 'FAIL'].groupby('defect_type').agg(
+     total_waste=('material_wasted_pct', lambda x: x.sum() / 100.0), count=('id', 'count'),
+ ).reset_index().sort_values('total_waste', ascending=True)
+
+ fig_waste_bar = go.Figure()
+ fig_waste_bar.add_trace(go.Bar(
+     y=waste_by_type['defect_type'], x=waste_by_type['total_waste'], orientation='h',
+     marker_color=[DEFECT_COLORS.get(d, '#6b7280') for d in waste_by_type['defect_type']],
+     text=[f"{v:.1f}" for v in waste_by_type['total_waste']], textposition='outside',
+ ))
+ fig_waste_bar.update_layout(**chart_layout('Total Material Waste by Predicted Defect'))
+ fig_waste_bar.update_xaxes(title_text='Equivalent Lost Wafers')
+
+ # 4. Daily fail rate trend
+ fig_trend = go.Figure()
+ fig_trend.add_trace(go.Scatter(
+     x=daily['scan_date'], y=daily['fail_rate'], mode='lines+markers',
+     line=dict(color=COLORS['danger'], width=2), marker=dict(size=5),
+     name='Fail Rate %', fill='tozeroy', fillcolor='rgba(239, 68, 68, 0.1)',
+ ))
+ fig_trend.update_layout(**chart_layout('Daily Defect Rate Over Time'))
+ fig_trend.update_yaxes(title_text='Fail Rate %', range=[0, 105])
+
+ # 5. Daily waste trend
+ fig_waste_trend = go.Figure()
+ fig_waste_trend.add_trace(go.Scatter(
+     x=daily['scan_date'], y=daily['waste'], mode='lines+markers',
+     line=dict(color=COLORS['warning'], width=2), marker=dict(size=5),
+     name='Lost Wafers', fill='tozeroy', fillcolor='rgba(245, 158, 11, 0.1)',
+ ))
+ fig_waste_trend.update_layout(**chart_layout('Daily Material Waste Over Time'))
+ fig_waste_trend.update_yaxes(title_text='Total Lost Wafers')
+
+ # 6. Action breakdown
+ action_counts = df['action'].value_counts().reset_index()
+ action_counts.columns = ['action', 'count']
+ action_colors = {'ROUTE_TO_SCRAP': COLORS['danger'], 'MOVE_TO_MICRO_STAGE': COLORS['warning'], 'ROUTE_TO_ASSEMBLY': COLORS['accent3']}
+ fig_action = px.bar(action_counts, x='action', y='count', color='action',
+                     color_discrete_map=action_colors, text='count')
+ fig_action.update_layout(**chart_layout('Wafer Routing Actions'))
+ fig_action.update_traces(textposition='outside')
+
+ # 7. Feature importance
+ fig_importance = go.Figure()
+ if model_pkg:
+     imp = model_pkg['metrics']['importances']
+     imp_df = pd.DataFrame({'feature': list(imp.keys()), 'importance': list(imp.values())})
+     imp_df = imp_df.sort_values('importance', ascending=True).tail(10)
+     fig_importance.add_trace(go.Bar(
+         y=imp_df['feature'], x=imp_df['importance'], orientation='h',
+         marker_color=COLORS['accent2'],
+         text=[f"{v:.3f}" for v in imp_df['importance']], textposition='outside',
+     ))
+     fig_importance.update_layout(**chart_layout('Top 10 Prediction Features'))
+
193
+ # ============================================================
194
+ # LAYOUT
195
+ # ============================================================
196
+ app.layout = html.Div(style={
197
+ 'backgroundColor': COLORS['bg'], 'minHeight': '100vh',
198
+ 'fontFamily': "'Inter', -apple-system, BlinkMacSystemFont, sans-serif",
199
+ 'color': COLORS['text'], 'padding': '24px 32px',
200
+ }, children=[
201
+
202
+ # HEADER
203
+ html.Div(style={'marginBottom': '32px', 'borderBottom': f"1px solid {COLORS['card_border']}", 'paddingBottom': '20px'}, children=[
204
+ html.Div(style={'display': 'flex', 'alignItems': 'center', 'gap': '12px'}, children=[
205
+ html.Div(style={'width': '12px', 'height': '12px', 'borderRadius': '50%',
206
+ 'backgroundColor': COLORS['accent3'], 'boxShadow': f"0 0 8px {COLORS['accent3']}"}),
207
+ html.H1("Wafer Defect Analytics", style={
208
+ 'margin': '0', 'fontSize': '28px', 'fontWeight': '700',
209
+ 'background': f"linear-gradient(135deg, {COLORS['accent']}, {COLORS['accent2']})",
210
+ 'WebkitBackgroundClip': 'text', 'WebkitTextFillColor': 'transparent',
211
+ }),
212
+ ]),
213
+ html.P("Mixed-type Wafer Defect Dataset — Material Waste Dashboard",
214
+ style={'color': COLORS['text_muted'], 'marginTop': '4px', 'fontSize': '14px'}),
215
+ ]),
216
+
217
+ # TABS
218
+ dcc.Tabs(id='tabs', value='tab-waste', style={'marginBottom': '24px'}, children=[
219
+ dcc.Tab(label='📊 Historical Waste Analysis', value='tab-waste', style={
220
+ 'backgroundColor': COLORS['card'], 'color': COLORS['text_muted'],
221
+ 'border': f"1px solid {COLORS['card_border']}", 'borderRadius': '8px 8px 0 0',
222
+ 'padding': '12px 24px', 'fontWeight': '600',
223
+ }, selected_style={
224
+ 'backgroundColor': COLORS['accent2'], 'color': '#fff',
225
+ 'border': f"1px solid {COLORS['accent2']}", 'borderRadius': '8px 8px 0 0',
226
+ 'padding': '12px 24px', 'fontWeight': '600',
227
+ }),
228
+ dcc.Tab(label='🔮 Material Prediction', value='tab-predict', style={
229
+ 'backgroundColor': COLORS['card'], 'color': COLORS['text_muted'],
230
+ 'border': f"1px solid {COLORS['card_border']}", 'borderRadius': '8px 8px 0 0',
231
+ 'padding': '12px 24px', 'fontWeight': '600',
232
+ }, selected_style={
233
+ 'backgroundColor': COLORS['accent2'], 'color': '#fff',
234
+ 'border': f"1px solid {COLORS['accent2']}", 'borderRadius': '8px 8px 0 0',
235
+ 'padding': '12px 24px', 'fontWeight': '600',
236
+ }),
237
+ ]),
238
+
239
+ html.Div(id='tab-content'),
240
+ ])
241
+
242
+
243
+ # ============================================================
244
+ # CALLBACKS
245
+ # ============================================================
246
+ @callback(Output('tab-content', 'children'), Input('tabs', 'value'))
247
+ def render_tab(tab):
248
+ if tab == 'tab-waste':
249
+ return html.Div([
250
+ # KPI Row
251
+ html.Div(style={'display': 'flex', 'gap': '12px', 'flexWrap': 'wrap', 'marginBottom': '24px'}, children=[
252
+ make_kpi("Total Scans", f"{total_scans:,}", "wafers inspected", COLORS['accent']),
253
+ make_kpi("Pass Rate", f"{pass_rate}%", f"{pass_count:,} passed", COLORS['accent3']),
254
+ make_kpi("Fail Rate", f"{round(100 - pass_rate, 2)}%", f"{fail_count:,} defective", COLORS['danger']),
255
+ make_kpi("Scrapped", f"{scrap_count:,}", "routed to scrap", COLORS['warning']),
256
+ make_kpi("Avg Waste/Wafer", f"{avg_waste}%", "per defective wafer", COLORS['danger']),
257
+ make_kpi("Avg Confidence", f"{avg_confidence}", "model certainty", COLORS['accent3']),
258
+ ]),
259
+
260
+ # Charts Row 1: YOLO predictions vs Ground Truth
261
+ html.Div(style={'display': 'grid', 'gridTemplateColumns': '1fr 1fr', 'gap': '16px', 'marginBottom': '16px'}, children=[
262
+ html.Div(style=card_style, children=[dcc.Graph(figure=fig_pie, config={'displayModeBar': False})]),
263
+ html.Div(style=card_style, children=[dcc.Graph(figure=fig_gt_pie, config={'displayModeBar': False})]),
264
+ ]),
265
+
266
+ # Charts Row 2
267
+ html.Div(style={'display': 'grid', 'gridTemplateColumns': '1fr 1fr', 'gap': '16px', 'marginBottom': '16px'}, children=[
268
+ html.Div(style=card_style, children=[dcc.Graph(figure=fig_waste_bar, config={'displayModeBar': False})]),
269
+ html.Div(style=card_style, children=[dcc.Graph(figure=fig_action, config={'displayModeBar': False})]),
270
+ ]),
271
+
272
+ # Trend charts
273
+ html.Div(style={'display': 'grid', 'gridTemplateColumns': '1fr 1fr', 'gap': '16px'}, children=[
274
+ html.Div(style=card_style, children=[dcc.Graph(figure=fig_trend, config={'displayModeBar': False})]),
275
+ html.Div(style=card_style, children=[dcc.Graph(figure=fig_waste_trend, config={'displayModeBar': False})]),
276
+ ]),
277
+ ])
278
+
279
+ elif tab == 'tab-predict':
280
+ model_status = "✅ Model loaded" if model_pkg else "❌ No model found"
281
+ model_metrics = ""
282
+ if model_pkg:
283
+ m = model_pkg['metrics']
284
+ model_metrics = f"R² = {m['r2']:.4f} | MAE = {m['mae']:.2f}%"
285
+
286
+ return html.Div([
287
+ html.Div(style={**card_style, 'display': 'flex', 'justifyContent': 'space-between', 'alignItems': 'center'}, children=[
288
+ html.Div([
289
+ html.H3("Prediction Model", style={'margin': '0 0 4px 0', 'fontSize': '18px'}),
290
+ html.P(model_status, style={'margin': '0', 'color': COLORS['accent3'] if model_pkg else COLORS['danger']}),
291
+ ]),
292
+ html.P(model_metrics, style={'color': COLORS['text_muted'], 'fontFamily': 'monospace'}),
293
+ ]),
294
+
295
+ html.Div(style=card_style, children=[
296
+ html.H3("Forecast Parameters", style={'marginTop': '0', 'fontSize': '18px', 'marginBottom': '20px'}),
297
+ html.Div(style={'display': 'grid', 'gridTemplateColumns': '1fr 1fr', 'gap': '24px'}, children=[
298
+ html.Div([
299
+ html.Label("Expected Daily Production (wafers)", style={'color': COLORS['text_muted'], 'fontSize': '13px'}),
300
+ dcc.Slider(id='slider-scans', min=100, max=2000, step=50, value=1300,
301
+ marks={100: '100', 500: '500', 1000: '1000', 1500: '1500', 2000: '2000'},
302
+ tooltip={"placement": "bottom", "always_visible": True}),
303
+ ]),
304
+ html.Div([
305
+ html.Label("Expected Defect Rate (%)", style={'color': COLORS['text_muted'], 'fontSize': '13px'}),
306
+ dcc.Slider(id='slider-fail-rate', min=0, max=100, step=1, value=97,
307
+ marks={0: '0%', 25: '25%', 50: '50%', 75: '75%', 100: '100%'},
308
+ tooltip={"placement": "bottom", "always_visible": True}),
309
+ ]),
310
+ ]),
311
+ html.Br(),
312
+ html.Button("🔮 Predict Material Needs", id='btn-predict', n_clicks=0, style={
313
+ 'backgroundColor': COLORS['accent2'], 'color': '#fff', 'border': 'none',
314
+ 'borderRadius': '8px', 'padding': '12px 32px', 'fontSize': '15px',
315
+ 'fontWeight': '600', 'cursor': 'pointer', 'width': '100%',
316
+ }),
317
+ ]),
318
+
319
+ html.Div(id='prediction-result'),
320
+
321
+ html.Div(style=card_style, children=[
322
+ dcc.Graph(figure=fig_importance, config={'displayModeBar': False}),
323
+ ]),
324
+ ])
325
+
326
+
327
+ @callback(
328
+ Output('prediction-result', 'children'),
329
+ Input('btn-predict', 'n_clicks'),
330
+ State('slider-scans', 'value'),
331
+ State('slider-fail-rate', 'value'),
332
+ prevent_initial_call=True,
333
+ )
334
+ def predict(n_clicks, n_scans, fail_pct):
335
+ if not model_pkg:
336
+ return html.Div(style={**card_style, 'borderColor': COLORS['danger']}, children=[
337
+ html.P("❌ No model loaded.", style={'color': COLORS['danger']}),
338
+ ])
339
+
340
+ from material_predictor import predict_material_needs
341
+ model = model_pkg['model']
342
+ feat_cols = model_pkg['feature_cols']
343
+
344
+ fail_df = df[df['status'] == 'FAIL']
345
+ dist = fail_df['defect_type'].value_counts(normalize=True).to_dict()
346
+
347
+ pred = predict_material_needs(model, feat_cols, n_scans, fail_pct / 100.0, dist)
348
+
349
+ return html.Div(style={
350
+ **card_style, 'borderColor': COLORS['accent'],
351
+ 'background': f"linear-gradient(135deg, {COLORS['card']}, #1a1040)",
352
+ }, children=[
353
+ html.Div(style={'display': 'flex', 'justifyContent': 'space-around', 'flexWrap': 'wrap', 'gap': '16px'}, children=[
354
+ make_kpi("Daily Production", f"{n_scans}", "wafers", COLORS['accent']),
355
+ make_kpi("Expected Defect Rate", f"{fail_pct}%", f"~{int(n_scans * fail_pct / 100)} defective", COLORS['danger']),
356
+ make_kpi("Avg Waste/Wafer", f"{pred['avg_waste_per_wafer']:.1f}%", "per defective wafer loss", COLORS['warning']),
357
+ make_kpi("Total Daily Waste", f"{pred['total_daily_waste']:.1f} wafers", "total estimated loss", COLORS['danger']),
358
+ ]),
359
+ ])
360
+
361
+
362
+ if __name__ == '__main__':
363
+ print("\n" + "=" * 60)
364
+ print(" WAFER DEFECT ANALYTICS DASHBOARD")
365
+ print(f" Data: {total_scans:,} scans | {fail_count:,} defects | {pass_count:,} pass")
366
+ print(f" Model: {'Loaded ✅' if model_pkg else 'Not found ❌'}")
367
+ print("=" * 60)
368
+ print(f"\n 🌐 Open: http://127.0.0.1:8050\n")
369
+ app.run(debug=True, port=8050)
middleware/database.py ADDED
File without changes
middleware/material_model.pkl ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:47b26c94a138551231a0bb2a9a7b2dfe918c278224048c4fa12a22feacf6a0b2
3
+ size 240685
middleware/material_predictor.py ADDED
@@ -0,0 +1,229 @@
1
+ """
2
+ Material Predictor — Phase 5
3
+ Trains a Random Forest model on historical wafer scan data to predict
4
+ material waste percentage for future production batches.
5
+ """
6
+
7
+ import os
8
+ import pickle
9
+ import sqlite3
10
+ import numpy as np
11
+ import pandas as pd
12
+ from sklearn.ensemble import RandomForestRegressor
13
+ from sklearn.model_selection import train_test_split
14
+ from sklearn.metrics import mean_absolute_error, r2_score
15
+
16
+ # --- CONFIGURATION ---
17
+ DB_PATH = os.path.join(os.path.dirname(__file__), 'wafer_control.db')
18
+ MODEL_PATH = os.path.join(os.path.dirname(__file__), 'material_model.pkl')
19
+
20
+ # Defect types for feature engineering
21
+ DEFECT_TYPES = ['Center', 'Donut', 'Edge-Loc', 'Edge-Ring', 'Loc', 'Random', 'Scratch', 'Near-full', 'None', 'Undetected']
22
+
23
+
24
+ def load_data():
25
+ """Load wafer logs from the SQLite database into a DataFrame."""
26
+ conn = sqlite3.connect(DB_PATH)
27
+ df = pd.read_sql_query("SELECT * FROM wafer_logs", conn)
28
+ conn.close()
29
+ df['scan_time'] = pd.to_datetime(df['scan_time'])
30
+ return df
31
+
32
+
33
+ def engineer_features(df):
34
+ """
35
+ Build daily-aggregated features from raw scan logs.
36
+ Each row = one day of production with aggregated metrics.
37
+ """
38
+ df['scan_date'] = df['scan_time'].dt.date
39
+ df['is_fail'] = (df['status'] == 'FAIL').astype(int)
40
+ df['is_scrap'] = (df['action'] == 'ROUTE_TO_SCRAP').astype(int)
41
+
42
+ # One-hot encode defect types per scan
43
+ for defect in DEFECT_TYPES:
44
+ col_name = f'is_{defect.lower().replace("-", "_")}'
45
+ df[col_name] = (df['defect_type'] == defect).astype(int)
46
+
47
+ # --- Aggregate by day ---
48
+ daily = df.groupby('scan_date').agg(
49
+ total_scans=('id', 'count'),
50
+ fail_count=('is_fail', 'sum'),
51
+ scrap_count=('is_scrap', 'sum'),
52
+ avg_confidence=('confidence', 'mean'),
53
+ avg_defect_area=('defect_area_px', 'mean'),
54
+ max_defect_area=('defect_area_px', 'max'),
55
+ total_waste_pct=('material_wasted_pct', 'sum'),
56
+ avg_waste_pct=('material_wasted_pct', 'mean'),
57
+ # Defect type counts per day
58
+ center_count=('is_center', 'sum'),
59
+ donut_count=('is_donut', 'sum'),
60
+ edge_loc_count=('is_edge_loc', 'sum'),
61
+ edge_ring_count=('is_edge_ring', 'sum'),
62
+ loc_count=('is_loc', 'sum'),
63
+ random_count=('is_random', 'sum'),
64
+ scratch_count=('is_scratch', 'sum'),
65
+ near_full_count=('is_near_full', 'sum'),
66
+ pass_count=('is_none', 'sum'),
67
+ ).reset_index()
68
+
69
+ # --- Compute waste among defective wafers only ---
70
+ defective_daily = df[df['status'] == 'FAIL'].groupby('scan_date').agg(
71
+ avg_waste_defective=('material_wasted_pct', 'mean'),
72
+ avg_defect_area_fail=('defect_area_px', 'mean'),
73
+ avg_confidence_fail=('confidence', 'mean'),
74
+ ).reset_index()
75
+
76
+ daily = daily.merge(defective_daily, on='scan_date', how='left')
77
+ daily['avg_waste_defective'] = daily['avg_waste_defective'].fillna(0)
78
+ daily['avg_defect_area_fail'] = daily['avg_defect_area_fail'].fillna(0)
79
+ daily['avg_confidence_fail'] = daily['avg_confidence_fail'].fillna(0)
80
+
81
+ # Derived ratios
82
+ daily['fail_rate'] = daily['fail_count'] / daily['total_scans']
83
+ daily['scrap_rate'] = daily['scrap_count'] / daily['total_scans']
84
+
85
+ # Time features
86
+ daily['scan_date'] = pd.to_datetime(daily['scan_date'])
87
+ daily['day_of_week'] = daily['scan_date'].dt.dayofweek
88
+ daily['day_index'] = (daily['scan_date'] - daily['scan_date'].min()).dt.days
89
+
90
+ return daily
91
+
92
+
93
+ def train_model(daily):
94
+ """Train a Random Forest to predict avg material waste among defective wafers."""
95
+ feature_cols = [
96
+ 'total_scans', 'fail_count', 'scrap_count', 'avg_confidence',
97
+ 'avg_defect_area', 'max_defect_area', 'fail_rate', 'scrap_rate',
98
+ 'avg_defect_area_fail', 'avg_confidence_fail',
99
+ 'center_count', 'donut_count', 'edge_loc_count', 'edge_ring_count',
100
+ 'loc_count', 'random_count', 'scratch_count', 'near_full_count',
101
+ 'pass_count', 'day_of_week', 'day_index'
102
+ ]
103
+
104
+ target = 'avg_waste_defective'
105
+
106
+ X = daily[feature_cols]
107
+ y = daily[target]
108
+
109
+ X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
110
+
111
+ model = RandomForestRegressor(
112
+ n_estimators=100,
113
+ max_depth=10,
114
+ random_state=42,
115
+ n_jobs=-1
116
+ )
117
+ model.fit(X_train, y_train)
118
+
119
+ # Evaluate
120
+ y_pred = model.predict(X_test)
121
+ mae = mean_absolute_error(y_test, y_pred)
122
+ r2 = r2_score(y_test, y_pred)
123
+
124
+ print(f"\n{'=' * 50}")
125
+ print(f" MODEL EVALUATION (target: avg waste % per defective wafer)")
126
+ print(f" Mean Absolute Error: {mae:.2f}%")
127
+ print(f" R² Score: {r2:.4f}")
128
+ print(f"{'=' * 50}")
129
+
130
+ # Feature importance
131
+ importance = pd.Series(model.feature_importances_, index=feature_cols).sort_values(ascending=False)
132
+ print(f"\n Top 5 Feature Importances:")
133
+ for feat, imp in importance.head(5).items():
134
+ print(f" {feat:25s} {imp:.4f}")
135
+
136
+ return model, feature_cols, {'mae': mae, 'r2': r2, 'importances': importance.to_dict()}
137
+
138
+
139
+ def predict_material_needs(model, feature_cols, total_scans, fail_rate, defect_distribution):
140
+ """
141
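+ Predict material waste for a hypothetical future production day. fail_rate is a fraction in [0, 1]; defect_distribution maps each defect type to its share of failed wafers.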
142
+ """
143
+ fail_count = int(round(total_scans * fail_rate))  # round first: int() alone truncates float error (e.g. 28.999... -> 28)
144
+ pass_count = total_scans - fail_count
145
+
146
+ features = {
147
+ 'total_scans': total_scans,
148
+ 'fail_count': fail_count,
149
+ 'scrap_count': int(fail_count * defect_distribution.get('Center', 0) +
150
+ fail_count * defect_distribution.get('Near-full', 0)),
151
+ 'avg_confidence': 0.95,
152
+ 'avg_defect_area': 1500,
153
+ 'max_defect_area': 2704,
154
+ 'fail_rate': fail_rate,
155
+ 'scrap_rate': fail_rate * (defect_distribution.get('Center', 0) + defect_distribution.get('Near-full', 0)),  # share of all scans, matching scrap_count / total_scans used in training
156
+ 'avg_defect_area_fail': 1500,
157
+ 'avg_confidence_fail': 0.85,
158
+ 'center_count': int(fail_count * defect_distribution.get('Center', 0)),
159
+ 'donut_count': int(fail_count * defect_distribution.get('Donut', 0)),
160
+ 'edge_loc_count': int(fail_count * defect_distribution.get('Edge-Loc', 0)),
161
+ 'edge_ring_count': int(fail_count * defect_distribution.get('Edge-Ring', 0)),
162
+ 'loc_count': int(fail_count * defect_distribution.get('Loc', 0)),
163
+ 'random_count': int(fail_count * defect_distribution.get('Random', 0)),
164
+ 'scratch_count': int(fail_count * defect_distribution.get('Scratch', 0)),
165
+ 'near_full_count': int(fail_count * defect_distribution.get('Near-full', 0)),
166
+ 'pass_count': pass_count,
167
+ 'day_of_week': 2,
168
+ 'day_index': 30,
169
+ }
170
+
171
+ X = pd.DataFrame([features])[feature_cols]
172
+ avg_waste_per_wafer = model.predict(X)[0]
173
+ total_waste_wafers = (avg_waste_per_wafer / 100.0) * fail_count
174
+ return {
175
+ 'avg_waste_per_wafer': round(avg_waste_per_wafer, 2),
176
+ 'total_daily_waste': round(total_waste_wafers, 1),
177
+ 'total_scans': total_scans,
178
+ 'fail_rate': fail_rate,
179
+ }
180
+
181
+
182
+ if __name__ == '__main__':
183
+ print("=" * 50)
184
+ print(" MATERIAL WASTE PREDICTOR — Training")
185
+ print("=" * 50)
186
+
187
+ # 1. Load and engineer features
188
+ print("\nLoading scan data...")
189
+ raw_df = load_data()
190
+ print(f" Total records: {len(raw_df)}")
191
+ print(f" PASS: {len(raw_df[raw_df['status'] == 'PASS'])}")
192
+ print(f" FAIL: {len(raw_df[raw_df['status'] == 'FAIL'])}")
193
+
194
+ print("Engineering daily features...")
195
+ daily_df = engineer_features(raw_df)
196
+ print(f" Training days: {len(daily_df)}")
197
+
198
+ # 2. Train
199
+ print("\nTraining Random Forest model...")
200
+ trained_model, feat_cols, metrics = train_model(daily_df)
201
+
202
+ # 3. Save
203
+ model_package = {
204
+ 'model': trained_model,
205
+ 'feature_cols': feat_cols,
206
+ 'metrics': metrics,
207
+ }
208
+ with open(MODEL_PATH, 'wb') as f:
209
+ pickle.dump(model_package, f)
210
+ print(f"\nModel saved to: {MODEL_PATH}")
211
+
212
+ # 4. Demo prediction
213
+ print(f"\n{'=' * 50}")
214
+ print(" DEMO PREDICTION")
215
+ print(f"{'=' * 50}")
216
+
217
+ demo_distribution = {
218
+ 'Center': 0.15, 'Edge-Ring': 0.37, 'Edge-Loc': 0.06,
219
+ 'Donut': 0.23, 'Random': 0.03, 'Scratch': 0.03,
220
+ 'Loc': 0.10, 'Near-full': 0.01
221
+ }
222
+
223
+ pred = predict_material_needs(trained_model, feat_cols,
224
+ total_scans=1300, fail_rate=0.97,
225
+ defect_distribution=demo_distribution)
226
+ print(f" Scenario: 1,300 wafers/day, 97% defect rate")
227
+ print(f" Predicted avg waste per wafer: {pred['avg_waste_per_wafer']:.2f}%")
228
+ print(f" Predicted total daily waste: {pred['total_daily_waste']:.1f} equivalent wafers")
229
+ print(f"{'=' * 50}")
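Note on `predict_material_needs` above: each per-type count is computed with `int(...)`, which truncates, so the per-type counts can sum to fewer than `fail_count`. Below is a minimal sketch of one way to avoid that, assuming a largest-remainder split is acceptable here. `allocate_counts` is a hypothetical helper and is not part of this commit.

```python
import math


def allocate_counts(fail_count, distribution):
    """Split fail_count across defect types so the integer counts sum to fail_count.

    Hypothetical helper: `distribution` maps defect type -> share of failed wafers.
    Shares are normalised first, so they do not have to sum to exactly 1.
    """
    total_share = sum(distribution.values())
    if fail_count <= 0 or total_share <= 0:
        return {name: 0 for name in distribution}

    # Exact (fractional) quota for each type after normalising the shares.
    quotas = {name: fail_count * share / total_share
              for name, share in distribution.items()}
    counts = {name: math.floor(q) for name, q in quotas.items()}

    # Hand the wafers lost to flooring to the types with the largest remainders.
    leftover = max(0, fail_count - sum(counts.values()))
    by_remainder = sorted(quotas, key=lambda name: quotas[name] - counts[name],
                          reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts


demo_distribution = {
    'Center': 0.15, 'Edge-Ring': 0.37, 'Edge-Loc': 0.06,
    'Donut': 0.23, 'Random': 0.03, 'Scratch': 0.03,
    'Loc': 0.10, 'Near-full': 0.01,
}
demo_counts = allocate_counts(1261, demo_distribution)
print(demo_counts, sum(demo_counts.values()))
```

Because the shares are normalised before the split, the counts add up to `fail_count` even for the demo distribution above, whose shares sum to 0.98.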
middleware/robot_controller.py ADDED
@@ -0,0 +1,278 @@
1
+ """
2
+ Robotic Control Middleware — Full Production Scan
3
+ Phase 1: Loads Mixed-type Wafer Defect Dataset, runs YOLOv8 on defective wafers.
4
+ Phase 2: Loads ALL passed wafers from WM-811K dataset (direct insert, no YOLO needed).
5
+ """
6
+
7
+ import os
8
+ import sys
9
+ import pickle
10
+ import sqlite3
11
+ import random
12
+ import cv2
13
+ import time
14
+ import numpy as np
15
+ from datetime import datetime, timedelta
16
+ from ultralytics import YOLO
17
+
18
+ # Fix for old Pandas architecture in WM-811K pickle
19
+ import pandas.core.indexes
20
+ sys.modules['pandas.indexes'] = pandas.core.indexes
21
+ import pandas as pd
22
+
23
+ # --- CONFIGURATION ---
24
+ NPZ_PATH = os.path.expanduser(
25
+ '~/.cache/kagglehub/datasets/co1d7era/mixedtype-wafer-defect-datasets/versions/4/Wafer_Map_Datasets.npz'
26
+ )
27
+ WM811K_PATH = os.path.expanduser(
28
+ '~/.cache/kagglehub/datasets/qingyi/wm811k-wafer-map/versions/1/LSWMD.pkl'
29
+ )
30
+ MODEL_PATH = 'middleware/best.pt'
31
+ DB_PATH = os.path.join('middleware', 'wafer_control.db')
32
+
33
+ # Defect names matching the 8-dim one-hot encoding order in the dataset
34
+ DEFECT_NAMES = ['Center', 'Donut', 'Edge-Loc', 'Edge-Ring', 'Loc', 'Near-full', 'Random', 'Scratch']
35
+
36
+
37
+ def setup_database():
38
+ """Creates a fresh wafer_logs table with ground_truth column."""
39
+ os.makedirs('middleware', exist_ok=True)
40
+ conn = sqlite3.connect(DB_PATH)
41
+ cursor = conn.cursor()
42
+
43
+ cursor.execute('DROP TABLE IF EXISTS wafer_logs')
44
+ cursor.execute('''
45
+ CREATE TABLE wafer_logs (
46
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
47
+ wafer_id TEXT,
48
+ batch_id TEXT,
49
+ scan_time TEXT,
50
+ status TEXT,
51
+ ground_truth TEXT,
52
+ defect_type TEXT,
53
+ action TEXT,
54
+ confidence REAL,
55
+ roi_coordinates TEXT,
56
+ defect_area_px INTEGER,
57
+ material_wasted_pct REAL
58
+ )
59
+ ''')
60
+ conn.commit()
61
+ return conn
62
+
63
+
64
+ def decode_label(one_hot):
65
+ """Convert 8-dim one-hot label to human-readable defect string."""
66
+ active = np.where(one_hot == 1)[0]
67
+ if len(active) == 0:
68
+ return 'Normal'
69
+ return '+'.join([DEFECT_NAMES[i] for i in active])
70
+
71
+
72
+ def wafer_to_image(wafer_map):
73
+ """Convert a 52x52 wafer map array to a 3-channel BGR image for YOLOv8."""
74
+ img = np.zeros(wafer_map.shape, dtype=np.uint8)
75
+ img[wafer_map == 1] = 127 # Normal die → gray
76
+ img[wafer_map == 2] = 255 # Broken die → white
77
+ img[wafer_map == 3] = 255 # Treat 3 as defect too (rare edge artifact)
78
+ # YOLOv8 expects 3-channel (BGR) images
79
+ img_bgr = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
80
+ return img_bgr
81
+
82
+
83
+ def compute_defect_area(coords):
84
+ """Calculate bounding box area in pixels from [x1, y1, x2, y2]."""
85
+ if not coords or len(coords) != 4:
86
+ return 0
87
+ x1, y1, x2, y2 = coords
88
+ return max(0, (x2 - x1) * (y2 - y1))
89
+
90
+
91
+ def run_production_scan(conn, model, wafer_maps, labels, batch_id, start_time):
92
+ """Process all wafers: YOLO inference on defective, direct insert for normals."""
93
+ cursor = conn.cursor()
94
+ total = len(wafer_maps)
95
+
96
+ for i in range(total):
97
+ wafer_id = f"wafer_{i}"
98
+ ground_truth = decode_label(labels[i])
99
+
100
+ # Distribute realistic timestamps with Gaussian noise to create natural defect spikes
101
+ base_day = (i / total) * 30
102
+ noisy_day = base_day + random.gauss(0, 4) # High variance for defects
103
+ days_offset = int(max(0, min(29, noisy_day)))
104
+ scan_time = start_time + timedelta(
105
+ days=days_offset,
106
+ seconds=random.randint(0, 68)
107
+ )
108
+ scan_time_str = scan_time.strftime("%Y-%m-%d %H:%M:%S")
109
+
110
+ if ground_truth == 'Normal':
111
+ # PASS wafer — no YOLO needed
112
+ cursor.execute('''
113
+ INSERT INTO wafer_logs
114
+ (wafer_id, batch_id, scan_time, status, ground_truth, defect_type, action, confidence, roi_coordinates, defect_area_px, material_wasted_pct)
115
+ VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
116
+ ''', (wafer_id, batch_id, scan_time_str, "PASS", ground_truth, "None", "ROUTE_TO_ASSEMBLY", 1.0, "[]", 0, 0.0))
117
+ else:
118
+ # Defective wafer — convert to image and run YOLO
119
+ img = wafer_to_image(wafer_maps[i])
120
+ wafer_area_px = img.shape[0] * img.shape[1] # 52*52 = 2704
121
+
122
+ results = model.predict(source=img, conf=0.25, verbose=False)
123
+ boxes = results[0].boxes
124
+
125
+ if len(boxes) > 0:
126
+ box = boxes[0]
127
+ class_id = int(box.cls[0].item())
128
+ defect_type = model.names[class_id]
129
+ confidence = round(box.conf[0].item(), 2)
130
+
131
+ x1, y1, x2, y2 = [int(x) for x in box.xyxy[0].tolist()]
132
+ coords = [x1, y1, x2, y2]
133
+
134
+ status = "FAIL"
135
+ action = "ROUTE_TO_SCRAP" if defect_type in ["Center", "Near-full"] else "MOVE_TO_MICRO_STAGE"
136
+ else:
137
+ # YOLO didn't detect anything (could be mixed pattern it can't see)
138
+ status = "FAIL"
139
+ defect_type = "Undetected"
140
+ action = "MOVE_TO_MICRO_STAGE"
141
+ confidence = 0.0
142
+ coords = []
143
+
144
+ defect_area = compute_defect_area(coords)
145
+ material_wasted_pct = round((defect_area / wafer_area_px) * 100, 2) if defect_area > 0 else 0.0
146
+
147
+ cursor.execute('''
148
+ INSERT INTO wafer_logs
149
+ (wafer_id, batch_id, scan_time, status, ground_truth, defect_type, action, confidence, roi_coordinates, defect_area_px, material_wasted_pct)
150
+ VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
151
+ ''', (wafer_id, batch_id, scan_time_str, status, ground_truth, defect_type, action, confidence, str(coords), defect_area, material_wasted_pct))
152
+
153
+ # Commit in batches
154
+ if (i + 1) % 500 == 0:
155
+ conn.commit()
156
+ print(f" Processed {i + 1}/{total} wafers...")
157
+
158
+ conn.commit()
159
+
160
+
161
+ def insert_wm811k_passed(conn, batch_id, start_time):
162
+ """Load ALL passed wafers from WM-811K and insert directly into DB."""
163
+ print("Loading WM-811K dataset...")
164
+ with open(WM811K_PATH, 'rb') as f:
165
+ wm_df = pickle.load(f, encoding='latin1')
166
+
167
+ wm_df['failure_class'] = wm_df['failureType'].apply(lambda x: x[0][0] if len(x) > 0 else 'None')
168
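+ passed = wm_df[(wm_df['failure_class'] == 'None') | (wm_df['failure_class'] == 'none')]  # 'None' = empty failureType (set by the lambda above), 'none' = labeled defect-free; both are treated as passed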
169
+ passed = passed[passed['waferMap'].apply(lambda x: isinstance(x, np.ndarray) and x.size > 0)]
170
+
171
+ total = len(passed)
172
+ print(f" Found {total:,} passed wafers in WM-811K")
173
+ print(f" Inserting all into database...")
174
+
175
+ cursor = conn.cursor()
176
+ rows = []
177
+ for i, (index, row) in enumerate(passed.iterrows()):
178
+ # Stable production schedule with low variance for normal wafers
179
+ base_day = (i / total) * 30
180
+ noisy_day = base_day + random.gauss(0, 0.5)
181
+ days_offset = int(max(0, min(29, noisy_day)))
182
+ scan_time = start_time + timedelta(
183
+ days=days_offset,
184
+ seconds=random.randint(0, 3)
185
+ )
186
+ rows.append((
187
+ f"wm811k_{index}", batch_id, scan_time.strftime("%Y-%m-%d %H:%M:%S"),
188
+ "PASS", "Normal", "None", "ROUTE_TO_ASSEMBLY", 1.0, "[]", 0, 0.0
189
+ ))
190
+
191
+ if len(rows) >= 10000:
192
+ cursor.executemany('''
193
+ INSERT INTO wafer_logs
194
+ (wafer_id, batch_id, scan_time, status, ground_truth, defect_type, action, confidence, roi_coordinates, defect_area_px, material_wasted_pct)
195
+ VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
196
+ ''', rows)
197
+ conn.commit()
198
+ rows = []
199
+ print(f" Inserted {i + 1:,}/{total:,} passed wafers...")
200
+
201
+ if rows:
202
+ cursor.executemany('''
203
+ INSERT INTO wafer_logs
204
+ (wafer_id, batch_id, scan_time, status, ground_truth, defect_type, action, confidence, roi_coordinates, defect_area_px, material_wasted_pct)
205
+ VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
206
+ ''', rows)
207
+ conn.commit()
208
+
209
+ print(f" Done! Inserted {total:,} passed wafers.")
210
+ return total
211
+
212
+
213
+ if __name__ == '__main__':
214
+ print("=" * 60)
215
+ print(" ROBOTIC CONTROL MIDDLEWARE — HYBRID PRODUCTION SCAN")
216
+ print("=" * 60)
217
+
218
+ # 1. Load Mixed-type dataset
219
+ print("\nPhase 1: Loading Mixed-type Wafer Defect Dataset...")
220
+ data = np.load(NPZ_PATH)
221
+ X = data['arr_0']
222
+ Y = data['arr_1']
223
+
224
+ normals_mixed = sum(1 for y in Y if np.sum(y) == 0)
225
+ defective = len(X) - normals_mixed
226
+
227
+ print(f" Mixed-type: {len(X):,} total ({defective:,} defective + {normals_mixed:,} normal)")
228
+ print(f" WM-811K: ~786K passed wafers")
229
+
230
+ # 2. Setup
231
+ db_connection = setup_database()
232
+ wafer_model = YOLO(MODEL_PATH)
233
+
234
+ batch_id = f"BATCH_{datetime.now().strftime('%Y%m%d_%H%M%S')}"
235
+ start_time = datetime.now() - timedelta(days=30)
236
+
237
+ print(f"\n Batch ID: {batch_id}")
238
+ print(f" Scan window: {start_time.strftime('%Y-%m-%d')} → {datetime.now().strftime('%Y-%m-%d')}")
239
+
240
+ try:
241
+ # PHASE 1: YOLOv8 on Mixed-type defective wafers
242
+ print(f"\n{'=' * 60}")
243
+ print(f" PHASE 1: YOLOv8 Inference ({len(X):,} Mixed-type wafers)")
244
+ print(f"{'=' * 60}\n")
245
+
246
+ t0 = time.time()
247
+ run_production_scan(db_connection, wafer_model, X, Y, batch_id, start_time)
248
+ t1 = time.time()
249
+ print(f"\n Phase 1 complete: {t1 - t0:.1f}s")
250
+
251
+ # PHASE 2: All passed wafers from WM-811K
252
+ print(f"\n{'=' * 60}")
253
+ print(f" PHASE 2: WM-811K Passed Wafers (all ~786K)")
254
+ print(f"{'=' * 60}\n")
255
+
256
+ passed_count = insert_wm811k_passed(db_connection, batch_id, start_time)
257
+ t2 = time.time()
258
+
259
+ total = len(X) + passed_count
260
+ print(f"\n{'=' * 60}")
261
+ print(f" SCAN COMPLETE")
262
+ print(f" Defective (Mixed-type): {defective:,}")
263
+ print(f" Normal (Mixed-type): {normals_mixed:,}")
264
+ print(f" Passed (WM-811K): {passed_count:,}")
265
+ print(f" Total records: {total:,}")
266
+ print(f" Pass rate: {(normals_mixed + passed_count) / total * 100:.1f}%")
267
+ print(f" Time elapsed: {t2 - t0:.1f}s")
268
+ print(f" Database: {DB_PATH}")
269
+ print(f"{'=' * 60}")
270
+
271
+ except Exception as e:
272
+ print(f"\nError during scan: {e}")
273
+ import traceback
274
+ traceback.print_exc()
275
+
276
+ finally:
277
+ db_connection.close()
278
+ print("Database connection closed.")
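Note on `run_production_scan` above: the bounding box is stored as `str(coords)`, so `roi_coordinates` holds text such as `[10, 12, 30, 40]` or `[]`. Below is a minimal sketch of reading it back and rechecking `material_wasted_pct`, assuming the 52x52 wafer size noted in the source. `bbox_waste_pct` and the in-memory table are illustrative only and are not part of this commit.

```python
import json
import sqlite3


def bbox_waste_pct(roi_text, wafer_area_px=52 * 52):
    """Recompute waste % from a stored roi_coordinates string (hypothetical helper)."""
    coords = json.loads(roi_text)  # str() of a list of ints is also valid JSON
    if len(coords) != 4:
        return 0.0
    x1, y1, x2, y2 = coords
    area = max(0, (x2 - x1) * (y2 - y1))
    return round(area / wafer_area_px * 100, 2)


# Illustrative in-memory table with only the columns this check needs.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE wafer_logs (wafer_id TEXT, roi_coordinates TEXT, material_wasted_pct REAL)"
)
conn.executemany(
    "INSERT INTO wafer_logs VALUES (?, ?, ?)",
    [("wafer_0", "[]", 0.0), ("wafer_1", "[10, 12, 30, 40]", 20.71)],
)

# Collect rows whose stored percentage disagrees with the recomputed one.
mismatches = [
    wafer_id
    for wafer_id, roi_text, stored_pct in conn.execute(
        "SELECT wafer_id, roi_coordinates, material_wasted_pct FROM wafer_logs"
    )
    if abs(bbox_waste_pct(roi_text) - stored_pct) > 0.01
]
conn.close()
print(mismatches)
```

Parsing with `json.loads` rather than `eval` keeps the check safe to run against a database file of unknown origin.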
middleware/wafer_control.db ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f38b7b11d6ac3e7f8292a4bc48c23de272ae29c0c728ef62f62a42a4bc42406b
3
+ size 90316800
notebooks/01_data_exploration.ipynb ADDED
@@ -0,0 +1,399 @@
1
+ {
2
+ "cells": [
3
+ {
4
+ "cell_type": "code",
5
+ "execution_count": null,
6
+ "id": "03c79b89",
7
+ "metadata": {},
8
+ "outputs": [],
9
+ "source": []
10
+ },
11
+ {
12
+ "cell_type": "code",
13
+ "execution_count": 1,
14
+ "id": "e517de18",
15
+ "metadata": {},
16
+ "outputs": [
17
+ {
18
+ "ename": "ModuleNotFoundError",
19
+ "evalue": "No module named 'kagglehub'",
20
+ "output_type": "error",
21
+ "traceback": [
22
+ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
23
+ "\u001b[1;31mModuleNotFoundError\u001b[0m Traceback (most recent call last)",
24
+ "Cell \u001b[1;32mIn[1], line 2\u001b[0m\n\u001b[0;32m 1\u001b[0m \u001b[38;5;28;01mimport\u001b[39;00m\u001b[38;5;250m \u001b[39m\u001b[38;5;21;01msys\u001b[39;00m\n\u001b[1;32m----> 2\u001b[0m \u001b[38;5;28;01mimport\u001b[39;00m\u001b[38;5;250m \u001b[39m\u001b[38;5;21;01mkagglehub\u001b[39;00m\n\u001b[0;32m 3\u001b[0m \u001b[38;5;28;01mimport\u001b[39;00m\u001b[38;5;250m \u001b[39m\u001b[38;5;21;01mpandas\u001b[39;00m\u001b[38;5;250m \u001b[39m\u001b[38;5;28;01mas\u001b[39;00m\u001b[38;5;250m \u001b[39m\u001b[38;5;21;01mpd\u001b[39;00m\n\u001b[0;32m 4\u001b[0m \u001b[38;5;28;01mimport\u001b[39;00m\u001b[38;5;250m \u001b[39m\u001b[38;5;21;01mnumpy\u001b[39;00m\u001b[38;5;250m \u001b[39m\u001b[38;5;28;01mas\u001b[39;00m\u001b[38;5;250m \u001b[39m\u001b[38;5;21;01mnp\u001b[39;00m\n",
25
+ "\u001b[1;31mModuleNotFoundError\u001b[0m: No module named 'kagglehub'"
26
+ ]
27
+ }
28
+ ],
29
+ "source": [
30
+ "import sys\n",
31
+ "import kagglehub\n",
32
+ "import pandas as pd\n",
33
+ "import numpy as np\n",
34
+ "import matplotlib.pyplot as plt\n",
35
+ "import os\n",
36
+ "import pickle"
37
+ ]
38
+ },
39
+ {
40
+ "cell_type": "code",
41
+ "execution_count": 2,
42
+ "id": "f7ac0941",
43
+ "metadata": {},
44
+ "outputs": [],
45
+ "source": [
46
+ "import pandas.core.indexes\n",
47
+ "sys.modules['pandas.indexes'] = pandas.core.indexes"
48
+ ]
49
+ },
50
+ {
51
+ "cell_type": "code",
52
+ "execution_count": 3,
53
+ "id": "82aec67e",
54
+ "metadata": {},
55
+ "outputs": [
56
+ {
57
+ "name": "stdout",
58
+ "output_type": "stream",
59
+ "text": [
60
+ "Downloading dataset from Kaggle...\n"
61
+ ]
62
+ },
63
+ {
64
+ "ename": "NameError",
65
+ "evalue": "name 'kagglehub' is not defined",
66
+ "output_type": "error",
67
+ "traceback": [
68
+ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
69
+ "\u001b[1;31mNameError\u001b[0m Traceback (most recent call last)",
70
+ "Cell \u001b[1;32mIn[3], line 2\u001b[0m\n\u001b[0;32m 1\u001b[0m \u001b[38;5;28mprint\u001b[39m(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mDownloading dataset from Kaggle...\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m----> 2\u001b[0m path \u001b[38;5;241m=\u001b[39m \u001b[43mkagglehub\u001b[49m\u001b[38;5;241m.\u001b[39mdataset_download(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mqingyi/wm811k-wafer-map\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[0;32m 3\u001b[0m \u001b[38;5;28mprint\u001b[39m(\u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mDataset downloaded to: \u001b[39m\u001b[38;5;132;01m{\u001b[39;00mpath\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m\"\u001b[39m)\n",
71
+ "\u001b[1;31mNameError\u001b[0m: name 'kagglehub' is not defined"
72
+ ]
73
+ }
74
+ ],
75
+ "source": [
76
+ "print(\"Downloading dataset from Kaggle...\")\n",
77
+ "path = kagglehub.dataset_download(\"qingyi/wm811k-wafer-map\")\n",
78
+ "print(f\"Dataset downloaded to: {path}\")"
79
+ ]
80
+ },
81
+ {
82
+ "cell_type": "code",
83
+ "execution_count": 9,
84
+ "id": "4dea4d86",
85
+ "metadata": {},
86
+ "outputs": [
87
+ {
88
+ "name": "stdout",
89
+ "output_type": "stream",
90
+ "text": [
91
+ "Loading dataset with latin1 encoding (this might take a minute)...\n"
92
+ ]
93
+ },
94
+ {
95
+ "name": "stderr",
96
+ "output_type": "stream",
97
+ "text": [
98
+ "/var/folders/_b/4pz7ygss6y7c564mcqz3grmw0000gn/T/ipykernel_3677/1757211552.py:6: VisibleDeprecationWarning: dtype(): align should be passed as Python or NumPy boolean but got `align=0`. Did you mean to pass a tuple to create a subarray type? (Deprecated NumPy 2.4)\n",
99
+ " df = pickle.load(f, encoding='latin1')\n"
100
+ ]
101
+ },
102
+ {
103
+ "name": "stdout",
104
+ "output_type": "stream",
105
+ "text": [
106
+ "Success! Total wafers in dataset: 811457\n"
107
+ ]
108
+ }
109
+ ],
110
+ "source": [
111
+ "# Build the path to LSWMD.pkl inside the downloaded dataset folder and load it\n",
112
+ "file_path = os.path.join(path, 'LSWMD.pkl')\n",
113
+ "\n",
114
+ "print(\"Loading dataset with latin1 encoding (this might take a minute)...\")\n",
115
+ "with open(file_path, 'rb') as f:\n",
116
+ " df = pickle.load(f, encoding='latin1')\n",
117
+ "print(f\"Success! Total wafers in dataset: {len(df)}\")"
118
+ ]
119
+ },
120
+ {
121
+ "cell_type": "code",
122
+ "execution_count": 14,
123
+ "id": "bd97a8e5",
124
+ "metadata": {},
125
+ "outputs": [
126
+ {
127
+ "name": "stdout",
128
+ "output_type": "stream",
129
+ "text": [
130
+ "Data cleaned! Now you can view it.\n"
131
+ ]
132
+ }
133
+ ],
134
+ "source": [
135
+ "# 1. Clean the nested failure type column to create 'failure_class'\n",
136
+ "df['failure_class'] = df['failureType'].apply(lambda x: x[0][0] if len(x) > 0 else 'None')\n",
137
+ "\n",
138
+ "# 2. Drop wafers with no label ('None') or no defect pattern ('none') to create the 'defective_wafers' subset\n",
139
+ "defective_wafers = df[(df['failure_class'] != 'None') & (df['failure_class'] != 'none')]\n",
140
+ "\n",
141
+ "print(\"Data cleaned! Now you can view it.\")"
142
+ ]
143
+ },
144
+ {
145
+ "cell_type": "code",
146
+ "execution_count": null,
147
+ "id": "780e3a68",
148
+ "metadata": {},
149
+ "outputs": [
150
+ {
151
+ "name": "stdout",
152
+ "output_type": "stream",
153
+ "text": [
154
+ "Top 5 rows of the dataset:\n"
155
+ ]
156
+ },
157
+ {
158
+ "data": {
159
+ "text/html": [
160
+ "<div>\n",
161
+ "<style scoped>\n",
162
+ " .dataframe tbody tr th:only-of-type {\n",
163
+ " vertical-align: middle;\n",
164
+ " }\n",
165
+ "\n",
166
+ " .dataframe tbody tr th {\n",
167
+ " vertical-align: top;\n",
168
+ " }\n",
169
+ "\n",
170
+ " .dataframe thead th {\n",
171
+ " text-align: right;\n",
172
+ " }\n",
173
+ "</style>\n",
174
+ "<table border=\"1\" class=\"dataframe\">\n",
175
+ " <thead>\n",
176
+ " <tr style=\"text-align: right;\">\n",
177
+ " <th></th>\n",
178
+ " <th>waferMap</th>\n",
179
+ " <th>dieSize</th>\n",
180
+ " <th>failure_class</th>\n",
181
+ " </tr>\n",
182
+ " </thead>\n",
183
+ " <tbody>\n",
184
+ " <tr>\n",
185
+ " <th>0</th>\n",
186
+ " <td>[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,...</td>\n",
187
+ " <td>1683.0</td>\n",
188
+ " <td>none</td>\n",
189
+ " </tr>\n",
190
+ " <tr>\n",
191
+ " <th>1</th>\n",
192
+ " <td>[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,...</td>\n",
193
+ " <td>1683.0</td>\n",
194
+ " <td>none</td>\n",
195
+ " </tr>\n",
196
+ " <tr>\n",
197
+ " <th>2</th>\n",
198
+ " <td>[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,...</td>\n",
199
+ " <td>1683.0</td>\n",
200
+ " <td>none</td>\n",
201
+ " </tr>\n",
202
+ " <tr>\n",
203
+ " <th>3</th>\n",
204
+ " <td>[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,...</td>\n",
205
+ " <td>1683.0</td>\n",
206
+ " <td>none</td>\n",
207
+ " </tr>\n",
208
+ " <tr>\n",
209
+ " <th>4</th>\n",
210
+ " <td>[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,...</td>\n",
211
+ " <td>1683.0</td>\n",
212
+ " <td>none</td>\n",
213
+ " </tr>\n",
214
+ " </tbody>\n",
215
+ "</table>\n",
216
+ "</div>"
217
+ ],
218
+ "text/plain": [
219
+ " waferMap dieSize failure_class\n",
220
+ "0 [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,... 1683.0 none\n",
221
+ "1 [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,... 1683.0 none\n",
222
+ "2 [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,... 1683.0 none\n",
223
+ "3 [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,... 1683.0 none\n",
224
+ "4 [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,... 1683.0 none"
225
+ ]
226
+ },
227
+ "metadata": {},
228
+ "output_type": "display_data"
229
+ },
230
+ {
231
+ "name": "stdout",
232
+ "output_type": "stream",
233
+ "text": [
234
+ "\n",
235
+ "Shape of the first defective wafer array: (45, 48)\n",
236
+ "\n",
237
+ "The raw 2D array data (Notice the 0s, 1s, and 2s):\n",
238
+ "[[0 0 0 ... 0 0 0]\n",
239
+ " [0 0 0 ... 0 0 0]\n",
240
+ " [0 0 0 ... 0 0 0]\n",
241
+ " ...\n",
242
+ " [0 0 0 ... 0 0 0]\n",
243
+ " [0 0 0 ... 0 0 0]\n",
244
+ " [0 0 0 ... 0 0 0]]\n"
245
+ ]
246
+ }
247
+ ],
248
+ "source": [
249
+ "# First look at the data\n",
250
+ "print(\"Top 5 rows of the dataset:\")\n",
251
+ "display(df[['waferMap', 'dieSize', 'failure_class']].head())\n",
252
+ "\n",
253
+ "# Look at exactly what the 2D array looks like for the first defective wafer\n",
254
+ "first_defect_index = defective_wafers.index[0]\n",
255
+ "first_defect_array = defective_wafers.loc[first_defect_index, 'waferMap']\n",
256
+ "\n",
257
+ "print(\"\\nShape of the first defective wafer array:\", first_defect_array.shape)\n",
258
+ "print(\"\\nThe raw 2D array data (Notice the 0s, 1s, and 2s):\")\n",
259
+ "print(first_defect_array)"
260
+ ]
261
+ },
262
+ {
263
+ "cell_type": "code",
264
+ "execution_count": 17,
265
+ "id": "899f308a",
266
+ "metadata": {},
267
+ "outputs": [
268
+ {
269
+ "name": "stdout",
270
+ "output_type": "stream",
271
+ "text": [
272
+ "The FULL raw 2D array data:\n",
273
+ "[[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 1 2 1 1 2 0 0 0 0 0 0 0 0 0\n",
274
+ " 0 0 0 0 0 0 0 0 0 0 0 0]\n",
275
+ " [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0\n",
276
+ " 0 0 0 0 0 0 0 0 0 0 0 0]\n",
277
+ " [0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0\n",
278
+ " 0 0 0 0 0 0 0 0 0 0 0 0]\n",
279
+ " [0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1\n",
280
+ " 0 0 0 0 0 0 0 0 0 0 0 0]\n",
281
+ " [0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\n",
282
+ " 1 2 0 0 0 0 0 0 0 0 0 0]\n",
283
+ " [0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\n",
284
+ " 1 1 1 0 0 0 0 0 0 0 0 0]\n",
285
+ " [0 0 0 0 0 0 0 2 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2\n",
286
+ " 1 1 1 2 0 0 0 0 0 0 0 0]\n",
287
+ " [0 0 0 0 0 0 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\n",
288
+ " 1 1 1 1 1 0 0 0 0 0 0 0]\n",
289
+ " [0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 2 1 1 1 1 1\n",
290
+ " 1 2 1 1 1 1 0 0 0 0 0 0]\n",
291
+ " [0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 2 1\n",
292
+ " 1 1 1 1 1 1 1 0 0 0 0 0]\n",
293
+ " [0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2\n",
294
+ " 1 1 1 1 1 1 2 1 0 0 0 0]\n",
295
+ " [0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 2 2 1 1 1 1 1 1\n",
296
+ " 1 1 1 1 2 2 1 1 0 0 0 0]\n",
297
+ " [0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\n",
298
+ " 1 2 2 2 2 1 1 2 1 0 0 0]\n",
299
+ " [0 0 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1\n",
300
+ " 2 2 2 2 2 1 1 1 1 2 0 0]\n",
301
+ " [0 0 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 2 1 1 2 1 1 1 1 1 1 1 1 1 2 2\n",
302
+ " 2 2 2 2 2 2 1 1 1 2 0 0]\n",
303
+ " [0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2\n",
304
+ " 2 2 2 2 2 2 2 2 2 1 0 0]\n",
305
+ " [0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2\n",
306
+ " 2 2 2 2 1 1 1 1 1 1 1 0]\n",
307
+ " [0 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2\n",
308
+ " 2 2 2 2 2 2 2 2 1 1 1 0]\n",
309
+ " [1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1\n",
310
+ " 2 2 2 2 2 2 2 1 2 1 1 0]\n",
311
+ " [1 1 1 1 2 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1\n",
312
+ " 1 1 1 2 1 1 1 1 1 1 1 2]\n",
313
+ " [1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\n",
314
+ " 2 1 1 1 1 1 1 1 1 1 1 2]\n",
315
+ " [1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 2 1\n",
316
+ " 1 1 1 1 1 1 1 1 1 1 1 2]\n",
317
+ " [2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 1 1 1 1 1 1 1 1 1 1 1\n",
318
+ " 1 1 1 2 1 1 1 2 1 1 2 2]\n",
319
+ " [1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1\n",
320
+ " 1 1 1 1 1 1 1 2 1 1 1 2]\n",
321
+ " [2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 2 1 1 1 1 1 1 1 1 1\n",
322
+ " 1 1 1 2 1 1 1 1 1 1 1 2]\n",
323
+ " [1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\n",
324
+ " 1 1 2 1 2 1 1 1 1 1 1 2]\n",
325
+ " [2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\n",
326
+ " 1 1 1 1 1 1 1 1 1 1 1 0]\n",
327
+ " [2 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\n",
328
+ " 1 1 1 1 1 1 1 1 1 1 2 0]\n",
329
+ " [0 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2\n",
330
+ " 1 1 1 1 1 1 1 1 1 1 1 0]\n",
331
+ " [0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1\n",
332
+ " 1 1 1 1 1 1 1 1 1 1 1 0]\n",
333
+ " [0 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\n",
334
+ " 1 1 1 1 1 1 1 1 1 1 0 0]\n",
335
+ " [0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1\n",
336
+ " 1 1 1 1 1 1 1 1 1 1 0 0]\n",
337
+ " [0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\n",
338
+ " 1 1 1 1 1 1 1 1 1 0 0 0]\n",
339
+ " [0 0 0 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\n",
340
+ " 1 1 1 1 1 1 1 1 1 0 0 0]\n",
341
+ " [0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\n",
342
+ " 1 1 1 1 1 1 1 1 0 0 0 0]\n",
343
+ " [0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\n",
344
+ " 1 1 1 1 1 1 1 0 0 0 0 0]\n",
345
+ " [0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2\n",
346
+ " 1 1 1 1 1 1 1 0 0 0 0 0]\n",
347
+ " [0 0 0 0 0 0 2 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\n",
348
+ " 1 1 1 2 1 1 0 0 0 0 0 0]\n",
349
+ " [0 0 0 0 0 0 0 2 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1\n",
350
+ " 1 1 2 1 1 0 0 0 0 0 0 0]\n",
351
+ " [0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\n",
352
+ " 1 1 1 1 0 0 0 0 0 0 0 0]\n",
353
+ " [0 0 0 0 0 0 0 0 0 2 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 2 1 1 1 1\n",
354
+ " 2 1 0 0 0 0 0 0 0 0 0 0]\n",
355
+ " [0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1\n",
356
+ " 2 0 0 0 0 0 0 0 0 0 0 0]\n",
357
+ " [0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0\n",
358
+ " 0 0 0 0 0 0 0 0 0 0 0 0]\n",
359
+ " [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0\n",
360
+ " 0 0 0 0 0 0 0 0 0 0 0 0]\n",
361
+ " [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 2 0 0 0 0 0 0\n",
362
+ " 0 0 0 0 0 0 0 0 0 0 0 0]]\n"
363
+ ]
364
+ }
365
+ ],
366
+ "source": [
367
+ "# Force NumPy to print the entire array without truncation\n",
368
+ "np.set_printoptions(threshold=10000)\n",
369
+ "\n",
370
+ "print(\"The FULL raw 2D array data:\")\n",
371
+ "print(first_defect_array)\n",
372
+ "\n",
373
+ "# Reset it back to default right after so future arrays don't break your terminal\n",
374
+ "np.set_printoptions(threshold=1000)"
375
+ ]
376
+ }
377
+ ],
378
+ "metadata": {
379
+ "kernelspec": {
380
+ "display_name": "wafer_gpu",
381
+ "language": "python",
382
+ "name": "python3"
383
+ },
384
+ "language_info": {
385
+ "codemirror_mode": {
386
+ "name": "ipython",
387
+ "version": 3
388
+ },
389
+ "file_extension": ".py",
390
+ "mimetype": "text/x-python",
391
+ "name": "python",
392
+ "nbconvert_exporter": "python",
393
+ "pygments_lexer": "ipython3",
394
+ "version": "3.9.25"
395
+ }
396
+ },
397
+ "nbformat": 4,
398
+ "nbformat_minor": 5
399
+ }
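The notebook's last cells print a raw wafer map and point out its 0s, 1s, and 2s. As a minimal sketch, assuming the commonly described WM-811K encoding (0 = no die, 1 = passing die, 2 = failing die; the notebook itself does not state this), the per-code counts and the failing-die fraction could be computed as follows. `summarize_wafer_map` is a hypothetical helper, and the small grid is made-up illustrative data, not taken from the dataset.

```python
from collections import Counter


def summarize_wafer_map(wafer_map):
    """Count die codes in a 2D wafer map given as nested sequences of ints.

    Assumed encoding (not confirmed by the notebook):
    0 = no die, 1 = passing die, 2 = failing die.
    """
    counts = Counter(int(value) for row in wafer_map for value in row)
    total_dies = counts[1] + counts[2]
    # Avoid division by zero for an empty or all-background map
    fail_fraction = counts[2] / total_dies if total_dies else 0.0
    return {
        "no_die": counts[0],
        "pass": counts[1],
        "fail": counts[2],
        "fail_fraction": fail_fraction,
    }


# Made-up 3x4 example grid, for illustration only
example_map = [
    [0, 1, 1, 0],
    [1, 2, 1, 1],
    [0, 1, 2, 0],
]
summary = summarize_wafer_map(example_map)
print(summary)
```

For a NumPy array such as `first_defect_array`, passing `first_defect_array.tolist()` should work with this helper.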
requirements.txt ADDED
@@ -0,0 +1,16 @@
+ # Data handling & EDA
+ kagglehub
+ pandas
+ numpy
+ matplotlib
+ jupyter
+
+ # Computer Vision & AI Tools
+ opencv-python
+ ultralytics
+ scikit-learn
+
+ # Web Application & AI Chatbot
+ fastapi
+ uvicorn[standard]
+ google-genai
src/__init__.py ADDED
File without changes
src/batch_inference.py ADDED
@@ -0,0 +1,19 @@
+ from ultralytics import YOLO
+
+ def test_all_validation_images():
+     print("Loading the custom-trained wafer model...")
+     # Path to the trained weights tracked in this repo
+     model_path = 'middleware/best.pt'
+     model = YOLO(model_path)
+
+     print("Running inference on all validation images...")
+     # Pass the whole validation folder as the source instead of a single image
+     val_dir = 'data/yolo_dataset/images/val'
+
+     # predict() processes every image in the folder; save=True writes annotated copies to disk
+     results = model.predict(source=val_dir, save=True, conf=0.25)
+
+     print(f"\nInference complete on {len(results)} images. Look in the newest 'predict' folder for the annotated outputs.")
+
+ if __name__ == '__main__':
+     test_all_validation_images()
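The script above assumes that both `middleware/best.pt` and `data/yolo_dataset/images/val` exist. A minimal sketch of a pre-flight check that could run before loading the model is shown below. `count_validation_images` and the `IMAGE_SUFFIXES` set are hypothetical names introduced here for illustration, not part of the repository, and the demo uses empty placeholder files in a temporary directory rather than real weights or images.

```python
import tempfile
from pathlib import Path

# Assumed set of image extensions; adjust to whatever the dataset actually contains
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def count_validation_images(model_path, val_dir):
    """Check that both paths exist and return the number of image files in val_dir."""
    model_file = Path(model_path)
    image_dir = Path(val_dir)
    if not model_file.is_file():
        raise FileNotFoundError(f"Model weights not found: {model_file}")
    if not image_dir.is_dir():
        raise FileNotFoundError(f"Validation folder not found: {image_dir}")
    return sum(
        1
        for p in image_dir.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


# Demo with placeholder files (not real weights or wafer images)
with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    weights = tmp_path / "best.pt"
    weights.write_bytes(b"")
    images = tmp_path / "val"
    images.mkdir()
    (images / "a.png").write_bytes(b"")
    (images / "b.JPG").write_bytes(b"")
    (images / "notes.txt").write_text("not an image")
    demo_count = count_validation_images(weights, images)

print(demo_count)
```

Failing early with a clear message avoids loading the model only to find that the validation folder is missing or empty.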