	# ALWAS ML Models — Analog Layout Workflow Automation System

	> Four production-ready ML models for the ALWAS system. They replace the Groq LLM API dependency with faster, free, local inference.

	## 🎯 Models

	| Model | Task | Metric | Value |
	|-------|------|--------|-------|
	| Hours Estimator | Predict layout hours from block metadata | R² / MAE | 0.881 / 5.78 h |
	| Complexity Classifier | Classify Low/Medium/High complexity | Accuracy / F1 | 91.7% / 0.917 |
	| Bottleneck Predictor | Detect blocks at risk of getting stuck | Accuracy / F1 | 99.6% / 0.996 |
	| Completion Predictor | Predict remaining hours to completion | R² / MAE | 0.945 / 1.65 h |
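As a reminder of what the two regression metrics in the table measure, here is a minimal standard-library sketch of MAE and R². The sample values are invented for illustration and are not taken from the ALWAS evaluation set.

```python
from statistics import mean


def mae(y_true, y_pred):
    """Mean absolute error, in the same unit as the target (hours here)."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative values only (hypothetical, not ALWAS data).
actual_hours = [10.0, 20.0, 30.0, 40.0]
predicted_hours = [12.0, 18.0, 33.0, 37.0]

print("MAE:", mae(actual_hours, predicted_hours))
print("R2:", r2(actual_hours, predicted_hours))
```

An MAE of 5.78 h therefore means the hours estimate is off by about 5.8 hours on average, while R² describes the share of variance in actual hours that the model explains.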

	## 🏗️ Architecture

	```
	┌───────────────────────────────────────────────────────────────────────────┐
	│                             ALWAS ML Pipeline                             │
	├───────────────────────────────────────────────────────────────────────────┤
	│                                                                           │
	│  Block Created     ──► Hours Estimator (XGBoost)       ──► Est. Hours     │
	│                    ──► Complexity Classifier (XGB+LGB) ──► Class          │
	│                                                                           │
	│  Block In-Progress ──► Bottleneck Predictor            ──► Risk Alert     │
	│                    ──► Completion Predictor            ──► ETA            │
	│                                                                           │
	│  Hourly Cron       ──► Batch Bottleneck Scan           ──► Notifications  │
	│                                                                           │
	└───────────────────────────────────────────────────────────────────────────┘
	```

	## 🚀 Quick Start

	### Python (Direct)

	```python
	import joblib
	import numpy as np

	# Load models
	hours_model = joblib.load('models/hours_estimator.joblib')
	complexity_xgb = joblib.load('models/complexity_xgb.joblib')
	complexity_lgb = joblib.load('models/complexity_lgb.joblib')
	bottleneck_model = joblib.load('models/bottleneck_predictor.joblib')
	completion_model = joblib.load('models/completion_predictor.joblib')

	# Load encoders
	tech_node_encoder = joblib.load('models/tech_node_encoder.joblib')
	block_type_encoder = joblib.load('models/block_type_encoder.joblib')
	```
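The Complexity Classifier is described as an XGBoost + LightGBM ensemble, but this README does not say how the two members are combined. One common choice is soft voting, that is, averaging the class probabilities. The sketch below assumes that scheme; the probability rows are stand-ins for the output of each model's `predict_proba`, and the class order and the `soft_vote` helper are hypothetical.

```python
def soft_vote(proba_a, proba_b, classes, weight_a=0.5):
    """Blend two probability vectors and return (label, blended_probs)."""
    if not (len(proba_a) == len(proba_b) == len(classes)):
        raise ValueError("probability vectors and classes must align")
    blended = [weight_a * a + (1.0 - weight_a) * b
               for a, b in zip(proba_a, proba_b)]
    best = max(range(len(classes)), key=blended.__getitem__)
    return classes[best], dict(zip(classes, blended))


# Assumed class order (alphabetical, as a LabelEncoder would sort them).
classes = ["High", "Low", "Medium"]
xgb_row = [0.70, 0.10, 0.20]  # stand-in for complexity_xgb.predict_proba(X)[0]
lgb_row = [0.50, 0.10, 0.40]  # stand-in for complexity_lgb.predict_proba(X)[0]

label, probs = soft_vote(xgb_row, lgb_row, classes)
print(label, probs)
```

If the training script uses a different combination rule (weighted voting or stacking, for example), only `soft_vote` would need to change.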

	### REST API

	```bash
	# Install
	pip install fastapi uvicorn joblib xgboost lightgbm scikit-learn numpy

	# Run
	MODEL_DIR=./models python inference_server.py

	# Call
	curl -X POST http://localhost:7860/predict/estimate \
	  -H "Content-Type: application/json" \
	  -d '{
	    "block_type": "PLL",
	    "tech_node": "7nm",
	    "priority": "P1-Critical",
	    "transistor_count": 80000,
	    "has_dependencies": true,
	    "num_dependencies": 3,
	    "constraint_complexity": 2.5,
	    "drc_iterations": 4
	  }'
	```

	Response:
	```json
	{
	  "complexity": "High",
	  "estimated_hours": 89.0,
	  "confidence": 0.996,
	  "risk_level": "high",
	  "reasoning": "Advanced 7nm node requires extensive DRC/LVS iterations...",
	  "recommended_drc_iterations": 4,
	  "suggested_engineer_skill_level": "senior",
	  "complexity_probabilities": {"High": 0.996, "Low": 0.0, "Medium": 0.003},
	  "estimated_days": 11.1
	}
	```
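In the sample response, `estimated_days` looks like `estimated_hours` divided by an 8-hour working day (89.0 / 8 ≈ 11.1), and `confidence` equals the largest entry in `complexity_probabilities`. Neither relationship is documented here, so treat both as inferences from this one example. A client could sanity-check them roughly like this (`check_estimate` is a hypothetical helper):

```python
import json

HOURS_PER_DAY = 8.0  # assumption inferred from the sample response


def check_estimate(resp):
    """Return simple consistency flags for a /predict/estimate response."""
    probs = resp["complexity_probabilities"]
    top_class = max(probs, key=probs.get)
    return {
        "days_consistent":
            round(resp["estimated_hours"] / HOURS_PER_DAY, 1) == resp["estimated_days"],
        "top_class_matches": top_class == resp["complexity"],
        "confidence_matches": probs[top_class] == resp["confidence"],
    }


sample = json.loads("""
{
  "complexity": "High",
  "estimated_hours": 89.0,
  "confidence": 0.996,
  "complexity_probabilities": {"High": 0.996, "Low": 0.0, "Medium": 0.003},
  "estimated_days": 11.1
}
""")

checks = check_estimate(sample)
print(checks)
```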

	## 📡 API Endpoints

	| Method | Endpoint | Description |
	|--------|----------|-------------|
	| `POST` | `/predict/estimate` | Complexity & hours estimation (replaces Groq) |
	| `POST` | `/predict/bottleneck` | Bottleneck risk prediction |
	| `POST` | `/predict/completion` | Completion time prediction |
	| `POST` | `/predict/bulk-estimate` | Bulk estimation (up to 200 blocks) |
	| `GET` | `/model/metrics` | Model performance metrics |
	| `GET` | `/model/supported-values` | Supported block types, tech nodes, etc. |
	| `GET` | `/health` | Health check |
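Because `/predict/bulk-estimate` accepts at most 200 blocks per call, larger sets have to be split on the client side. The sketch below chunks a list and builds (without sending) one POST request per chunk using only the standard library. The request body shape `{"blocks": [...]}` is an assumption, since the bulk schema is not shown in this README, and both helper names are hypothetical.

```python
import json
import urllib.request

MAX_BULK = 200  # limit stated in the endpoint table


def chunk_blocks(blocks, size=MAX_BULK):
    """Yield consecutive slices of at most `size` blocks."""
    for start in range(0, len(blocks), size):
        yield blocks[start:start + size]


def build_bulk_request(batch, base_url="http://localhost:7860"):
    """Build (but do not send) a POST request for one batch."""
    body = json.dumps({"blocks": batch}).encode("utf-8")  # hypothetical schema
    return urllib.request.Request(
        f"{base_url}/predict/bulk-estimate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


blocks = [{"block_type": "PLL", "tech_node": "7nm"} for _ in range(450)]
batches = list(chunk_blocks(blocks))
pending = [build_bulk_request(batch) for batch in batches]
print([len(batch) for batch in batches])
```

Each request in `pending` can then be sent with `urllib.request.urlopen(req)` once the server is running.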

	## 🔌 ALWAS Integration

	### Replace Groq API in Express.js

	Before (server/routes/blocks.js):
	```javascript
	// Old: Groq LLM call ($0.002/request, 300ms latency)
	const response = await groq.chat.completions.create({
	  model: "llama-3.3-70b-versatile",
	  messages: [{ role: "user", content: prompt }]
	});
	```

	After (using ALWAS ML API):
	```javascript
	// New: Local ML model (free, <5ms latency)
	const response = await fetch('http://localhost:7860/predict/estimate', {
	  method: 'POST',
	  headers: { 'Content-Type': 'application/json' },
	  body: JSON.stringify({
	    block_type: block.type,
	    tech_node: block.techNode,
	    priority: block.priority,
	    transistor_count: block.transistorCount,
	    has_dependencies: block.dependencies?.length > 0,
	    num_dependencies: block.dependencies?.length || 0,
	    constraint_complexity: block.constraintComplexity || 1.0,
	    drc_iterations: block.drcIterations || 2
	  })
	});
	const estimate = await response.json();
	```

	### Add Bottleneck Scanning to Cron Job

	```javascript
	// In server/cron/bottleneckScanner.js
	const blocks = await Block.find({ status: { $ne: 'Completed' } });

	for (const block of blocks) {
	  const risk = await fetch('http://localhost:7860/predict/bottleneck', {
	    method: 'POST',
	    headers: { 'Content-Type': 'application/json' },
	    body: JSON.stringify({
	      block_type: block.type,
	      tech_node: block.techNode,
	      estimated_hours: block.estimatedHours,
	      hours_logged: block.hoursLogged,
	      current_stage: block.status,
	      days_in_current_stage: daysSinceLastTransition(block),
	      drc_violations_total: block.drcViolations,
	      is_overdue: new Date() > block.dueDate
	    })
	  });
	  const result = await risk.json();

	  if (result.should_alert) {
	    // Create notification for manager
	    await Notification.create({
	      type: 'stuck',
	      message: `ML Alert: ${block.name} has HIGH bottleneck risk`,
	      recommendations: result.recommendations
	    });
	    io.emit('newNotification', { blockId: block._id, risk: result });
	  }
	}
	```
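The scanner above relies on a `daysSinceLastTransition` helper that is not shown. As a rough illustration of what the two derived fields involve, here is a Python sketch that builds the same `/predict/bottleneck` payload from a block record. The record keys `stage_entered_at` and `due_date` are hypothetical, and timestamps are assumed to be timezone-aware.

```python
from datetime import datetime, timezone


def days_since(moment, now):
    """Fractional days elapsed between two aware datetimes (never negative)."""
    return max((now - moment).total_seconds() / 86400.0, 0.0)


def bottleneck_payload(block, now):
    """Map a block record (hypothetical keys) to the bottleneck request body."""
    due = block.get("due_date")
    return {
        "block_type": block["type"],
        "tech_node": block["tech_node"],
        "estimated_hours": block["estimated_hours"],
        "hours_logged": block["hours_logged"],
        "current_stage": block["status"],
        "days_in_current_stage": round(days_since(block["stage_entered_at"], now), 2),
        "drc_violations_total": block.get("drc_violations", 0),
        # A block without a due date is treated as not overdue.
        "is_overdue": due is not None and now > due,
    }


now = datetime(2026, 1, 10, 12, 0, tzinfo=timezone.utc)
block = {
    "type": "PLL", "tech_node": "7nm", "status": "DRC",
    "estimated_hours": 89.0, "hours_logged": 40.0, "drc_violations": 12,
    "stage_entered_at": datetime(2026, 1, 7, 0, 0, tzinfo=timezone.utc),
    "due_date": datetime(2026, 1, 9, 0, 0, tzinfo=timezone.utc),
}
payload = bottleneck_payload(block, now)
print(payload)
```

Handling a missing due date explicitly avoids silently depending on how a comparison with an undefined value behaves.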

	### Add Completion ETA to Block Detail

	```javascript
	// In GET /api/blocks/:id
	const completion = await fetch('http://localhost:7860/predict/completion', {
	  method: 'POST',
	  headers: { 'Content-Type': 'application/json' },
	  body: JSON.stringify({
	    block_type: block.type,
	    tech_node: block.techNode,
	    estimated_hours: block.estimatedHours,
	    current_stage: block.status,
	    cumulative_hours: block.hoursLogged,
	    cumulative_days: daysSinceStart(block),
	    cumulative_drc_violations: block.drcViolations
	  })
	});
	const eta = await completion.json();
	// eta.remaining_hours, eta.estimated_completion_date, eta.progress_percent
	```

	## 📊 Supported Values

	### Block Types (20)
	ADC, BGR, BandgapRef, Comparator, CurrentMirror, DAC, DiffAmp, LDO, LNA, LVDS_Driver, Mixer, OTA, Oscillator, PA, PLL, PowerDetector, SampleHold, SerDes, TIA, VCO

	### Technology Nodes (8)
	5nm, 7nm, 12nm, 14nm, 22nm, 28nm, 45nm, 65nm

	### Pipeline Stages (7)
	Not Started → In Progress → DRC → LVS → ERC → Review → Completed
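
Since the encoders were fitted on a fixed vocabulary, it can help to check a payload against the lists above before sending it. The following is a minimal client-side sketch; how the server itself reacts to unknown values is not documented here, and `GET /model/supported-values` remains the authoritative list at runtime. `validate_payload` is a hypothetical helper.

```python
BLOCK_TYPES = {
    "ADC", "BGR", "BandgapRef", "Comparator", "CurrentMirror", "DAC",
    "DiffAmp", "LDO", "LNA", "LVDS_Driver", "Mixer", "OTA", "Oscillator",
    "PA", "PLL", "PowerDetector", "SampleHold", "SerDes", "TIA", "VCO",
}
TECH_NODES = {"5nm", "7nm", "12nm", "14nm", "22nm", "28nm", "45nm", "65nm"}
STAGES = ["Not Started", "In Progress", "DRC", "LVS", "ERC", "Review", "Completed"]


def validate_payload(payload):
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = []
    if payload.get("block_type") not in BLOCK_TYPES:
        problems.append(f"unsupported block_type: {payload.get('block_type')!r}")
    if payload.get("tech_node") not in TECH_NODES:
        problems.append(f"unsupported tech_node: {payload.get('tech_node')!r}")
    # current_stage only appears in bottleneck/completion requests.
    if "current_stage" in payload and payload["current_stage"] not in STAGES:
        problems.append(f"unknown current_stage: {payload['current_stage']!r}")
    return problems


print(validate_payload({"block_type": "PLL", "tech_node": "7nm"}))
print(validate_payload({"block_type": "SRAM", "tech_node": "3nm", "current_stage": "QA"}))
```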

	## 📈 Feature Importance

	### Hours Estimation — Top Features
	1. `transistor_count_log` (31.5%) — Most predictive: larger blocks take longer
	2. `transistor_count` (28.6%) — Raw count captures non-log relationships
	3. `engineer_skill_factor` (7.7%) — Skill level matters significantly
	4. `tech_node_encoded` (6.8%) — Advanced nodes are harder
	5. `constraint_complexity` (2.7%) — Analog constraints add overhead
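
The name `transistor_count_log` suggests a log transform of `transistor_count`, which compresses the wide range of block sizes into a scale that tree splits can handle more evenly. The exact transform (base, offset) is not stated in this README; the sketch below assumes `log1p` purely for illustration.

```python
import math


def transistor_count_log(count):
    """Assumed transform: natural log of (1 + count); defined for count >= 0."""
    if count < 0:
        raise ValueError("transistor_count must be non-negative")
    return math.log1p(count)


# A 1000x spread in raw counts becomes a much narrower spread after the transform.
for n in (1_000, 80_000, 1_000_000):
    print(n, round(transistor_count_log(n), 3))
```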

	### Completion Prediction — Top Features
	1. `current_stage_idx` (44.9%) — Current stage is the strongest signal
	2. `stages_completed` (22.3%) — Progress through pipeline
	3. `avg_hours_per_stage_so_far` (21.0%) — Pace of work predicts future
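
To make these three features concrete, here is one plausible way to derive them from a block's current stage and logged hours. The training script's actual definitions are not shown in this README, so the formulas below (in particular, counting every stage before the current one as completed) are assumptions.

```python
STAGES = ["Not Started", "In Progress", "DRC", "LVS", "ERC", "Review", "Completed"]


def stage_features(current_stage, cumulative_hours):
    """Derive stage-progress features under the assumptions stated above."""
    idx = STAGES.index(current_stage)  # raises ValueError for unknown stages
    stages_completed = idx             # assumption: all earlier stages are done
    avg = cumulative_hours / stages_completed if stages_completed else 0.0
    return {
        "current_stage_idx": idx,
        "stages_completed": stages_completed,
        "avg_hours_per_stage_so_far": avg,
    }


features = stage_features("LVS", cumulative_hours=30.0)
print(features)
```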

	## 🔧 Retraining

	```bash
	# Generate new training data from ALWAS MongoDB exports
	python training/generate_dataset.py

	# Train all models
	python training/train_models.py
	python training/train_completion.py
	```

	Recommended retraining schedule: monthly, or whenever more than 100 newly completed blocks have accumulated, whichever comes first.
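
The schedule above can be expressed as a small predicate, for example to gate a scheduled retraining job. This is a sketch only: "monthly" is approximated as 30 days, and the function and parameter names are hypothetical.

```python
from datetime import date, timedelta


def should_retrain(last_trained, today, new_completed_blocks,
                   max_age_days=30, block_threshold=100):
    """True if the model is about a month old or enough new data has arrived."""
    too_old = (today - last_trained) >= timedelta(days=max_age_days)
    enough_data = new_completed_blocks > block_threshold
    return too_old or enough_data


print(should_retrain(date(2026, 1, 1), date(2026, 1, 15), new_completed_blocks=40))
print(should_retrain(date(2026, 1, 1), date(2026, 1, 15), new_completed_blocks=101))
print(should_retrain(date(2026, 1, 1), date(2026, 2, 1), new_completed_blocks=0))
```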

	## 📦 Files

	```
	models/
	  hours_estimator.joblib        # XGBoost regressor
	  complexity_xgb.joblib         # XGBoost classifier (ensemble member)
	  complexity_lgb.joblib         # LightGBM classifier (ensemble member)
	  bottleneck_predictor.joblib   # Calibrated XGBoost classifier
	  completion_predictor.joblib   # XGBoost regressor for remaining time
	  tech_node_encoder.joblib      # LabelEncoder
	  block_type_encoder.joblib     # LabelEncoder
	  priority_encoder.joblib       # OrdinalEncoder
	  complexity_encoder.joblib     # LabelEncoder
	  bottleneck_encoder.joblib     # LabelEncoder
	  feature_config.json           # Feature lists and supported values
	  metrics.json                  # Model evaluation metrics
	inference_server.py             # FastAPI inference server
	training/
	  generate_dataset.py           # Synthetic data generator
	  train_models.py               # Model training (Models 1-3)
	  train_completion.py           # Completion model training (Model 4)
	```

	## 📐 Performance vs Groq API

	| Metric | Groq llama-3.3-70b | ALWAS ML Models |
	|--------|---------------------|-----------------|
	| Latency | ~300 ms | <5 ms |
	| Cost per request | $0.002 | Free |
	| Internet required | Yes | No |
	| Structured output | Sometimes | Always (JSON guaranteed) |
	| Batch support | Limited | 200 blocks/call |
	| Bottleneck detection | No | Yes (real-time) |
	| Completion prediction | No | Yes (R²=0.945) |
	| Explainability | LLM narrative | Feature importance + reasoning |

	## License
	MIT — Built for EPIC Build-A-Thon 2026 | Epical Layouts Pvt. Ltd.