# ALWAS ML Models: Analog Layout Workflow Automation System
> **Four production-ready ML models** for the ALWAS system. They replace the Groq LLM API dependency with faster, free, local inference.
## Models
| Model | Task | Metric | Value |
|-------|------|--------|-------|
| **Hours Estimator** | Predict layout hours from block metadata | R² / MAE | 0.881 / 5.78h |
| **Complexity Classifier** | Classify Low/Medium/High complexity | Accuracy / F1 | 91.7% / 0.917 |
| **Bottleneck Predictor** | Detect blocks at risk of getting stuck | Accuracy / F1 | 99.6% / 0.996 |
| **Completion Predictor** | Predict remaining hours to completion | R² / MAE | 0.945 / 1.65h |
## Architecture
```
┌────────────────────────────────────────────────────────────────┐
│                       ALWAS ML Pipeline                        │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│  Block Created ──► Hours Estimator (XGBoost) ──► Est. Hours    │
│                └─► Complexity Classifier (XGB+LGB) ──► Class   │
│                                                                │
│  Block In-Progress ──► Bottleneck Predictor ──► Risk Alert     │
│                    └─► Completion Predictor ──► ETA            │
│                                                                │
│  Hourly Cron ──► Batch Bottleneck Scan ──► Notifications       │
│                                                                │
└────────────────────────────────────────────────────────────────┘
```
## Quick Start
### Python (Direct)
```python
import joblib
import numpy as np
# Load models
hours_model = joblib.load('models/hours_estimator.joblib')
complexity_xgb = joblib.load('models/complexity_xgb.joblib')
complexity_lgb = joblib.load('models/complexity_lgb.joblib')
bottleneck_model = joblib.load('models/bottleneck_predictor.joblib')
completion_model = joblib.load('models/completion_predictor.joblib')
# Load encoders
tech_node_encoder = joblib.load('models/tech_node_encoder.joblib')
block_type_encoder = joblib.load('models/block_type_encoder.joblib')
```
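The complexity classifier ships as two ensemble members, but how their outputs are combined is not stated here. A minimal sketch, assuming equal-weight soft voting over the per-class probabilities that each model's `predict_proba` would return, might look like this. The helper name `soft_vote` and the example probabilities are hypothetical.

```python
def soft_vote(proba_xgb, proba_lgb, class_names):
    """Average two per-class probability vectors and pick the top class.

    Assumption: equal-weight soft voting. The real ensemble may weight
    its members differently or combine them another way.
    """
    if not (len(proba_xgb) == len(proba_lgb) == len(class_names)):
        raise ValueError("probability vectors and class names must align")
    averaged = [(a + b) / 2.0 for a, b in zip(proba_xgb, proba_lgb)]
    best = max(range(len(averaged)), key=averaged.__getitem__)
    return class_names[best], dict(zip(class_names, averaged))


# Hypothetical probabilities for one block, ordered High, Low, Medium
label, probabilities = soft_vote(
    [0.7, 0.1, 0.2],
    [0.5, 0.1, 0.4],
    ["High", "Low", "Medium"],
)
print(label, probabilities)
```

With real models, the two input vectors would come from `complexity_xgb.predict_proba(X)[0]` and `complexity_lgb.predict_proba(X)[0]`, where `X` is a feature row built in the order the models were trained on.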
### REST API
```bash
# Install
pip install fastapi uvicorn joblib xgboost lightgbm scikit-learn numpy
# Run
MODEL_DIR=./models python inference_server.py
# Call
curl -X POST http://localhost:7860/predict/estimate \
  -H "Content-Type: application/json" \
  -d '{
    "block_type": "PLL",
    "tech_node": "7nm",
    "priority": "P1-Critical",
    "transistor_count": 80000,
    "has_dependencies": true,
    "num_dependencies": 3,
    "constraint_complexity": 2.5,
    "drc_iterations": 4
  }'
```
Response:
```json
{
  "complexity": "High",
  "estimated_hours": 89.0,
  "confidence": 0.996,
  "risk_level": "high",
  "reasoning": "Advanced 7nm node requires extensive DRC/LVS iterations...",
  "recommended_drc_iterations": 4,
  "suggested_engineer_skill_level": "senior",
  "complexity_probabilities": {"High": 0.996, "Low": 0.0, "Medium": 0.003},
  "estimated_days": 11.1
}
```
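For callers that prefer Python over `curl`, the same request can be sketched with the standard library alone. The endpoint and field names are taken from the example above; the helper names `build_estimate_request` and `fetch_estimate` are hypothetical, and `fetch_estimate` only works while the inference server is running.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7860"  # same host and port as the curl example


def build_estimate_request(block, base_url=BASE_URL):
    """Build (but do not send) a POST request for /predict/estimate."""
    return urllib.request.Request(
        url=f"{base_url}/predict/estimate",
        data=json.dumps(block).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_estimate(block, timeout=5.0):
    """Send the request and decode the JSON body (needs a running server)."""
    request = build_estimate_request(block)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


block = {
    "block_type": "PLL",
    "tech_node": "7nm",
    "priority": "P1-Critical",
    "transistor_count": 80000,
    "has_dependencies": True,
    "num_dependencies": 3,
    "constraint_complexity": 2.5,
    "drc_iterations": 4,
}
request = build_estimate_request(block)
print(request.get_method(), request.full_url)
```

Splitting request construction from sending keeps the payload easy to inspect or unit-test without a live server.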
## API Endpoints
| Method | Endpoint | Description |
|--------|----------|-------------|
| `POST` | `/predict/estimate` | Complexity & hours estimation (replaces Groq) |
| `POST` | `/predict/bottleneck` | Bottleneck risk prediction |
| `POST` | `/predict/completion` | Completion time prediction |
| `POST` | `/predict/bulk-estimate` | Bulk estimation (up to 200 blocks) |
| `GET` | `/model/metrics` | Model performance metrics |
| `GET` | `/model/supported-values` | Supported block types, tech nodes, etc. |
| `GET` | `/health` | Health check |
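Because `/predict/bulk-estimate` accepts up to 200 blocks per call, a larger list has to be split before sending. A minimal sketch of that batching follows; the helper name `chunk_blocks` and the `block_id` field are hypothetical, and the request body shape for the bulk endpoint is not shown because it is not documented here.

```python
MAX_BULK_BLOCKS = 200  # per-call limit listed for /predict/bulk-estimate


def chunk_blocks(blocks, size=MAX_BULK_BLOCKS):
    """Yield consecutive slices of `blocks`, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(blocks), size):
        yield blocks[start:start + size]


# 450 placeholder blocks split into batches of 200, 200 and 50
batches = list(chunk_blocks([{"block_id": i} for i in range(450)]))
print([len(batch) for batch in batches])
```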
## ALWAS Integration
### Replace Groq API in Express.js
**Before** (server/routes/blocks.js):
```javascript
// Old: Groq LLM call ($0.002/request, 300ms latency)
const response = await groq.chat.completions.create({
  model: "llama-3.3-70b-versatile",
  messages: [{ role: "user", content: prompt }]
});
```
**After** (using ALWAS ML API):
```javascript
// New: Local ML model (free, <5ms latency)
const response = await fetch('http://localhost:7860/predict/estimate', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    block_type: block.type,
    tech_node: block.techNode,
    priority: block.priority,
    transistor_count: block.transistorCount,
    has_dependencies: block.dependencies?.length > 0,
    num_dependencies: block.dependencies?.length || 0,
    constraint_complexity: block.constraintComplexity || 1.0,
    drc_iterations: block.drcIterations || 2
  })
});
// fetch() does not reject on HTTP errors, so check the status explicitly
if (!response.ok) {
  throw new Error(`ML estimate request failed: HTTP ${response.status}`);
}
const estimate = await response.json();
```
### Add Bottleneck Scanning to Cron Job
```javascript
// In server/cron/bottleneckScanner.js
const blocks = await Block.find({ status: { $ne: 'Completed' } });
for (const block of blocks) {
  const risk = await fetch('http://localhost:7860/predict/bottleneck', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      block_type: block.type,
      tech_node: block.techNode,
      estimated_hours: block.estimatedHours,
      hours_logged: block.hoursLogged,
      current_stage: block.status,
      days_in_current_stage: daysSinceLastTransition(block),
      drc_violations_total: block.drcViolations,
      is_overdue: new Date() > block.dueDate
    })
  });
  // Skip this block if the ML service returned an error; keep scanning the rest
  if (!risk.ok) continue;
  const result = await risk.json();
  if (result.should_alert) {
    // Create notification for manager
    await Notification.create({
      type: 'stuck',
      message: `ML Alert: ${block.name} has HIGH bottleneck risk`,
      recommendations: result.recommendations
    });
    io.emit('newNotification', { blockId: block._id, risk: result });
  }
}
```
### Add Completion ETA to Block Detail
```javascript
// In GET /api/blocks/:id
const completion = await fetch('http://localhost:7860/predict/completion', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    block_type: block.type,
    tech_node: block.techNode,
    estimated_hours: block.estimatedHours,
    current_stage: block.status,
    cumulative_hours: block.hoursLogged,
    cumulative_days: daysSinceStart(block),
    cumulative_drc_violations: block.drcViolations
  })
});
const eta = await completion.json();
// eta.remaining_hours, eta.estimated_completion_date, eta.progress_percent
```
## Supported Values
### Block Types (20)
ADC, BGR, BandgapRef, Comparator, CurrentMirror, DAC, DiffAmp, LDO, LNA, LVDS_Driver, Mixer, OTA, Oscillator, PA, PLL, PowerDetector, SampleHold, SerDes, TIA, VCO
### Technology Nodes (8)
5nm, 7nm, 12nm, 14nm, 22nm, 28nm, 45nm, 65nm
### Pipeline Stages (7)
Not Started → In Progress → DRC → LVS → ERC → Review → Completed
## Feature Importance
### Hours Estimation: Top Features
1. `transistor_count_log` (31.5%) - Most predictive: larger blocks take longer
2. `transistor_count` (28.6%) - Raw count captures non-log relationships
3. `engineer_skill_factor` (7.7%) - Skill level matters significantly
4. `tech_node_encoded` (6.8%) - Advanced nodes are harder
5. `constraint_complexity` (2.7%) - Analog constraints add overhead
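The two transistor features carry the same quantity on different scales. The exact transform behind `transistor_count_log` is not documented here; the sketch below assumes a natural-log `log1p`, and the helper name `transistor_features` is hypothetical.

```python
import math


def transistor_features(transistor_count):
    """Return the raw count alongside a log-scaled copy.

    Assumption: log1p (natural log of 1 + x), which is defined at zero.
    The training code may use a different base or offset.
    """
    if transistor_count < 0:
        raise ValueError("transistor_count must be non-negative")
    return {
        "transistor_count": transistor_count,
        "transistor_count_log": math.log1p(transistor_count),
    }


small = transistor_features(80_000)
large = transistor_features(8_000_000)
# A 100x gap in raw count shrinks to a difference of about ln(100) on the log scale
print(large["transistor_count_log"] - small["transistor_count_log"])
```

Compressing the range this way keeps very large blocks from dominating the split thresholds, while the raw count remains available for relationships the log hides.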
### Completion Prediction: Top Features
1. `current_stage_idx` (44.9%) - Current stage is the strongest signal
2. `stages_completed` (22.3%) - Progress through pipeline
3. `avg_hours_per_stage_so_far` (21.0%) - Pace of work predicts future
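These three features can all be derived from the current stage name and the hours logged so far. The sketch below uses the stage order listed under Pipeline Stages; the derivations themselves (zero-based index, every earlier stage counted as completed, mean hours over those stages) are assumptions, not the documented training logic, and `stage_features` is a hypothetical helper.

```python
PIPELINE_STAGES = [
    "Not Started", "In Progress", "DRC", "LVS", "ERC", "Review", "Completed",
]


def stage_features(current_stage, cumulative_hours):
    """Derive stage-based features for the completion predictor.

    Assumptions: zero-based stage index, and every stage before the
    current one counts as completed.
    """
    idx = PIPELINE_STAGES.index(current_stage)  # ValueError on unknown stage
    stages_completed = idx
    avg_hours = cumulative_hours / stages_completed if stages_completed else 0.0
    return {
        "current_stage_idx": idx,
        "stages_completed": stages_completed,
        "avg_hours_per_stage_so_far": avg_hours,
    }


features = stage_features("DRC", cumulative_hours=30.0)
print(features)
```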
## Retraining
```bash
# Generate new training data from ALWAS MongoDB exports
python training/generate_dataset.py
# Train all models
python training/train_models.py
python training/train_completion.py
```
**Recommended retraining schedule:** Monthly, or when >100 new completed blocks accumulate.
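The schedule above can be expressed as a small check that a scheduler runs before kicking off the training scripts. This is a sketch only: "monthly" is approximated as 30 days, and the function name `should_retrain` and its parameters are hypothetical.

```python
from datetime import date, timedelta

RETRAIN_INTERVAL = timedelta(days=30)  # "monthly", approximated as 30 days
NEW_BLOCK_THRESHOLD = 100              # retrain when more than this many new blocks


def should_retrain(last_trained, new_completed_blocks, today):
    """Return True if either retraining condition is met."""
    interval_elapsed = today - last_trained >= RETRAIN_INTERVAL
    enough_new_data = new_completed_blocks > NEW_BLOCK_THRESHOLD
    return interval_elapsed or enough_new_data


# Two weeks after training, with 101 newly completed blocks
print(should_retrain(date(2026, 1, 1), 101, date(2026, 1, 15)))
```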
## Files
```
models/
  hours_estimator.joblib        # XGBoost regressor
  complexity_xgb.joblib         # XGBoost classifier (ensemble member)
  complexity_lgb.joblib         # LightGBM classifier (ensemble member)
  bottleneck_predictor.joblib   # Calibrated XGBoost classifier
  completion_predictor.joblib   # XGBoost regressor for remaining time
  tech_node_encoder.joblib      # LabelEncoder
  block_type_encoder.joblib     # LabelEncoder
  priority_encoder.joblib       # OrdinalEncoder
  complexity_encoder.joblib     # LabelEncoder
  bottleneck_encoder.joblib     # LabelEncoder
  feature_config.json           # Feature lists and supported values
  metrics.json                  # Model evaluation metrics
inference_server.py             # FastAPI inference server
training/
  generate_dataset.py           # Synthetic data generator
  train_models.py               # Model training (Models 1-3)
  train_completion.py           # Completion model training (Model 4)
```
## Performance vs Groq API
| Metric | Groq llama-3.3-70b | ALWAS ML Models |
|--------|---------------------|-----------------|
| Latency | ~300ms | <5ms |
| Cost per request | $0.002 | Free |
| Internet required | Yes | No |
| Structured output | Sometimes | Always (JSON guaranteed) |
| Batch support | Limited | 200 blocks/call |
| Bottleneck detection | No | Yes (real-time) |
| Completion prediction | No | Yes (R²=0.945) |
| Explainability | LLM narrative | Feature importance + reasoning |
## License
MIT - Built for EPIC Build-A-Thon 2026 | Epical Layouts Pvt. Ltd.