# ALWAS ML Models: Analog Layout Workflow Automation System
> **4 production-ready ML models** for the ALWAS system. Replaces the Groq LLM API dependency with faster, free, local inference.
## 🎯 Models
| Model | Task | Metric | Value |
|-------|------|--------|-------|
| **Hours Estimator** | Predict layout hours from block metadata | RΒ² / MAE | 0.881 / 5.78h |
| **Complexity Classifier** | Classify Low/Medium/High complexity | Accuracy / F1 | 91.7% / 0.917 |
| **Bottleneck Predictor** | Detect blocks at risk of getting stuck | Accuracy / F1 | 99.6% / 0.996 |
| **Completion Predictor** | Predict remaining hours to completion | RΒ² / MAE | 0.945 / 1.65h |
## πŸ—οΈ Architecture
```
┌───────────────────────────────────────────────────────────────────────────┐
│                             ALWAS ML Pipeline                             │
├───────────────────────────────────────────────────────────────────────────┤
│                                                                           │
│  Block Created     ──► Hours Estimator (XGBoost)       ──► Est. Hours     │
│                    ──► Complexity Classifier (XGB+LGB) ──► Class          │
│                                                                           │
│  Block In-Progress ──► Bottleneck Predictor            ──► Risk Alert     │
│                    ──► Completion Predictor            ──► ETA            │
│                                                                           │
│  Hourly Cron       ──► Batch Bottleneck Scan           ──► Notifications  │
│                                                                           │
└───────────────────────────────────────────────────────────────────────────┘
```
## 🚀 Quick Start
### Python (Direct)
```python
import joblib
import numpy as np
# Load models
hours_model = joblib.load('models/hours_estimator.joblib')
complexity_xgb = joblib.load('models/complexity_xgb.joblib')
complexity_lgb = joblib.load('models/complexity_lgb.joblib')
bottleneck_model = joblib.load('models/bottleneck_predictor.joblib')
completion_model = joblib.load('models/completion_predictor.joblib')
# Load encoders
tech_node_encoder = joblib.load('models/tech_node_encoder.joblib')
block_type_encoder = joblib.load('models/block_type_encoder.joblib')
```
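The architecture pairs an XGBoost and a LightGBM classifier for complexity, but this README does not say how their outputs are combined. The following is a minimal sketch, assuming simple soft voting (averaging the two class-probability vectors). The class order and the helper name `soft_vote` are hypothetical; in practice the vectors might come from each model's `predict_proba`, if the saved objects are scikit-learn-style classifiers.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical class order; the real order would come from complexity_encoder.joblib.
CLASSES: List[str] = ["High", "Low", "Medium"]


def soft_vote(
    prob_a: Sequence[float],
    prob_b: Sequence[float],
    classes: Sequence[str] = CLASSES,
) -> Tuple[str, Dict[str, float]]:
    """Average two class-probability vectors and return the winning label."""
    if not (len(prob_a) == len(prob_b) == len(classes)):
        raise ValueError("probability vectors must match the class list")
    averaged = [(a + b) / 2.0 for a, b in zip(prob_a, prob_b)]
    best = max(range(len(classes)), key=averaged.__getitem__)
    return classes[best], dict(zip(classes, averaged))


# Made-up probabilities standing in for the two ensemble members.
label, probs = soft_vote([0.90, 0.02, 0.08], [0.80, 0.05, 0.15])
print(label, probs)
```

Averaging probabilities rather than hard labels keeps a per-class score available, which matches the `complexity_probabilities` field returned by the REST API.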
### REST API
```bash
# Install
pip install fastapi uvicorn joblib xgboost lightgbm scikit-learn numpy
# Run
MODEL_DIR=./models python inference_server.py
# Call
curl -X POST http://localhost:7860/predict/estimate \
  -H "Content-Type: application/json" \
  -d '{
    "block_type": "PLL",
    "tech_node": "7nm",
    "priority": "P1-Critical",
    "transistor_count": 80000,
    "has_dependencies": true,
    "num_dependencies": 3,
    "constraint_complexity": 2.5,
    "drc_iterations": 4
  }'
```
Response:
```json
{
  "complexity": "High",
  "estimated_hours": 89.0,
  "confidence": 0.996,
  "risk_level": "high",
  "reasoning": "Advanced 7nm node requires extensive DRC/LVS iterations...",
  "recommended_drc_iterations": 4,
  "suggested_engineer_skill_level": "senior",
  "complexity_probabilities": {"High": 0.996, "Low": 0.0, "Medium": 0.003},
  "estimated_days": 11.1
}
```
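In the sample response, `estimated_days` (11.1) is consistent with dividing `estimated_hours` (89.0) by an 8-hour working day. The README does not state this rule, so the sketch below is an assumption that happens to reproduce the example; `HOURS_PER_DAY` and `hours_to_days` are hypothetical names.

```python
HOURS_PER_DAY = 8.0  # assumption: the conversion factor is not documented


def hours_to_days(hours: float, hours_per_day: float = HOURS_PER_DAY) -> float:
    """Convert estimated hours to working days, rounded to one decimal place."""
    if hours_per_day <= 0:
        raise ValueError("hours_per_day must be positive")
    return round(hours / hours_per_day, 1)


print(hours_to_days(89.0))  # 89.0 / 8 = 11.125, which rounds to 11.1
```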
## 📡 API Endpoints
| Method | Endpoint | Description |
|--------|----------|-------------|
| `POST` | `/predict/estimate` | Complexity & hours estimation (replaces Groq) |
| `POST` | `/predict/bottleneck` | Bottleneck risk prediction |
| `POST` | `/predict/completion` | Completion time prediction |
| `POST` | `/predict/bulk-estimate` | Bulk estimation (up to 200 blocks) |
| `GET` | `/model/metrics` | Model performance metrics |
| `GET` | `/model/supported-values` | Supported block types, tech nodes, etc. |
| `GET` | `/health` | Health check |
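Because `/predict/bulk-estimate` accepts at most 200 blocks per call, a client with a larger backlog has to split it into batches. The following is a minimal sketch using only the standard library. The request body shape (`{"blocks": [...]}`) is an assumption, since this README does not document the bulk schema, and `chunk_blocks` and `build_bulk_request` are hypothetical helper names.

```python
import json
import urllib.request
from typing import Dict, Iterator, List

MAX_BULK = 200  # per-call limit stated in the endpoint table


def chunk_blocks(blocks: List[Dict], size: int = MAX_BULK) -> Iterator[List[Dict]]:
    """Yield consecutive slices of at most `size` blocks."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(blocks), size):
        yield blocks[start:start + size]


def build_bulk_request(
    batch: List[Dict], base_url: str = "http://localhost:7860"
) -> urllib.request.Request:
    """Build (but do not send) a POST request for one batch."""
    body = json.dumps({"blocks": batch}).encode("utf-8")  # assumed body shape
    return urllib.request.Request(
        f"{base_url}/predict/bulk-estimate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


blocks = [{"block_type": "PLL", "tech_node": "7nm"} for _ in range(450)]
requests_to_send = [build_bulk_request(batch) for batch in chunk_blocks(blocks)]
print([len(json.loads(req.data)["blocks"]) for req in requests_to_send])
```

With 450 blocks this yields three requests of 200, 200, and 50 blocks. Sending is left out on purpose; passing each request to `urllib.request.urlopen` would perform the call against a running server.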
## 🔌 ALWAS Integration
### Replace Groq API in Express.js
**Before** (server/routes/blocks.js):
```javascript
// Old: Groq LLM call ($0.002/request, 300ms latency)
const response = await groq.chat.completions.create({
  model: "llama-3.3-70b-versatile",
  messages: [{ role: "user", content: prompt }]
});
```
**After** (using ALWAS ML API):
```javascript
// New: Local ML model (free, <5ms latency)
const response = await fetch('http://localhost:7860/predict/estimate', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    block_type: block.type,
    tech_node: block.techNode,
    priority: block.priority,
    transistor_count: block.transistorCount,
    has_dependencies: block.dependencies?.length > 0,
    num_dependencies: block.dependencies?.length || 0,
    constraint_complexity: block.constraintComplexity || 1.0,
    drc_iterations: block.drcIterations || 2
  })
});
const estimate = await response.json();
```
### Add Bottleneck Scanning to Cron Job
```javascript
// In server/cron/bottleneckScanner.js
const blocks = await Block.find({ status: { $ne: 'Completed' } });
for (const block of blocks) {
  const risk = await fetch('http://localhost:7860/predict/bottleneck', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      block_type: block.type,
      tech_node: block.techNode,
      estimated_hours: block.estimatedHours,
      hours_logged: block.hoursLogged,
      current_stage: block.status,
      days_in_current_stage: daysSinceLastTransition(block),
      drc_violations_total: block.drcViolations,
      is_overdue: new Date() > block.dueDate
    })
  });
  const result = await risk.json();
  if (result.should_alert) {
    // Create notification for manager
    await Notification.create({
      type: 'stuck',
      message: `ML Alert: ${block.name} has HIGH bottleneck risk`,
      recommendations: result.recommendations
    });
    io.emit('newNotification', { blockId: block._id, risk: result });
  }
}
```
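The scanner above calls `daysSinceLastTransition(block)`, which this README does not define. The underlying arithmetic is sketched below in Python (the language of the model code) so it can be checked in isolation. The name `days_since`, the use of fractional days, and the idea that each block stores a timezone-aware timestamp of its last stage change are all assumptions.

```python
from datetime import datetime, timezone
from typing import Optional


def days_since(last_transition: datetime, now: Optional[datetime] = None) -> float:
    """Fractional days elapsed since the block last changed stage.

    Timestamps in the future are clamped to 0.0 rather than going negative.
    """
    now = now or datetime.now(timezone.utc)
    elapsed_seconds = (now - last_transition).total_seconds()
    return max(elapsed_seconds, 0.0) / 86400.0


moved = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
checked = datetime(2026, 1, 4, 0, 0, tzinfo=timezone.utc)
print(days_since(moved, checked))  # 2.5
```

Whether the bottleneck model expects whole or fractional days is not stated here, so the value may need rounding to match the training data.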
### Add Completion ETA to Block Detail
```javascript
// In GET /api/blocks/:id
const completion = await fetch('http://localhost:7860/predict/completion', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    block_type: block.type,
    tech_node: block.techNode,
    estimated_hours: block.estimatedHours,
    current_stage: block.status,
    cumulative_hours: block.hoursLogged,
    cumulative_days: daysSinceStart(block),
    cumulative_drc_violations: block.drcViolations
  })
});
const eta = await completion.json();
// eta.remaining_hours, eta.estimated_completion_date, eta.progress_percent
```
## 📊 Supported Values
### Block Types (20)
ADC, BGR, BandgapRef, Comparator, CurrentMirror, DAC, DiffAmp, LDO, LNA, LVDS_Driver, Mixer, OTA, Oscillator, PA, PLL, PowerDetector, SampleHold, SerDes, TIA, VCO
### Technology Nodes (8)
5nm, 7nm, 12nm, 14nm, 22nm, 28nm, 45nm, 65nm
### Pipeline Stages (7)
Not Started → In Progress → DRC → LVS → ERC → Review → Completed
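
Since the models were trained on a fixed vocabulary, a caller may want to reject unsupported values before sending a request. The following is a minimal sketch that hard-codes the lists above; a live client could instead read them from `GET /model/supported-values`. `validate_block` is a hypothetical helper, not part of the server.

```python
from typing import Dict, List

BLOCK_TYPES = {
    "ADC", "BGR", "BandgapRef", "Comparator", "CurrentMirror", "DAC", "DiffAmp",
    "LDO", "LNA", "LVDS_Driver", "Mixer", "OTA", "Oscillator", "PA", "PLL",
    "PowerDetector", "SampleHold", "SerDes", "TIA", "VCO",
}
TECH_NODES = {"5nm", "7nm", "12nm", "14nm", "22nm", "28nm", "45nm", "65nm"}
STAGES = ["Not Started", "In Progress", "DRC", "LVS", "ERC", "Review", "Completed"]


def validate_block(payload: Dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    if payload.get("block_type") not in BLOCK_TYPES:
        errors.append(f"unsupported block_type: {payload.get('block_type')!r}")
    if payload.get("tech_node") not in TECH_NODES:
        errors.append(f"unsupported tech_node: {payload.get('tech_node')!r}")
    # current_stage is only sent to the bottleneck and completion endpoints.
    stage = payload.get("current_stage")
    if stage is not None and stage not in STAGES:
        errors.append(f"unsupported current_stage: {stage!r}")
    return errors


print(validate_block({"block_type": "PLL", "tech_node": "7nm"}))
print(validate_block({"block_type": "SRAM", "tech_node": "3nm"}))
```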
## 📈 Feature Importance
### Hours Estimation: Top Features
1. `transistor_count_log` (31.5%): most predictive, since larger blocks take longer
2. `transistor_count` (28.6%): raw count captures non-log relationships
3. `engineer_skill_factor` (7.7%): skill level matters significantly
4. `tech_node_encoded` (6.8%): advanced nodes are harder
5. `constraint_complexity` (2.7%): analog constraints add overhead
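
`transistor_count_log` appears alongside the raw count, which suggests a log transform is computed before inference. The exact transform is not given here; the sketch below assumes a natural-log `log1p`, and `add_log_feature` is a hypothetical helper.

```python
import math
from typing import Dict


def add_log_feature(features: Dict[str, float]) -> Dict[str, float]:
    """Return a copy of `features` with an added log-scaled transistor count."""
    count = features["transistor_count"]
    if count < 0:
        raise ValueError("transistor_count must be non-negative")
    # Assumption: log1p (natural log of 1 + x), which is defined at zero.
    return {**features, "transistor_count_log": math.log1p(count)}


row = add_log_feature({"transistor_count": 80000})
print(row)
```

If the training script used a different base or offset, inference must apply the same one, otherwise the feature will be on a different scale from what the model saw.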
### Completion Prediction: Top Features
1. `current_stage_idx` (44.9%): current stage is the strongest signal
2. `stages_completed` (22.3%): progress through the pipeline
3. `avg_hours_per_stage_so_far` (21.0%): pace of work so far predicts the remainder
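
The three completion features can all be derived from a block's current stage and logged hours. The following is a minimal sketch, assuming `current_stage_idx` is the zero-based position in the seven-stage pipeline, `stages_completed` equals that index, and the average divides cumulative hours by completed stages. None of these definitions is confirmed by this README, and `stage_features` is a hypothetical helper.

```python
from typing import Dict, Union

STAGES = ["Not Started", "In Progress", "DRC", "LVS", "ERC", "Review", "Completed"]


def stage_features(current_stage: str, cumulative_hours: float) -> Dict[str, Union[int, float]]:
    """Derive stage-progress features for the completion predictor."""
    idx = STAGES.index(current_stage)  # raises ValueError for an unknown stage
    completed = idx  # assumption: every stage before the current one is done
    avg = cumulative_hours / completed if completed else 0.0
    return {
        "current_stage_idx": idx,
        "stages_completed": completed,
        "avg_hours_per_stage_so_far": avg,
    }


print(stage_features("LVS", 36.0))
```

For a block in LVS with 36 hours logged, this gives an index of 3 and an average of 12.0 hours per completed stage.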
## 🔧 Retraining
```bash
# Generate new training data from ALWAS MongoDB exports
python training/generate_dataset.py
# Train all models
python training/train_models.py
python training/train_completion.py
```
**Recommended retraining schedule:** Monthly, or when >100 new completed blocks accumulate.
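
The schedule above can be expressed as a small check that a scheduler might run before invoking the training scripts. This is a sketch only: "monthly" is approximated as 30 days, and `should_retrain` with its parameters is a hypothetical helper rather than part of the repository.

```python
from datetime import date, timedelta

RETRAIN_INTERVAL = timedelta(days=30)  # assumption: "monthly" taken as 30 days
NEW_BLOCK_THRESHOLD = 100  # retrain once more than 100 new completed blocks exist


def should_retrain(last_trained: date, new_completed_blocks: int, today: date) -> bool:
    """True when either the time-based or the data-based trigger fires."""
    interval_elapsed = (today - last_trained) >= RETRAIN_INTERVAL
    enough_new_data = new_completed_blocks > NEW_BLOCK_THRESHOLD
    return interval_elapsed or enough_new_data


print(should_retrain(date(2026, 1, 1), 10, date(2026, 1, 15)))   # False
print(should_retrain(date(2026, 1, 1), 101, date(2026, 1, 15)))  # True
print(should_retrain(date(2026, 1, 1), 10, date(2026, 2, 1)))    # True
```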
## 📦 Files
```
models/
  hours_estimator.joblib        # XGBoost regressor
  complexity_xgb.joblib         # XGBoost classifier (ensemble member)
  complexity_lgb.joblib         # LightGBM classifier (ensemble member)
  bottleneck_predictor.joblib   # Calibrated XGBoost classifier
  completion_predictor.joblib   # XGBoost regressor for remaining time
  tech_node_encoder.joblib      # LabelEncoder
  block_type_encoder.joblib     # LabelEncoder
  priority_encoder.joblib       # OrdinalEncoder
  complexity_encoder.joblib     # LabelEncoder
  bottleneck_encoder.joblib     # LabelEncoder
  feature_config.json           # Feature lists and supported values
  metrics.json                  # Model evaluation metrics
inference_server.py             # FastAPI inference server
training/
  generate_dataset.py           # Synthetic data generator
  train_models.py               # Model training (Models 1-3)
  train_completion.py           # Completion model training (Model 4)
```
## Performance vs Groq API
| Metric | Groq llama-3.3-70b | ALWAS ML Models |
|--------|---------------------|-----------------|
| Latency | ~300ms | <5ms |
| Cost per request | $0.002 | Free |
| Internet required | Yes | No |
| Structured output | Sometimes | Always (JSON guaranteed) |
| Batch support | Limited | 200 blocks/call |
| Bottleneck detection | No | Yes (real-time) |
| Completion prediction | No | Yes (R² = 0.945) |
| Explainability | LLM narrative | Feature importance + reasoning |
## License
MIT. Built for EPIC Build-A-Thon 2026 | Epical Layouts Pvt. Ltd.