| # ALWAS ML Models: Analog Layout Workflow Automation System |
|
|
| > **4 production-ready ML models** for the ALWAS system. Replaces the Groq LLM API dependency with faster, free, local inference. |
|
|
| ## Models |
|
|
| | Model | Task | Metric | Value | |
| |-------|------|--------|-------| |
| | **Hours Estimator** | Predict layout hours from block metadata | R² / MAE | 0.881 / 5.78h | |
| | **Complexity Classifier** | Classify Low/Medium/High complexity | Accuracy / F1 | 91.7% / 0.917 | |
| | **Bottleneck Predictor** | Detect blocks at risk of getting stuck | Accuracy / F1 | 99.6% / 0.996 | |
| | **Completion Predictor** | Predict remaining hours to completion | R² / MAE | 0.945 / 1.65h | |
|
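For readers less familiar with the regression metrics in the table, the sketch below shows how R² and MAE are conventionally computed. It is a minimal pure-Python illustration on made-up numbers, not the project's evaluation code; the function names and sample values are assumptions.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between actual and predicted hours."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical actual vs. predicted layout hours for four blocks.
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]

mae = mean_absolute_error(actual, predicted)  # typical error, in hours
r2 = r2_score(actual, predicted)              # share of variance explained
```

Read this way, an MAE of 5.78h means the Hours Estimator is off by a little under six hours per block on average.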
|
| ## Architecture |
|
|
| ``` |
| ┌──────────────────────────────────────────────────────────────────────────┐ |
| │                            ALWAS ML Pipeline                             │ |
| ├──────────────────────────────────────────────────────────────────────────┤ |
| │                                                                          │ |
| │  Block Created     ──► Hours Estimator (XGBoost)       ──► Est. Hours    │ |
| │                    ──► Complexity Classifier (XGB+LGB) ──► Class         │ |
| │                                                                          │ |
| │  Block In-Progress ──► Bottleneck Predictor            ──► Risk Alert    │ |
| │                    ──► Completion Predictor            ──► ETA           │ |
| │                                                                          │ |
| │  Hourly Cron       ──► Batch Bottleneck Scan           ──► Notifications │ |
| │                                                                          │ |
| └──────────────────────────────────────────────────────────────────────────┘ |
| ``` |
|
|
| ## Quick Start |
|
|
| ### Python (Direct) |
|
|
| ```python |
| import joblib |
| import numpy as np |
| |
| # Load models |
| hours_model = joblib.load('models/hours_estimator.joblib') |
| complexity_xgb = joblib.load('models/complexity_xgb.joblib') |
| complexity_lgb = joblib.load('models/complexity_lgb.joblib') |
| bottleneck_model = joblib.load('models/bottleneck_predictor.joblib') |
| completion_model = joblib.load('models/completion_predictor.joblib') |
| |
| # Load encoders |
| tech_node_encoder = joblib.load('models/tech_node_encoder.joblib') |
| block_type_encoder = joblib.load('models/block_type_encoder.joblib') |
| ``` |
|
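The README does not say how the two complexity classifiers are combined. One common approach is soft voting, that is, averaging the class probabilities from each model and taking the most likely class. The sketch below illustrates that idea with invented probability vectors standing in for `complexity_xgb.predict_proba(X)[0]` and `complexity_lgb.predict_proba(X)[0]`. The equal weighting, the helper names, and the class order are assumptions, not the project's confirmed method.

```python
# Class order as a LabelEncoder would sort it (alphabetical); assumed here.
CLASSES = ["High", "Low", "Medium"]


def soft_vote(proba_a, proba_b, weight_a=0.5):
    """Blend two class-probability vectors with a weighted average."""
    weight_b = 1.0 - weight_a
    return [weight_a * a + weight_b * b for a, b in zip(proba_a, proba_b)]


def predict_label(proba):
    """Return the class name with the highest blended probability."""
    best_index = max(range(len(proba)), key=lambda i: proba[i])
    return CLASSES[best_index]


# Invented stand-ins for the per-model predict_proba outputs of one block.
xgb_proba = [0.70, 0.10, 0.20]
lgb_proba = [0.50, 0.10, 0.40]

blended = soft_vote(xgb_proba, lgb_proba)
label = predict_label(blended)
```

Averaging probabilities rather than hard labels keeps a usable confidence value, which matches the `complexity_probabilities` field the API returns.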
|
| ### REST API |
|
|
| ```bash |
| # Install |
| pip install fastapi uvicorn joblib xgboost lightgbm scikit-learn numpy |
| |
| # Run |
| MODEL_DIR=./models python inference_server.py |
| |
| # Call |
| curl -X POST http://localhost:7860/predict/estimate \ |
| -H "Content-Type: application/json" \ |
| -d '{ |
| "block_type": "PLL", |
| "tech_node": "7nm", |
| "priority": "P1-Critical", |
| "transistor_count": 80000, |
| "has_dependencies": true, |
| "num_dependencies": 3, |
| "constraint_complexity": 2.5, |
| "drc_iterations": 4 |
| }' |
| ``` |
|
|
| Response: |
| ```json |
| { |
| "complexity": "High", |
| "estimated_hours": 89.0, |
| "confidence": 0.996, |
| "risk_level": "high", |
| "reasoning": "Advanced 7nm node requires extensive DRC/LVS iterations...", |
| "recommended_drc_iterations": 4, |
| "suggested_engineer_skill_level": "senior", |
| "complexity_probabilities": {"High": 0.996, "Low": 0.0, "Medium": 0.003}, |
| "estimated_days": 11.1 |
| } |
| ``` |
|
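The same call can be made from Python with only the standard library. The sketch below mirrors the `curl` example; the helper names are hypothetical, and `fetch_estimate` needs the inference server running on port 7860. The `hours_to_days` helper reflects an inference from the sample response, where 89.0 hours maps to 11.1 days, which is consistent with an 8-hour working day. The README does not state that divisor, so treat it as an assumption.

```python
import json
import urllib.request

API_URL = "http://localhost:7860/predict/estimate"


def build_estimate_request(block, url=API_URL):
    """Build (but do not send) the POST request for one block."""
    body = json.dumps(block).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_estimate(block, timeout=5.0):
    """Send the request and decode the JSON reply (needs a running server)."""
    request = build_estimate_request(block)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def hours_to_days(hours, hours_per_day=8.0):
    """Convert hours to working days; the 8-hour day is an assumption."""
    return round(hours / hours_per_day, 1)


# Same payload as the curl example above.
pll_block = {
    "block_type": "PLL",
    "tech_node": "7nm",
    "priority": "P1-Critical",
    "transistor_count": 80000,
    "has_dependencies": True,
    "num_dependencies": 3,
    "constraint_complexity": 2.5,
    "drc_iterations": 4,
}
prepared = build_estimate_request(pll_block)
```

Calling `fetch_estimate(pll_block)` against a live server should return a dictionary shaped like the JSON response shown above.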
|
| ## API Endpoints |
|
|
| | Method | Endpoint | Description | |
| |--------|----------|-------------| |
| | `POST` | `/predict/estimate` | Complexity & hours estimation (replaces Groq) | |
| | `POST` | `/predict/bottleneck` | Bottleneck risk prediction | |
| | `POST` | `/predict/completion` | Completion time prediction | |
| | `POST` | `/predict/bulk-estimate` | Bulk estimation (up to 200 blocks) | |
| | `GET` | `/model/metrics` | Model performance metrics | |
| | `GET` | `/model/supported-values` | Supported block types, tech nodes, etc. | |
| | `GET` | `/health` | Health check | |
|
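Because `/predict/bulk-estimate` accepts at most 200 blocks per call, a client with a larger backlog has to split its work into batches. The sketch below shows only that batching step. The helper name and the placeholder blocks are hypothetical, and the request body shape for the bulk endpoint is not documented here, so it is left out.

```python
MAX_BLOCKS_PER_CALL = 200  # limit stated in the endpoint table


def chunk_blocks(blocks, size=MAX_BLOCKS_PER_CALL):
    """Yield consecutive slices of at most `size` blocks."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(blocks), size):
        yield blocks[start:start + size]


# 450 placeholder blocks split into batches of 200, 200, and 50.
all_blocks = [{"block_type": "PLL", "tech_node": "7nm"} for _ in range(450)]
batch_sizes = [len(batch) for batch in chunk_blocks(all_blocks)]
```

Each batch can then be posted to the bulk endpoint in turn.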
|
| ## ALWAS Integration |
|
|
| ### Replace Groq API in Express.js |
|
|
| **Before** (server/routes/blocks.js): |
| ```javascript |
| // Old: Groq LLM call ($0.002/request, 300ms latency) |
| const response = await groq.chat.completions.create({ |
| model: "llama-3.3-70b-versatile", |
| messages: [{ role: "user", content: prompt }] |
| }); |
| ``` |
|
|
| **After** (using ALWAS ML API): |
| ```javascript |
| // New: Local ML model (free, <5ms latency) |
| const response = await fetch('http://localhost:7860/predict/estimate', { |
| method: 'POST', |
| headers: { 'Content-Type': 'application/json' }, |
| body: JSON.stringify({ |
| block_type: block.type, |
| tech_node: block.techNode, |
| priority: block.priority, |
| transistor_count: block.transistorCount, |
| has_dependencies: block.dependencies?.length > 0, |
| num_dependencies: block.dependencies?.length || 0, |
| constraint_complexity: block.constraintComplexity || 1.0, |
| drc_iterations: block.drcIterations || 2 |
| }) |
| }); |
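| // fetch() does not reject on HTTP errors, so check the status before parsing |
| if (!response.ok) { |
|   throw new Error(`ML estimate request failed: ${response.status}`); |
| } |
| const estimate = await response.json(); |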
| ``` |
|
|
| ### Add Bottleneck Scanning to Cron Job |
|
|
| ```javascript |
| // In server/cron/bottleneckScanner.js |
| const blocks = await Block.find({ status: { $ne: 'Completed' } }); |
| |
| for (const block of blocks) { |
| const risk = await fetch('http://localhost:7860/predict/bottleneck', { |
| method: 'POST', |
| headers: { 'Content-Type': 'application/json' }, |
| body: JSON.stringify({ |
| block_type: block.type, |
| tech_node: block.techNode, |
| estimated_hours: block.estimatedHours, |
| hours_logged: block.hoursLogged, |
| current_stage: block.status, |
| days_in_current_stage: daysSinceLastTransition(block), |
| drc_violations_total: block.drcViolations, |
| is_overdue: new Date() > block.dueDate |
| }) |
| }); |
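| // Skip this block if the ML service returned an error status |
| if (!risk.ok) { |
|   console.error(`Bottleneck check failed for ${block.name}: ${risk.status}`); |
|   continue; |
| } |
| const result = await risk.json(); |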
| |
| if (result.should_alert) { |
| // Create notification for manager |
| await Notification.create({ |
| type: 'stuck', |
| message: `ML Alert: ${block.name} has HIGH bottleneck risk`, |
| recommendations: result.recommendations |
| }); |
| io.emit('newNotification', { blockId: block._id, risk: result }); |
| } |
| } |
| ``` |
|
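The cron snippet relies on `daysSinceLastTransition(block)`, which is not defined in this README, and on a due-date comparison. The sketch below shows one way those two derived fields could be computed, written in Python to match the model code. The argument names are hypothetical, and the handling of a missing due date is an assumption.

```python
from datetime import datetime, timezone


def days_in_current_stage(stage_entered_at, now):
    """Whole days elapsed since the block entered its current stage."""
    return max((now - stage_entered_at).days, 0)


def is_overdue(due_date, now):
    """True only when a due date exists and has already passed."""
    return due_date is not None and now > due_date


# Illustrative timestamps, all in UTC to avoid naive/aware mix-ups.
now = datetime(2026, 3, 15, 12, 0, tzinfo=timezone.utc)
entered = datetime(2026, 3, 10, 9, 0, tzinfo=timezone.utc)
due = datetime(2026, 3, 14, 0, 0, tzinfo=timezone.utc)

stage_days = days_in_current_stage(entered, now)
overdue = is_overdue(due, now)
```

Treating a missing due date as "not overdue" avoids raising alerts for blocks that were never scheduled.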
|
| ### Add Completion ETA to Block Detail |
|
|
| ```javascript |
| // In GET /api/blocks/:id |
| const completion = await fetch('http://localhost:7860/predict/completion', { |
| method: 'POST', |
| headers: { 'Content-Type': 'application/json' }, |
| body: JSON.stringify({ |
| block_type: block.type, |
| tech_node: block.techNode, |
| estimated_hours: block.estimatedHours, |
| current_stage: block.status, |
| cumulative_hours: block.hoursLogged, |
| cumulative_days: daysSinceStart(block), |
| cumulative_drc_violations: block.drcViolations |
| }) |
| }); |
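| if (!completion.ok) { |
|   throw new Error(`ML completion request failed: ${completion.status}`); |
| } |
| const eta = await completion.json(); |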
| // eta.remaining_hours, eta.estimated_completion_date, eta.progress_percent |
| ``` |
|
|
| ## Supported Values |
|
|
| ### Block Types (20) |
| ADC, BGR, BandgapRef, Comparator, CurrentMirror, DAC, DiffAmp, LDO, LNA, LVDS_Driver, Mixer, OTA, Oscillator, PA, PLL, PowerDetector, SampleHold, SerDes, TIA, VCO |
| |
| ### Technology Nodes (8) |
| 5nm, 7nm, 12nm, 14nm, 22nm, 28nm, 45nm, 65nm |
| |
| ### Pipeline Stages (7) |
| Not Started → In Progress → DRC → LVS → ERC → Review → Completed |
| |
| ## Feature Importance |
| |
| ### Hours Estimation: Top Features |
| 1. `transistor_count_log` (31.5%): Most predictive, since larger blocks take longer |
| 2. `transistor_count` (28.6%): Raw count captures non-log relationships |
| 3. `engineer_skill_factor` (7.7%): Skill level matters significantly |
| 4. `tech_node_encoded` (6.8%): Advanced nodes are harder |
| 5. `constraint_complexity` (2.7%): Analog constraints add overhead |
|
|
| ### Completion Prediction: Top Features |
| 1. `current_stage_idx` (44.9%): Current stage is the strongest signal |
| 2. `stages_completed` (22.3%): Progress through the pipeline |
| 3. `avg_hours_per_stage_so_far` (21.0%): Pace of work so far predicts the remaining time |
|
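The three completion features can plausibly be derived from a block's current stage and its logged hours. The sketch below uses the seven pipeline stages listed under Supported Values. The exact feature definitions used in training are not given in this README, so the formulas here, in particular treating every earlier stage as completed, are assumptions.

```python
STAGES = ["Not Started", "In Progress", "DRC", "LVS", "ERC", "Review", "Completed"]


def completion_features(current_stage, cumulative_hours):
    """Derive stage-based features; the exact definitions are assumptions."""
    if current_stage not in STAGES:
        raise ValueError(f"unknown stage: {current_stage!r}")
    current_stage_idx = STAGES.index(current_stage)
    # Assumes every stage before the current one counts as completed.
    stages_completed = current_stage_idx
    # Guard against division by zero before any stage has finished.
    avg_hours = cumulative_hours / stages_completed if stages_completed else 0.0
    return {
        "current_stage_idx": current_stage_idx,
        "stages_completed": stages_completed,
        "avg_hours_per_stage_so_far": avg_hours,
    }


# A block that has reached LVS after 36 logged hours.
features = completion_features("LVS", 36.0)
```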
|
| ## Retraining |
|
|
| ```bash |
| # Generate new training data from ALWAS MongoDB exports |
| python training/generate_dataset.py |
| |
| # Train all models |
| python training/train_models.py |
| python training/train_completion.py |
| ``` |
|
|
| **Recommended retraining schedule:** Monthly, or when >100 new completed blocks accumulate. |
|
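The retraining rule can be expressed as a small check that a scheduler runs before launching the training scripts. This is a sketch under stated assumptions: "monthly" is approximated as 30 days, and the function and constant names are hypothetical.

```python
from datetime import datetime, timedelta

RETRAIN_INTERVAL = timedelta(days=30)  # "monthly", approximated as 30 days
NEW_BLOCK_THRESHOLD = 100              # retrain once more than 100 accumulate


def should_retrain(last_trained, new_completed_blocks, now):
    """Return True if either retraining trigger has fired."""
    too_old = now - last_trained >= RETRAIN_INTERVAL
    enough_data = new_completed_blocks > NEW_BLOCK_THRESHOLD
    return too_old or enough_data


last_run = datetime(2026, 1, 1)
early_check = should_retrain(last_run, 50, datetime(2026, 1, 10))
data_check = should_retrain(last_run, 101, datetime(2026, 1, 10))
age_check = should_retrain(last_run, 0, datetime(2026, 2, 5))
```

Here `early_check` is false because neither trigger has fired, while `data_check` and `age_check` are each true on a single trigger.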
|
| ## Files |
|
|
| ``` |
| models/ |
| hours_estimator.joblib # XGBoost regressor |
| complexity_xgb.joblib # XGBoost classifier (ensemble member) |
| complexity_lgb.joblib # LightGBM classifier (ensemble member) |
| bottleneck_predictor.joblib # Calibrated XGBoost classifier |
| completion_predictor.joblib # XGBoost regressor for remaining time |
| tech_node_encoder.joblib # LabelEncoder |
| block_type_encoder.joblib # LabelEncoder |
| priority_encoder.joblib # OrdinalEncoder |
| complexity_encoder.joblib # LabelEncoder |
| bottleneck_encoder.joblib # LabelEncoder |
| feature_config.json # Feature lists and supported values |
| metrics.json # Model evaluation metrics |
| inference_server.py # FastAPI inference server |
| training/ |
| generate_dataset.py # Synthetic data generator |
| train_models.py # Model training (Models 1-3) |
| train_completion.py # Completion model training (Model 4) |
| ``` |
|
|
| ## Performance vs Groq API |
|
|
| | Metric | Groq llama-3.3-70b | ALWAS ML Models | |
| |--------|---------------------|-----------------| |
| | Latency | ~300ms | <5ms | |
| | Cost per request | $0.002 | Free | |
| | Internet required | Yes | No | |
| | Structured output | Sometimes | Always (JSON guaranteed) | |
| | Batch support | Limited | 200 blocks/call | |
| | Bottleneck detection | No | Yes (real-time) | |
| | Completion prediction | No | Yes (R² = 0.945) | |
| | Explainability | LLM narrative | Feature importance + reasoning | |
|
|
| ## License |
| MIT. Built for EPIC Build-A-Thon 2026 | Epical Layouts Pvt. Ltd. |
|
|