# SOLAI Enhanced Dashboard

A comprehensive multi-tab dashboard for Texas residential solar lead scoring and analytics. This enhanced version builds upon the original Gradio dashboard with advanced analytics, lead management, and market intelligence capabilities.

## 🌟 Features

### 📊 Overview Dashboard

- **Key Performance Metrics**: Total leads, qualified leads, sold leads, conversion rates
- **Geographic Distribution**: Lead distribution by TDSP (Texas utility territories)
- **Conversion Funnel**: Visual pipeline from leads to sales
- **Lead Score Distribution**: Histogram of `probability_to_buy` scores

### 👥 Lead Management

- **Advanced Search & Filtering**: Search by lead ID, TDSP, or sales status
- **Detailed Lead Profiles**: Comprehensive view of individual lead data
- **Demographics & Property Information**: Age, income, property details
- **Energy Usage & Solar Potential**: Monthly usage, bills, solar projections
- **Sales Status Tracking**: Win/loss reasons and current status

### ⚡ Utility & Market Intelligence

- **TDSP Analysis**: Lead volume and conversion rates by utility territory
- **Rate Structure Impact**: How different rate structures affect conversions
- **Solar Potential Analysis**: Distribution of solar potential for sold vs. unsold leads
- **Market Summary**: Top-performing utility territories

### 🤖 Enhanced ML Training & Scoring

- **Improved Model Metrics**: ROC AUC, PR AUC, Brier Score, Precision, Recall, F1-Score
- **Feature Importance**: Clear breakdown of numeric vs. categorical features
- **Confusion Matrix**: Detailed classification performance
- **Enhanced Output**: Better-formatted results with comprehensive statistics

### 📚 Data Dictionary

- **Interactive Field Browser**: Explore all 100+ fields in the SOLAI schema
- **Field Descriptions**: Detailed explanations from the data dictionary
- **PII Identification**: Clear marking of personally identifiable information
- **Grouped by Category**: Organized by logical field groups (demographics, property, utility, etc.)

## 🚀 Quick Start

### Prerequisites

- Python 3.9+ recommended
- Virtual environment (recommended)

### Installation

1. **Set up virtual environment:**

   ```bash
   python -m venv .venv
   source .venv/bin/activate  # On Windows: .venv\Scripts\activate
   ```

2. **Install dependencies:**

   ```bash
   pip install -r dashboard_gradio/requirements.txt
   ```

### Running the Enhanced Dashboard

```bash
python dashboard_gradio/enhanced_app.py
```

The dashboard will launch at `http://127.0.0.1:7860` with the following tabs:

- 📊 Overview Dashboard
- 👥 Lead Management
- ⚡ Utility & Market Intelligence
- 🤖 ML Training & Scoring
- 📚 Data Dictionary

## 📋 Usage Guide

### Overview Dashboard

1. **Automatic Loading**: Key metrics load automatically when you open the tab
2. **Refresh Data**: Click "🔄 Refresh Dashboard" to update all charts and metrics
3. **Interactive Charts**: Hover over charts for detailed information

### Lead Management

1. **Search Leads**:
   - Enter search terms in the search box
   - Filter by TDSP (utility territory)
   - Filter by sales status (Sold/Not Sold)
   - Click "🔍 Search Leads" to apply filters
2. **View Lead Details**:
   - Enter a specific lead ID
   - Click "📋 Get Lead Details" for a comprehensive lead profile
   - View demographics, property info, energy usage, and sales status

### Utility & Market Intelligence

1. **Market Analysis**: View lead distribution and conversion rates by utility territory
2. **Rate Structure Impact**: Understand how different rate structures affect sales
3. **Solar Potential**: Compare solar potential between sold and unsold leads

### ML Training & Scoring

1. **Choose Data Source**:
   - "Use example synthetic_v2" for demo data
   - "Upload CSVs" to use your own data files
2. **Train Model**: Click "🚀 Train + Score Model" to:
   - Train a logistic regression model
   - Generate predictions for all leads
   - Download results as CSV files
3. **Review Results**:
   - Enhanced metrics including precision, recall, F1-score
   - Feature importance breakdown
   - Preview of predictions and scored data

### Data Dictionary

- **Browse Fields**: Explore all available fields in the SOLAI schema
- **Field Details**: View descriptions, data types, and PII status
- **Organized Groups**: Fields grouped by category for easy navigation

## 🔧 Configuration

### Data Sources

- **Default Dataset**: `examples/synthetic_v2/leads_features.csv` and `examples/synthetic_v2/outcomes.csv`
- **Output Directory**: `scores/` (automatically created)
- **Data Dictionary**: `data_dictionary.yaml`

### Customization

- **Port**: Defaults to 7860; an alternative is found automatically if that port is busy
- **Theme**: Uses the Gradio Soft theme for better aesthetics
- **Charts**: Plotly-based interactive visualizations

## 📊 Data Requirements

### Features CSV

Must contain `lead_id` and a subset of these fields:

- **Demographics**: `age_bracket`, `household_income_bracket`, `credit_score_range`
- **Property**: `living_area_sqft`, `property_age_years`, `roof_material`
- **Energy**: `average_monthly_kwh`, `average_monthly_bill_usd`, `tdsp`
- **Solar**: `shading_factor`, `roof_suitability_score`, `solar_potential_kwh_year`

### Outcomes CSV

Must contain:

- `lead_id`: Unique identifier matching the features CSV
- `sold`: Boolean (0/1) indicating whether the lead converted to a sale

## 🛠️ Technical Details

### Architecture

- **Frontend**: Gradio with custom HTML/CSS styling
- **Backend**: Python with pandas, scikit-learn, plotly
- **ML Pipeline**: Logistic regression with preprocessing pipeline
- **Data Processing**: Pandas for data manipulation and analysis

### Performance

- **Dataset Size**: Optimized for datasets up to 10,000+ leads
- **Response Time**: Sub-second for most operations
- **Memory Usage**: Efficient pandas operations with data caching

### Security & Privacy

- **PII Handling**: Clear identification of personally identifiable information
- **Data Separation**: Follows the SOLAI schema for PII/features separation
- **Local Processing**: All data processing happens locally

## 🔍 Troubleshooting

### Common Issues

1. **"No data available"**:
   - Ensure the `examples/synthetic_v2/` directory exists and contains the CSV files
   - Check file permissions and paths
2. **Charts not loading**:
   - Verify the plotly installation: `pip install plotly`
   - Check the browser JavaScript console for errors
3. **Upload errors**:
   - Ensure CSV files have the required columns (`lead_id`, `sold`)
   - Check file format and encoding (UTF-8 recommended)
4. **Port conflicts**:
   - Gradio will automatically find an alternative port
   - Check the terminal output for the actual URL

### Performance Tips

- **Large Datasets**: Consider sampling for initial exploration
- **Memory Usage**: Close unused tabs to free memory
- **Browser Performance**: Use modern browsers (Chrome, Firefox, Safari)

## 🚀 Future Enhancements

### Planned Features

- **Real-time Data**: Database connectivity for live data
- **Advanced ML**: Additional algorithms and ensemble methods
- **Export Options**: PDF reports and PowerPoint exports
- **User Authentication**: Multi-user support with role-based access
- **API Integration**: REST API for external system integration

### Customization Options

- **Custom Themes**: Additional color schemes and layouts
- **Dashboard Builder**: Drag-and-drop dashboard customization
- **Alert System**: Automated alerts for key metrics
- **Scheduled Reports**: Automated report generation and distribution

## 📞 Support

For issues, questions, or feature requests:

1. Check this README for common solutions
2. Review the original `dashboard_gradio/README.md` for basic functionality
3. Examine the data dictionary (`data_dictionary.yaml`) for field definitions
4. Test with the provided synthetic data before using custom datasets

## 📄 License

This enhanced dashboard builds upon the original SOLAI project and maintains the same licensing terms.
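
## 🧪 Example: Checking Input CSVs Before Upload

The Data Requirements section asks for a features CSV with a `lead_id` column and an outcomes CSV with `lead_id` and a 0/1 `sold` column. The sketch below shows one way to check those rules before using "Upload CSVs". It is not part of the dashboard code: the function name `check_csvs` and the sample values are hypothetical, and it uses only the standard library rather than the pandas-based loading the dashboard itself performs.

```python
import csv
import io

# Required columns, taken from the Data Requirements section.
FEATURES_REQUIRED = {"lead_id"}
OUTCOMES_REQUIRED = {"lead_id", "sold"}


def check_csvs(features_file, outcomes_file):
    """Return a list of problems found; an empty list means the files look usable.

    Hypothetical helper for illustration. Both arguments are open text files
    (or any iterable of lines) containing CSV data with a header row.
    """
    problems = []
    features = csv.DictReader(features_file)
    outcomes = csv.DictReader(outcomes_file)

    missing = FEATURES_REQUIRED - set(features.fieldnames or [])
    if missing:
        problems.append(f"features CSV is missing columns: {sorted(missing)}")
    missing = OUTCOMES_REQUIRED - set(outcomes.fieldnames or [])
    if missing:
        problems.append(f"outcomes CSV is missing columns: {sorted(missing)}")
    if problems:
        return problems  # row checks need the key columns to exist

    feature_ids = {row["lead_id"] for row in features}
    outcome_ids = set()
    # Line 1 is the header; this numbering assumes one record per line.
    for line_no, row in enumerate(outcomes, start=2):
        outcome_ids.add(row["lead_id"])
        if row["sold"] not in {"0", "1"}:
            problems.append(
                f"outcomes line {line_no}: sold must be 0 or 1, got {row['sold']!r}"
            )

    unmatched = outcome_ids - feature_ids
    if unmatched:
        problems.append(
            f"outcomes reference unknown lead_id values: {sorted(unmatched)}"
        )
    return problems


# Illustrative in-memory data; real files would be opened with
# open(path, newline="", encoding="utf-8").
features = io.StringIO("lead_id,tdsp\nA1,Oncor\nA2,CenterPoint\n")
outcomes = io.StringIO("lead_id,sold\nA1,1\nA3,yes\n")
for problem in check_csvs(features, outcomes):
    print(problem)
```

Returning a list of messages instead of raising on the first failure lets every problem in a file be reported in one pass, which tends to be more convenient when preparing a custom dataset.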
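
## 📐 Example: Reproducing the Scoring Metrics by Hand

The ML Training & Scoring tab reports a confusion matrix along with precision, recall, F1-score, and Brier score. To make those numbers easier to interpret, the sketch below computes them from a list of `sold` labels and `probability_to_buy` scores. The function name `scoring_metrics` and the 0.5 decision threshold are assumptions for illustration; the README does not state which threshold the dashboard applies, and the dashboard presumably relies on scikit-learn for these calculations.

```python
def scoring_metrics(sold, probability_to_buy, threshold=0.5):
    """Confusion-matrix counts and derived metrics for binary lead outcomes.

    Hypothetical helper: `sold` holds 0/1 labels, `probability_to_buy` holds
    model scores in [0, 1], and `threshold` is an assumed cut-off.
    """
    if len(sold) != len(probability_to_buy):
        raise ValueError("sold and probability_to_buy must have the same length")
    if not sold:
        raise ValueError("at least one lead is required")

    # A lead is predicted as a sale when its score reaches the threshold.
    predicted = [1 if p >= threshold else 0 for p in probability_to_buy]
    pairs = list(zip(sold, predicted))
    tp = sum(1 for y, y_hat in pairs if y == 1 and y_hat == 1)
    fp = sum(1 for y, y_hat in pairs if y == 0 and y_hat == 1)
    fn = sum(1 for y, y_hat in pairs if y == 1 and y_hat == 0)
    tn = sum(1 for y, y_hat in pairs if y == 0 and y_hat == 0)

    # Metrics fall back to 0.0 when their denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    # Brier score: mean squared error between score and outcome (lower is better).
    brier = sum((p - y) ** 2 for y, p in zip(sold, probability_to_buy)) / len(sold)

    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "precision": precision, "recall": recall, "f1": f1, "brier": brier,
    }


# Four made-up leads: two sold, two not sold.
metrics = scoring_metrics([1, 0, 1, 0], [0.9, 0.6, 0.4, 0.1])
print(metrics)
```

Precision, recall, and F1 depend on the chosen threshold, while the Brier score is computed from the raw probabilities and does not.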