title: Solar_Culient_Predictor
app_file: enhanced_app.py
sdk: gradio
sdk_version: 4.26.0

SOLAI Scoring Dashboard (Gradio)

A lightweight UI for training a baseline logistic regression model on your solar leads dataset and generating probability_to_buy predictions. It uses the same candidate features and preprocessing approach as scripts/batch_scoring.py.

- Default dataset: examples/synthetic_v2
- Outputs are always written to /Users/git/solai/scores and are also downloadable from the UI.

Features

- Choose data source:
  - Use preset example data: examples/synthetic_v2/leads_features.csv and examples/synthetic_v2/outcomes.csv
  - Upload your own CSVs (features and outcomes)
- Train + score with a single click
- Evaluation metrics (test split):
  - ROC AUC, PR AUC, Brier score (gracefully handles degenerate label cases)
- Preview:
  - predictions.csv (lead_id, probability_to_buy)
  - leads_features_scored.csv (features merged with probability_to_buy)
- Download both files from the UI in addition to saving to disk (/Users/git/solai/scores)

Requirements

- Python 3.9+ recommended
- Developed on macOS; should also work on Linux and Windows

Install dependencies (ideally in a virtual environment):

python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -r dashboard_gradio/requirements.txt

Run the App

python dashboard_gradio/app.py

Gradio will launch on a local URL (typically http://127.0.0.1:7860). Open it in your browser.

Usage

1. Start the app.
2. Select a data source:
   - Default: “Use example synthetic_v2”
   - Or switch to “Upload CSVs” and provide:
     - Features CSV (must include lead_id and a subset of feature columns listed below)
     - Outcomes CSV (must include lead_id and sold columns)
3. Click “Train + Score”.
4. Review metrics and preview tables.
5. Download the generated files or find them on disk under /Users/git/solai/scores.

Expected Columns

Features CSV must contain lead_id and at least one of these candidate features:
- living_area_sqft
- average_monthly_kwh
- average_monthly_bill_usd
- shading_factor
- roof_suitability_score
- seasonality_index
- electric_panel_amperage
- has_pool
- is_remote_worker_household
- tdsp
- rate_structure
- credit_score_range
- household_income_bracket
- preferred_financing_type
- neighborhood_type
Outcomes CSV must contain:
- lead_id
- sold (0/1)
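
The column requirements above can be checked before training. The following is a minimal sketch, not the app's actual code: the function names `select_candidate_features` and `check_outcome_columns` are hypothetical, and the error wording is only assumed to resemble the message mentioned under Notes and Troubleshooting.

```python
# Candidate feature names copied from the list above.
CANDIDATE_FEATURES = [
    "living_area_sqft",
    "average_monthly_kwh",
    "average_monthly_bill_usd",
    "shading_factor",
    "roof_suitability_score",
    "seasonality_index",
    "electric_panel_amperage",
    "has_pool",
    "is_remote_worker_household",
    "tdsp",
    "rate_structure",
    "credit_score_range",
    "household_income_bracket",
    "preferred_financing_type",
    "neighborhood_type",
]


def select_candidate_features(columns):
    """Return the candidate features present in `columns`, in list order.

    Hypothetical helper: raises ValueError if lead_id is missing or if
    none of the candidate features are present.
    """
    present = set(columns)
    if "lead_id" not in present:
        raise ValueError("Features CSV must contain a lead_id column")
    selected = [name for name in CANDIDATE_FEATURES if name in present]
    if not selected:
        raise ValueError("No candidate features found")
    return selected


def check_outcome_columns(columns):
    """Raise ValueError if the outcomes header lacks lead_id or sold."""
    present = set(columns)
    missing = [name for name in ("lead_id", "sold") if name not in present]
    if missing:
        raise ValueError("Outcomes CSV is missing columns: " + ", ".join(missing))
```

For example, `select_candidate_features(["lead_id", "tdsp", "has_pool", "zip_code"])` returns `["has_pool", "tdsp"]`: unknown columns are ignored and the result follows the order of the candidate list.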

Outputs

Saved to /Users/git/solai/scores with a timestamp suffix:

- predictions_YYYYMMDD_HHMMSS.csv
  - Columns: lead_id, probability_to_buy
- leads_features_scored_YYYYMMDD_HHMMSS.csv
  - Original features merged with probability_to_buy

Both files are also offered as downloads directly in the UI.
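
The timestamped file names described above could be built roughly as follows. This is a sketch under assumptions: the helper name `timestamped_output_paths` is hypothetical, and the app may construct its paths differently.

```python
from datetime import datetime
from pathlib import Path


def timestamped_output_paths(output_dir, now=None):
    """Return (predictions_path, scored_path) sharing one timestamp suffix.

    Hypothetical helper; `now` can be passed in to make the result
    reproducible, otherwise the current local time is used.
    """
    now = now or datetime.now()
    suffix = now.strftime("%Y%m%d_%H%M%S")  # matches YYYYMMDD_HHMMSS
    out = Path(output_dir)
    return (
        out / f"predictions_{suffix}.csv",
        out / f"leads_features_scored_{suffix}.csv",
    )
```

Using a single `now` value for both names keeps the two files from one run paired under the same suffix.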

Notes and Troubleshooting

- If the outcomes data has only a single class (all sold=0 or all sold=1), ROC AUC and PR AUC are undefined; the app shows “N/A” for those metrics but still computes Brier score and produces predictions.
- If you see “No candidate features found”, ensure your features CSV contains at least one of the listed feature names.
- If port 7860 is in use, Gradio chooses another port automatically and prints the resulting URL in the terminal.
- Training time grows with dataset size but should remain short for typical CSV sizes.
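
The single-class handling described in the first note can be sketched as below. This is an illustration rather than the app's code: the function name `evaluate` is hypothetical, average precision is assumed as the PR AUC estimate, and the Brier score is computed directly as the mean squared difference between predicted probability and the 0/1 label.

```python
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score


def evaluate(y_true, y_prob):
    """Return ROC AUC, PR AUC and Brier score, with "N/A" for undefined AUCs."""
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob, dtype=float)

    metrics = {"roc_auc": "N/A", "pr_auc": "N/A"}
    # Ranking metrics are only computed when both classes are present.
    if np.unique(y_true).size == 2:
        metrics["roc_auc"] = float(roc_auc_score(y_true, y_prob))
        # Assumption: average precision stands in for PR AUC here.
        metrics["pr_auc"] = float(average_precision_score(y_true, y_prob))

    # Brier score for 0/1 labels: mean squared error of the probabilities.
    metrics["brier"] = float(np.mean((y_prob - y_true) ** 2))
    return metrics
```

Checking the class count up front avoids depending on how a particular scikit-learn version reports a single-class input.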

Development

- Core logic is in dashboard_gradio/app.py.
- The pipeline mirrors scripts/batch_scoring.py: ColumnTransformer with passthrough numeric features and OneHotEncoder for categoricals, then LogisticRegression.
- The app can be extended with additional visualizations (e.g., calibration plots), feature importance, or a data dictionary viewer.
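
For orientation, the pipeline shape described above might look like the following minimal sketch. It is not taken from scripts/batch_scoring.py: the numeric/categorical split, the `handle_unknown="ignore"` and `max_iter=1000` settings, the helper name `build_pipeline`, and the tiny made-up dataset are all assumptions for illustration. It also fits and scores the same rows for brevity, whereas the app evaluates on a test split.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Assumed split of a few candidate features into numeric and categorical.
NUMERIC = ["average_monthly_kwh", "has_pool"]
CATEGORICAL = ["rate_structure"]


def build_pipeline(numeric_cols, categorical_cols):
    """Passthrough numerics, one-hot categoricals, then logistic regression."""
    preprocess = ColumnTransformer(
        [
            ("num", "passthrough", numeric_cols),
            ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
        ]
    )
    return Pipeline(
        [("preprocess", preprocess), ("model", LogisticRegression(max_iter=1000))]
    )


# Made-up example rows, not real data.
features = pd.DataFrame(
    {
        "lead_id": [1, 2, 3, 4, 5, 6],
        "average_monthly_kwh": [600, 1500, 700, 1800, 650, 1700],
        "has_pool": [0, 1, 0, 1, 0, 1],
        "rate_structure": ["fixed", "variable", "fixed", "variable", "fixed", "variable"],
    }
)
outcomes = pd.DataFrame({"lead_id": [1, 2, 3, 4, 5, 6], "sold": [0, 1, 0, 1, 0, 1]})

data = features.merge(outcomes, on="lead_id")
model = build_pipeline(NUMERIC, CATEGORICAL)
model.fit(data[NUMERIC + CATEGORICAL], data["sold"])

# Column 1 of predict_proba is the probability of the positive class (sold=1).
predictions = pd.DataFrame(
    {
        "lead_id": data["lead_id"],
        "probability_to_buy": model.predict_proba(data[NUMERIC + CATEGORICAL])[:, 1],
    }
)
# Equivalent of leads_features_scored.csv: features plus the probability column.
scored = features.merge(predictions, on="lead_id")
```

Passthrough numerics are left unscaled and unimputed in this sketch, so missing values in numeric columns would need handling before fitting.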