Model Card for Pothole Severity Scoring
Model Details
Model Description
This is an XGBoost Regressor model designed to predict the priority/severity score of civic infrastructure issues (specifically potholes). It evaluates multiple structural, environmental, and temporal features to output a severity score bounded between 0 and 1, assisting civic authorities in prioritizing repairs and resource allocation.
- Developed by: Civic AI System (Demo)
- Model type: XGBoost Regressor
- License: MIT
Uses
Direct Use
The model takes 10 engineered features describing a reported pothole and outputs:
- A numeric severity score ($S \in [0,1]$).
- A qualitative priority label ("Low", "Medium", "High").
This is intended for sorting and prioritizing civil work dispatch queues.
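As a minimal sketch of how a score could be turned into a label and a sorted dispatch queue: the card does not state where the "Low", "Medium", and "High" boundaries lie, so the thresholds `MEDIUM_THRESHOLD` and `HIGH_THRESHOLD` below are assumptions chosen only for illustration, as are the function names.

```python
# Hypothetical cut-offs: the model card does not specify the label boundaries.
MEDIUM_THRESHOLD = 0.33
HIGH_THRESHOLD = 0.66


def priority_label(score: float) -> str:
    """Map a severity score in [0, 1] to a qualitative priority label."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must lie in [0, 1], got {score}")
    if score >= HIGH_THRESHOLD:
        return "High"
    if score >= MEDIUM_THRESHOLD:
        return "Medium"
    return "Low"


def sort_queue(reports):
    """Order (report_id, score) pairs from most to least severe and attach labels."""
    ranked = sorted(reports, key=lambda r: r[1], reverse=True)
    return [(report_id, score, priority_label(score)) for report_id, score in ranked]
```

Calling `sort_queue([("a", 0.2), ("b", 0.8)])` would place report `"b"` first, since it has the higher score.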
Bias, Risks, and Limitations
The model weights proximity to critical infrastructure (P) and road hierarchy (R) heavily. While this effectively prioritizes areas such as highways and hospitals, it may systematically delay repairs on local streets and in historically neglected neighborhoods that lack designated "critical infrastructure". Disparate impact assessments should be run periodically to ensure equitable civic maintenance.
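One simple form such an assessment could take is comparing mean predicted scores across neighborhood groups. The sketch below assumes each prediction can be tagged with a group name; the group names, the function names, and the use of a mean-score ratio are illustrative assumptions, not a procedure described by this card.

```python
from collections import defaultdict
from statistics import mean


def mean_score_by_group(records):
    """records: iterable of (group_name, severity_score) pairs."""
    buckets = defaultdict(list)
    for group, score in records:
        buckets[group].append(score)
    return {group: mean(scores) for group, scores in buckets.items()}


def score_ratio(group_means, group, reference):
    """Ratio of mean scores; values well below 1 suggest `group` is ranked lower than `reference`."""
    return group_means[group] / group_means[reference]
```

A ratio that stays far below 1 for comparable defects would be a signal to inspect how P and R are influencing the ranking.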
Training Details
Training Data
The model was trained on a synthetically generated dataset of 10,000 samples designed to mirror realistic distributions of civic reporting. Features include:
- A: Defect area ratio
- D: Defect density
- C: Centrality (distance from center)
- Q: Initial detection confidence
- M: Multi-user confirmation score
- T: Temporal persistence (days unresolved)
- R: Traffic importance tier
- P: Proximity to critical infrastructure
- F: Recurrence frequency
- X: Resolution failure count
All features are min-max scaled to [0, 1].
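For reference, min-max scaling maps each raw value $v$ to $(v - v_{\min}) / (v_{\max} - v_{\min})$. The helper below is a minimal sketch of that formula; the function name and the choice to return zeros for a constant column are assumptions made for illustration.

```python
def min_max_scale(values):
    """Rescale a list of numbers linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    span = hi - lo
    return [(v - lo) / span for v in values]
```

For example, days-unresolved values of 0, 15, and 30 would scale to 0.0, 0.5, and 1.0. At inference time the minimum and maximum would normally be the ones observed on the training data rather than recomputed per batch.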
Training Procedure
- Algorithm: XGBoost
- Objective: reg:squarederror
- Trees: 200
- Max Depth: 5
- Learning Rate: 0.05
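The settings listed above could be expressed with the scikit-learn style `XGBRegressor` wrapper roughly as follows. This is a sketch, not the original training script: the names `XGB_PARAMS`, `FEATURES`, and `build_model` are assumptions, and every hyperparameter not listed in this card is left at the library default.

```python
# Hyperparameters stated in this card; all other settings stay at xgboost defaults.
XGB_PARAMS = {
    "objective": "reg:squarederror",
    "n_estimators": 200,   # number of trees
    "max_depth": 5,
    "learning_rate": 0.05,
}

# Feature codes in the order they are listed under Training Data.
FEATURES = ["A", "D", "C", "Q", "M", "T", "R", "P", "F", "X"]


def build_model():
    """Return an untrained regressor; xgboost is imported lazily so the config can be inspected without it."""
    from xgboost import XGBRegressor

    return XGBRegressor(**XGB_PARAMS)
```

Training would then be a call such as `build_model().fit(X_train, y_train)`, where `X_train` and `y_train` are placeholders for the scaled feature matrix and target scores.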
Performance & Interpretability
Model Metrics
The model predicts the severity score $S$, which drives civic resource allocation, with low error.
| Metric | Value | Interpretation |
|---|---|---|
| RMSE | 0.0156 | Low average error |
| MAE | 0.0112 | High predictive accuracy |
| R² Score | 0.9418 | 94% of variance explained by features |
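The three metrics in the table follow their standard definitions. The sketch below shows how they could be computed from true and predicted scores in plain Python; the function name is an assumption, and the card does not state which data split the reported values come from.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, and R^2 for two equal-length sequences of scores."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)              # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "r2": 1.0 - ss_res / ss_tot,
    }
```

Because RMSE squares each error before averaging, it is never smaller than MAE, which matches the ordering of the values reported in the table.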
Feature Importance (Gain)
The following ranking describes how much each feature contributes to the XGBoost tree construction:
- C (Centrality): 0.3585. Central potholes pose higher collision risks.
- A (Area Ratio): 0.2187. Size of the defect is a primary driver.
- R (Road Type): 0.1629. Priority given to highways over local streets.
- P (Proximity): 0.0937. Closeness to critical infrastructure.
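As a quick arithmetic check, and assuming these values are normalized gain fractions that sum to 1 across all ten features (an assumption; the card does not say how they were normalized), the four listed features account for roughly 83% of total gain, leaving about 17% for the other six.

```python
# Gain values copied from the list above.
listed_gain = {"C": 0.3585, "A": 0.2187, "R": 0.1629, "P": 0.0937}

# Share of total gain covered by the four listed features.
listed_total = round(sum(listed_gain.values()), 4)

# Share left for D, Q, M, T, F, and X, under the sum-to-one assumption.
remaining = round(1.0 - listed_total, 4)

print(f"listed: {listed_total:.4f}, remaining: {remaining:.4f}")
```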
SHAP Visualizations
We use SHAP (SHapley Additive exPlanations) to explain individual predictions and global feature influence.
Global Feature Impact
The bar chart shows the mean absolute SHAP value of each feature, identifying which features consistently shift the severity score.
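The quantity behind such a bar chart can be sketched in plain Python: for each feature, average the absolute SHAP values over all samples, then rank the features. The function name and the row-per-sample input layout are assumptions for illustration; in practice the SHAP matrix would come from an explainer run on the trained model.

```python
def mean_abs_shap(shap_matrix, feature_names):
    """Rank features by mean |SHAP|; rows are samples, columns follow feature_names."""
    n_samples = len(shap_matrix)
    totals = [0.0] * len(feature_names)
    for row in shap_matrix:
        for j, value in enumerate(row):
            totals[j] += abs(value)
    ranking = [(name, total / n_samples) for name, total in zip(feature_names, totals)]
    # Largest average magnitude first, matching the order of bars in the chart.
    return sorted(ranking, key=lambda item: item[1], reverse=True)
```

Taking absolute values before averaging matters: a feature that pushes some scores up and others down by similar amounts would otherwise average out to nearly zero despite having a large influence.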
Detailed Impact (Beeswarm)
The summary plot shows how high vs. low values of a feature affect the outcome. For example, high values of C (Centrality) push the score significantly higher.

