Credit Scoring XGBoost
Summary
A baseline XGBoost model for binary credit risk classification trained on the Home Credit dataset. The model is intended for experimentation and educational use, not production.
Model Details
- Model type: XGBoost classifier
- Task: Binary classification (default risk)
- Input: Tabular features engineered from Home Credit raw tables
- Output: Probability of default
- Artifact: credit_scoring_xgb.pkl
Intended Use
- Demonstrate a credit scoring pipeline with MLflow tracking
- Provide a lightweight baseline for experimentation
Limitations
- Not validated for real-world credit decisions
- No fairness or regulatory compliance audit
- Performance depends on data preprocessing used in the training pipeline
How to Use
Load the artifact locally with joblib:
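import joblib
# Load the pickled XGBoost classifier from the local file.
model = joblib.load("credit_scoring_xgb.pkl")
# X must hold the engineered features produced by the training pipeline.
# Column 1 of predict_proba is the probability of the positive class (default).
preds = model.predict_proba(X)[:, 1]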
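The probabilities returned above can be turned into a binary flag by applying a cutoff. The sketch below is illustrative only: the helper name `flag_defaults`, the 0.5 threshold, and the sample probabilities are assumptions rather than part of the project, and a sensible cutoff would depend on the evaluation in the project notebooks.

```python
from typing import Iterable, List


def flag_defaults(probabilities: Iterable[float], threshold: float = 0.5) -> List[int]:
    """Return 1 where the predicted default probability meets the threshold, else 0.

    The 0.5 default is an assumption for illustration, not a tuned value.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    return [1 if p >= threshold else 0 for p in probabilities]


# Made-up probabilities standing in for model.predict_proba(X)[:, 1].
example_probs = [0.03, 0.48, 0.50, 0.91]
flags = flag_defaults(example_probs, threshold=0.5)
```

With real model output, `preds.tolist()` could be passed in place of `example_probs`.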
Input Expectations
- The model expects the same feature engineering used in the training pipeline.
- See the project notebooks for the exact preprocessing steps and column names.
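Because the model depends on the training-time feature set, it can help to verify column names before calling `predict_proba`. The following is a minimal sketch under stated assumptions: `check_feature_columns` is a hypothetical helper, and the column names are placeholders, not the project's actual features, which are defined in the notebooks.

```python
from typing import List, Sequence


def check_feature_columns(actual: Sequence[str], expected: Sequence[str]) -> List[str]:
    """Validate input column names and return them in the expected order.

    Raises ValueError if any expected column is missing or any extra column is present.
    """
    missing = [c for c in expected if c not in actual]
    unexpected = [c for c in actual if c not in expected]
    if missing or unexpected:
        raise ValueError(
            f"missing columns: {missing}; unexpected columns: {unexpected}"
        )
    return list(expected)


# Placeholder names; the real list comes from the training pipeline.
expected_columns = ["feature_a", "feature_b", "feature_c"]
input_columns = ["feature_c", "feature_a", "feature_b"]
ordered = check_feature_columns(input_columns, expected_columns)
```

With a pandas DataFrame, `list(X.columns)` could be passed as `actual`, and `X[ordered]` would then give the features in the training order.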
Reproducibility
- Training code and experiments are tracked in the main project repo:
  - src/train.py
  - notebooks/01_data_preparation.ipynb
  - notebooks/03_model_comparison.ipynb
Training Data
Home Credit (public dataset). CSVs are tracked via Git LFS in the project repo.
Metrics
See the project repo notebooks for evaluation details and MLflow logs.