Credit Scoring XGBoost

Summary

A baseline XGBoost model for binary credit-risk classification, trained on the Home Credit dataset. It is intended for experimentation and educational use, not for production.

Model Details

Model type: XGBoost classifier
Task: Binary classification (default risk)
Input: Tabular features engineered from Home Credit raw tables
Output: Probability of default
Artifact: credit_scoring_xgb.pkl

Intended Use

Demonstrate a credit scoring pipeline with MLflow tracking
Provide a lightweight baseline for experimentation

Limitations

Not validated for real-world credit decisions
No fairness or regulatory compliance audit
Performance depends on inputs being preprocessed the same way as in the training pipeline

How to Use

Load the artifact locally with joblib:

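import joblib

# Unpickling the artifact requires xgboost to be installed in the environment.
model = joblib.load("credit_scoring_xgb.pkl")
# X: feature matrix prepared with the same preprocessing as the training pipeline.
preds = model.predict_proba(X)[:, 1]  # probability of class 1 (default)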

Input Expectations

The model expects the same feature engineering used in the training pipeline.
See the project notebooks for the exact preprocessing steps and column names.
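
Before scoring, it can help to confirm that the columns about to be passed in match the columns used in training. The following is a minimal sketch of such a check, not part of the project code. The helper name compare_columns and the column names in the example are hypothetical; the actual column list has to come from the training pipeline.

```python
from typing import Iterable, List, Tuple


def compare_columns(
    expected: Iterable[str], actual: Iterable[str]
) -> Tuple[List[str], List[str], bool]:
    """Compare the columns a model was trained on with the columns to be scored.

    Returns (missing, unexpected, same_order):
      missing     - expected columns absent from `actual`
      unexpected  - columns in `actual` that the model was not trained on
      same_order  - True only if both sequences are identical, order included
    """
    expected = list(expected)
    actual = list(actual)
    expected_set = set(expected)
    actual_set = set(actual)
    missing = [c for c in expected if c not in actual_set]
    unexpected = [c for c in actual if c not in expected_set]
    return missing, unexpected, expected == actual


# Hypothetical column names, for illustration only.
trained_on = ["AMT_CREDIT", "AMT_INCOME_TOTAL", "DAYS_BIRTH"]
to_score = ["DAYS_BIRTH", "AMT_CREDIT", "CODE_GENDER"]

missing, unexpected, same_order = compare_columns(trained_on, to_score)
print(missing, unexpected, same_order)
```

If X is a pandas DataFrame, list(X.columns) would serve as the `actual` argument. Where the trained column list is stored is not documented here, so obtaining it (for example from the data preparation notebook) is left to the reader.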

Reproducibility

Training code and experiments are tracked in the main project repo:
- src/train.py
- notebooks/01_data_preparation.ipynb
- notebooks/03_model_comparison.ipynb

Training Data

Home Credit (public dataset). CSVs are tracked via Git LFS in the project repo.

Metrics

See the project repo notebooks for evaluation details and MLflow logs.
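
This card does not state which metric the notebooks report. As an assumption, ROC AUC is a common choice for default-risk models, and the sketch below shows how it could be computed from predicted probabilities. The function name roc_auc and the toy labels and scores are made up for illustration and are not results from this model.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """ROC AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up labels and scores, for illustration only (not results from this model).
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(y_true, y_score)
print(auc)
```

The pairwise loop is quadratic in the number of rows, so it is only suitable for small samples. On real data, sklearn.metrics.roc_auc_score computes the same quantity efficiently.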
