README.md · Builder-Neekhil/relationship-longevity-predictor at b7c64e8259999f25fb89979a2fed1340be2d09ce

	# 💕 Relationship Longevity Predictor

	An ensemble ML model that predicts relationship compatibility and longevity based on personal and professional profiles of two individuals.

	This is a prediction engine, not a matchmaking system: given two individuals' personal attributes, values, interests, and personality traits, it estimates how likely their relationship is to succeed.

	## Model Architecture

	A weighted ensemble of three gradient-boosted tree models trained on 113 dyadic features:

	\| Model \| Weight \| AUC-ROC \| F1 \|
	\|-------\|--------\|---------\|----\|
	\| XGBoost \| 0.40 \| 0.8852 \| 0.6013 \|
	\| LightGBM \| 0.35 \| 0.8912 \| 0.6351 \|
	\| CatBoost \| 0.25 \| 0.8661 \| 0.5974 \|
	\| Ensemble \| — \| 0.8842 \| 0.6198 \|

	Best single model: LightGBM (AUC-ROC = 0.891, F1 = 0.635). It scores slightly higher than the weighted ensemble on both metrics.

	## Performance Metrics (5-Fold Cross-Validation)

	\| Metric \| Value \|
	\|--------\|-------\|
	\| AUC-ROC \| 0.891 \|
	\| AUC-PR \| 0.699 \|
	\| Accuracy \| 85.3% \|
	\| F1 Score \| 0.635 \|
	\| Precision \| 56.8% \|
	\| Recall \| 72.0% \|
	\| Brier Score \| 0.101 \|
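
The precision, recall, and F1 values in the table are mutually consistent, since F1 is the harmonic mean of precision and recall. The short check below recomputes F1 from the table's figures and also shows how a Brier score is computed. The helper names and the toy labels and probabilities are invented purely for illustration and do not come from the dataset or the training script.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def brier_score(y_true, y_prob) -> float:
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Precision and recall as reported in the metrics table
f1 = f1_from_precision_recall(0.568, 0.720)
print(round(f1, 3))  # 0.635

# Toy example (made-up values); lower Brier scores are better
toy_labels = [0, 0, 1, 0, 1]
toy_probs = [0.1, 0.2, 0.8, 0.3, 0.6]
toy_brier = brier_score(toy_labels, toy_probs)
print(round(toy_brier, 3))  # 0.068
```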

	## What the Model Learns

	### Top Predictive Features (from SHAP analysis)

	1. Attractiveness Perception Product — Mutual physical attraction between partners
	2. Probability Partner Wants to Date — Perceived reciprocal interest
	3. Humor Perception Product — Shared sense of humor (mutual ratings)
	4. Total Self-Awareness Gap — How accurately people perceive themselves vs how partners see them
	5. Interest Correlation — Overlap in hobbies and interests
	6. Shared Interests Score — Partner's rating of shared interests
	7. Interest Diversity — Breadth of the dater's interests
	8. Confidence Calibration — How well people predict their own attractiveness to others
	9. Intelligence Value Fulfillment — Whether the partner meets intelligence expectations
	10. Expectation Meets Reality — Gap between expected and actual satisfaction
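
A ranking like the one above is commonly derived by averaging the absolute SHAP value of each feature across all samples. The sketch below assumes you already have a SHAP value matrix of shape `(n_samples, n_features)`, for example from `shap.TreeExplainer`. The function name, the feature names, and the toy matrix are illustrative and are not taken from this repository's training script.

```python
import numpy as np


def rank_features_by_shap(shap_values, feature_names, top_k=10):
    """Return the top_k (name, mean |SHAP|) pairs, most important first."""
    mean_abs = np.abs(np.asarray(shap_values, dtype=float)).mean(axis=0)
    order = np.argsort(mean_abs)[::-1][:top_k]
    return [(feature_names[i], float(mean_abs[i])) for i in order]


# Toy SHAP matrix: 3 samples x 3 hypothetical features
toy_shap = np.array([
    [0.30, -0.05, 0.10],
    [-0.40, 0.02, 0.15],
    [0.20, -0.01, -0.05],
])
toy_names = ["attractive_product", "age_gap", "funny_product"]

ranking = rank_features_by_shap(toy_shap, toy_names, top_k=2)
print([name for name, _ in ranking])  # ['attractive_product', 'funny_product']
```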

	### Key Insights

	- Mutual attraction matters most, but it is the product of both partners' ratings (each finding the other attractive) that predicts a match, not one-sided attraction
	- Humor compatibility ranks #3: pairs who both rate each other as funny are much more likely to match
	- Self-awareness is a strong predictor: people who accurately assess how others see them are more likely to match
	- Shared interests matter: the correlation between two people's interests is more predictive than any single interest
	- Value alignment (what you care about vs what your partner delivers) contributes to predicted compatibility

	## Feature Engineering

	113 features in total: 48 engineered from raw dyadic profiles across 10 categories, plus roughly 65 raw profile features:

	\| Category \| Features \| Description \|
	\|----------\|----------\|-------------\|
	\| Perception Gap \| 5 \| How you rate your partner vs how they rate you (per trait) \|
	\| Mutual Scores \| 5 \| Average of both partners' ratings (per trait) \|
	\| Perception Products \| 5 \| Multiplicative interaction of mutual ratings \|
	\| Value Fulfillment \| 5 \| Does your partner deliver what you value most? \|
	\| Self-Awareness \| 5 \| Self-perception vs partner perception gap \|
	\| Age Features \| 4 \| Gap, gap², is_older, combined age \|
	\| Interest Features \| 5 \| Diversity, intensity, range, correlation \|
	\| Importance Alignment \| 8 \| Do both people value the same traits? \|
	\| Expectation Features \| 2 \| Expectation calibration, meets-reality score \|
	\| Demographics \| 4 \| Race match, gender, same race importance \|
	\| Raw Profiles \| ~65 \| Original personality ratings, interests, preferences \|
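
To make the first three categories concrete, the sketch below derives a perception gap, a mutual score, and a perception product for one trait from a pair of ratings. The function name and the `attr` prefix are hypothetical and may not match the names stored in `feature_columns.joblib`. Using the signed (rather than absolute) gap is also an assumption.

```python
def dyadic_trait_features(rating_given: float, rating_received: float, trait: str) -> dict:
    """Combine two partners' ratings of each other on one trait into pair-level features."""
    return {
        f"{trait}_gap": rating_given - rating_received,           # perception gap
        f"{trait}_mutual": (rating_given + rating_received) / 2,  # mutual score
        f"{trait}_product": rating_given * rating_received,       # perception product
    }


# Two toy pairs: one with balanced ratings, one with one-sided attraction
balanced = dyadic_trait_features(8, 7, "attr")
one_sided = dyadic_trait_features(9, 2, "attr")

print(balanced)   # {'attr_gap': 1, 'attr_mutual': 7.5, 'attr_product': 56}
print(one_sided)  # {'attr_gap': 7, 'attr_mutual': 5.5, 'attr_product': 18}
```

In this toy example the mutual scores differ only moderately (7.5 vs 5.5), while the products differ by a factor of about three (56 vs 18), which is why a product feature penalises one-sided ratings more strongly than an average does.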

	## Training Data

	Fisman Speed Dating Experiment ([mstz/speeddating](https://hf.co/datasets/mstz/speeddating))
	- 1,048 speed-dating encounters between participants
	- Columbia Business School, 2002-2004
	- 17.7% positive match rate (class imbalance handled via scale_pos_weight + balanced weighting)
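
For reference, a common heuristic for `scale_pos_weight` in XGBoost and LightGBM is the ratio of negative to positive examples. The sketch below computes that ratio from the 17.7% match rate quoted above. Whether the training script uses exactly this value is an assumption, and the helper name is illustrative.

```python
def scale_pos_weight_from_rate(positive_rate: float) -> float:
    """Ratio of negative to positive examples for a given positive-class rate."""
    if not 0 < positive_rate < 1:
        raise ValueError("positive_rate must be strictly between 0 and 1")
    return (1 - positive_rate) / positive_rate


weight = scale_pos_weight_from_rate(0.177)
print(round(weight, 2))  # 4.65
```

The result could then be passed as the `scale_pos_weight` argument of `XGBClassifier` or `LGBMClassifier`, so that each positive example counts roughly 4.65 times as much as a negative one during training.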

	## Usage

	```python
	import json

	import joblib
	import pandas as pd
	from catboost import CatBoostClassifier

	# Load the three trained models and the ordered feature list
	xgb = joblib.load("xgboost_model.joblib")
	lgb = joblib.load("lightgbm_model.joblib")
	cat = CatBoostClassifier()
	cat.load_model("catboost_model.cbm")
	feature_cols = joblib.load("feature_columns.joblib")

	# Ensemble weights, threshold, feature list, and metrics saved at training time
	with open("ensemble_config.json") as f:
	    config = json.load(f)

	# Prepare one row of 113 features, in the order given by feature_columns.joblib.
	# `your_feature_vector` is a placeholder: supply your own engineered values.
	features = pd.DataFrame([your_feature_vector], columns=feature_cols)

	# Predicted match probability from each model
	xgb_prob = xgb.predict_proba(features)[:, 1]
	lgb_prob = lgb.predict_proba(features)[:, 1]
	cat_prob = cat.predict_proba(features)[:, 1]

	# Weighted ensemble (weights from the Model Architecture table)
	ensemble_prob = 0.40 * xgb_prob + 0.35 * lgb_prob + 0.25 * cat_prob
	score = float(ensemble_prob[0])  # one pair was scored, so take the only row

	# Interpret
	if score >= 0.7:
	    print("High Compatibility ❤️")
	elif score >= 0.4:
	    print("Moderate Compatibility 💛")
	else:
	    print("Low Compatibility 💔")
	```
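
When several pairs are scored at once, the ensemble score is an array with one entry per pair, so it cannot be tested with a single `if`. A vectorised version of the same banding, using the 0.7 and 0.4 cut-offs from the example above, might look like the sketch below. The function name and the toy scores are illustrative.

```python
import numpy as np


def label_compatibility(scores) -> np.ndarray:
    """Map ensemble scores to High / Moderate / Low bands, one label per pair."""
    scores = np.asarray(scores, dtype=float)
    # Conditions are checked in order, so 'Moderate' only applies below 0.7
    return np.select(
        [scores >= 0.7, scores >= 0.4],
        ["High", "Moderate"],
        default="Low",
    )


labels = label_compatibility([0.82, 0.55, 0.12, 0.70, 0.40])
print(labels.tolist())  # ['High', 'Moderate', 'Low', 'High', 'Moderate']
```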

	## Visualizations

	### ROC Curves
	![ROC Curves](figures/roc_curves.png)

	### Feature Importance
	![Feature Importance](figures/feature_importance.png)

	### SHAP Summary
	![SHAP Summary](figures/shap_summary.png)

	### SHAP Dependence (Top 6 Features)
	![SHAP Dependence](figures/shap_dependence.png)

	### Confusion Matrix
	![Confusion Matrix](figures/confusion_matrix.png)

	### Prediction Distribution
	![Probability Distribution](figures/probability_distribution.png)

	## Literature Basis

	\| Paper \| Contribution \|
	\|-------\|-------------\|
	\| Grinsztajn et al. (NeurIPS 2022), "Why do tree-based models still outperform deep learning on typical tabular data?" \| Found that tree-based models such as XGBoost remain state of the art on medium-sized tabular datasets \|
	\| Fisman et al. (QJE 2006), "Gender Differences in Mate Selection: Evidence from a Speed Dating Experiment" \| Original speed dating experiment; ~70% accuracy with logistic regression \|
	\| Gorishniy et al. (NeurIPS 2021), "Revisiting Deep Learning Models for Tabular Data" \| Introduced the FT-Transformer architecture for tabular data; found gradient-boosted trees remain a strong baseline, with no universally superior model \|
	\| Savcisens et al. (Nature Computational Science 2024), "Using Sequences of Life-events to Predict Human Lives" \| life2vec: longitudinal life-event prediction; architecture potentially reusable for dyadic temporal modeling \|

	## Limitations

	- Short-term proxy: The training data captures initial match decisions (4-minute speed dates), not long-term relationship outcomes. The model predicts initial compatibility, which is a proxy for, but not equivalent to, relationship longevity.
	- Sample demographics: Participants were Columbia University students (2002-2004), so results may not generalize to other demographics or cultures.
	- Static features only: No temporal or interaction data. Adding communication patterns, life events, or behavioral signals would likely improve longevity prediction (see the life2vec approach).
	- Class imbalance: With a 17.7% match rate, positive predictions are less reliable than negative ones. At the reported 56.8% precision, roughly 43% of predicted matches are false positives.

	## Files

	\| File \| Description \|
	\|------\|-------------\|
	\| `xgboost_model.joblib` \| XGBoost classifier (2000 trees) \|
	\| `lightgbm_model.joblib` \| LightGBM classifier (2000 trees) \|
	\| `catboost_model.cbm` \| CatBoost classifier (2000 iterations) \|
	\| `ensemble_config.json` \| Ensemble weights, threshold, feature list, metrics \|
	\| `feature_columns.joblib` \| Ordered list of 113 feature column names \|
	\| `race_encoder.joblib` \| LabelEncoder for race categories \|
	\| `evaluation_results.csv` \| Full evaluation metrics table \|
	\| `feature_importance.csv` \| Feature importance rankings from XGB + LGB \|
	\| `predictor.py` \| Prediction interface class \|
	\| `train_relationship_predictor.py` \| Full training script (reproducible) \|
	\| `figures/` \| All visualizations (ROC, SHAP, confusion matrix, etc.) \|

	## License

	Research use. Based on a publicly available academic dataset.

	---

	Built with XGBoost, LightGBM, CatBoost, SHAP, and scikit-learn.
