# 💕 Relationship Longevity Predictor

**An ensemble ML model that predicts relationship compatibility and longevity based on personal and professional profiles of two individuals.**

This is a **predictability engine**, not a matchmaking system. Given two individuals' personal attributes, values, interests, and personality traits, it predicts how likely their relationship is to succeed.

## Model Architecture

**Ensemble of 3 gradient-boosted tree models** trained on 113 dyadic features (engineered features plus raw profile fields):

| Model | Weight | AUC-ROC | F1 |
|-------|--------|---------|----|
| XGBoost | 0.40 | 0.8852 | 0.6013 |
| **LightGBM** | **0.35** | **0.8912** | **0.6351** |
| CatBoost | 0.25 | 0.8661 | 0.5974 |
| **Ensemble** | n/a | **0.8842** | **0.6198** |

**Best single model: LightGBM (AUC-ROC = 0.891, F1 = 0.635)**

The weighted ensemble scores slightly below LightGBM alone on both metrics.

## Performance Metrics (5-Fold Cross-Validation)

The AUC-ROC and F1 values in this table match the LightGBM row above.

| Metric | Value |
|--------|-------|
| **AUC-ROC** | 0.891 |
| **AUC-PR** | 0.699 |
| **Accuracy** | 85.3% |
| **F1 Score** | 0.635 |
| **Precision** | 56.8% |
| **Recall** | 72.0% |
| **Brier Score** | 0.101 |

## What the Model Learns

### Top Predictive Features (from SHAP analysis)

1. **Attractiveness Perception Product**: Mutual physical attraction between partners
2. **Probability Partner Wants to Date**: Perceived reciprocal interest
3. **Humor Perception Product**: Shared sense of humor (mutual ratings)
4. **Total Self-Awareness Gap**: How accurately people perceive themselves vs how partners see them
5. **Interest Correlation**: Overlap in hobbies and interests
6. **Shared Interests Score**: Partner's rating of shared interests
7. **Interest Diversity**: Breadth of the dater's interests
8. **Confidence Calibration**: How well people predict their own attractiveness to others
9. **Intelligence Value Fulfillment**: Whether the partner meets intelligence expectations
10. **Expectation Meets Reality**: Gap between expected and actual satisfaction

### Key Insights

- **Mutual attraction matters most**, but it's the *product* (both people finding each other attractive) that predicts success, not one-sided attraction
- **Humor compatibility** ranks #3: couples who both rate each other as funny are much more likely to match
- **Self-awareness** is a strong predictor: people who accurately assess how others see them are more likely to match
- **Shared interests** matter: the correlation between two people's interests is more predictive than any single interest
- **Value alignment** (what you care about vs what your partner delivers) is a further driver of predicted compatibility

## Feature Engineering

113 features in total: 48 engineered features in 10 categories, plus roughly 65 raw profile fields. All are derived from raw dyadic profiles:

| Category | Features | Description |
|----------|----------|-------------|
| **Perception Gap** | 5 | How you rate your partner vs how they rate you (per trait) |
| **Mutual Scores** | 5 | Average of both partners' ratings (per trait) |
| **Perception Products** | 5 | Multiplicative interaction of mutual ratings |
| **Value Fulfillment** | 5 | Does your partner deliver what you value most? |
| **Self-Awareness** | 5 | Self-perception vs partner perception gap |
| **Age Features** | 4 | Gap, gap², is_older, combined age |
| **Interest Features** | 5 | Diversity, intensity, range, correlation |
| **Importance Alignment** | 8 | Do both people value the same traits? |
| **Expectation Features** | 2 | Expectation calibration, meets-reality score |
| **Demographics** | 4 | Race match, gender, same race importance |
| **Raw Profiles** | ~65 | Original personality ratings, interests, preferences |

## Training Data

**Fisman Speed Dating Experiment** ([mstz/speeddating](https://hf.co/datasets/mstz/speeddating))

- 1,048 speed-dating encounters between participants
- Columbia Business School, 2002-2004
- 17.7% positive match rate (class imbalance handled via `scale_pos_weight` and balanced class weighting)

## Usage

```python
import json

import joblib
import numpy as np
import pandas as pd
from catboost import CatBoostClassifier

# Load the three trained models
xgb = joblib.load("xgboost_model.joblib")
lgb = joblib.load("lightgbm_model.joblib")
cat = CatBoostClassifier()
cat.load_model("catboost_model.cbm")

# Ordered list of the 113 feature column names
feature_cols = joblib.load("feature_columns.joblib")

# Ensemble weights, threshold, feature list, and metrics are stored here
with open("ensemble_config.json") as f:
    config = json.load(f)

# Prepare one row of 113 features, in the order given by feature_cols.
# Placeholder values only: replace with your own engineered feature vector.
your_feature_vector = np.zeros(len(feature_cols))
features = pd.DataFrame([your_feature_vector], columns=feature_cols)

# Predict the positive-class (match) probability with each model
xgb_prob = xgb.predict_proba(features)[:, 1]
lgb_prob = lgb.predict_proba(features)[:, 1]
cat_prob = cat.predict_proba(features)[:, 1]

# Weighted ensemble (weights from the Model Architecture table);
# take element 0 to get a scalar for the single input row
score = float((0.40 * xgb_prob + 0.35 * lgb_prob + 0.25 * cat_prob)[0])

# Interpret
if score >= 0.7:
    print("High Compatibility ❤️")
elif score >= 0.4:
    print("Moderate Compatibility 💛")
else:
    print("Low Compatibility 💔")
```

## Visualizations

### ROC Curves

![ROC Curves](figures/roc_curves.png)

### Feature Importance

![Feature Importance](figures/feature_importance.png)

### SHAP Summary

![SHAP Summary](figures/shap_summary.png)

### SHAP Dependence (Top 6 Features)

![SHAP Dependence](figures/shap_dependence.png)

### Confusion Matrix

![Confusion Matrix](figures/confusion_matrix.png)

### Prediction Distribution

![Probability Distribution](figures/probability_distribution.png)

## Literature Basis

| Paper | Contribution |
|-------|-------------|
| Grinsztajn et al. (NeurIPS 2022), *"Why do tree-based models still outperform deep learning on typical tabular data?"* | Validated XGBoost/LightGBM as SOTA for tabular data with <100K rows |
| Fisman et al. (QJE 2006), *"Gender Differences in Mate Selection: Evidence from a Speed Dating Experiment"* | Original speed dating experiment; ~70% accuracy with logistic regression |
| Gorishniy et al. (NeurIPS 2021), *"Revisiting Deep Learning Models for Tabular Data"* | FT-Transformer architecture for tabular data; found that gradient-boosted trees remain a strong baseline |
| Savcisens et al. (Nature Computational Science 2024), *"Using Sequences of Life-events to Predict Human Lives"* | life2vec: longitudinal life-event prediction; architecture reusable for dyadic temporal modeling |

## Limitations

- **Short-term proxy**: The training data captures initial match decisions (4-minute speed dates), not long-term relationship outcomes. The model predicts initial compatibility, which is a proxy for, but not equivalent to, relationship longevity.
- **Sample demographics**: Columbia University students (2002-2004); results may not generalize to other demographics or cultures.
- **Static features only**: No temporal or interaction data. Adding communication patterns, life events, or behavioral signals would likely improve longevity prediction (see the life2vec approach).
- **Class imbalance**: With a 17.7% match rate, the model is more reliable at predicting rejections than matches. At the reported 56.8% precision, about 43% of predicted matches are false positives.

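
The class-imbalance point above can be checked with a little arithmetic on the figures reported in this README. The sketch below is illustrative only: the variable names are made up for the example, and the `(1 - p) / p` ratio is the commonly recommended starting value for XGBoost's `scale_pos_weight` (negatives divided by positives), not necessarily the value used by the training script.

```python
# Figures taken from this README; variable names are illustrative.
positive_rate = 0.177  # share of encounters that ended in a match
precision = 0.568      # from the Performance Metrics table
recall = 0.720         # from the Performance Metrics table

# Common heuristic for scale_pos_weight: negatives / positives.
# Assumption: the actual training script may use a different value.
scale_pos_weight = (1 - positive_rate) / positive_rate

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

# Share of predicted matches that turn out to be false positives.
false_discovery_rate = 1 - precision

print(f"scale_pos_weight ~ {scale_pos_weight:.2f}")
print(f"F1 from precision/recall: {f1:.3f}")
print(f"False positives among predicted matches: {false_discovery_rate:.1%}")
```

The recomputed F1 of 0.635 agrees with the reported value, and the ratio of roughly 4.65 negatives per positive shows why positive predictions carry more uncertainty than rejections.
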
## Files

| File | Description |
|------|-------------|
| `xgboost_model.joblib` | XGBoost classifier (2000 trees) |
| `lightgbm_model.joblib` | LightGBM classifier (2000 trees) |
| `catboost_model.cbm` | CatBoost classifier (2000 iterations) |
| `ensemble_config.json` | Ensemble weights, threshold, feature list, metrics |
| `feature_columns.joblib` | Ordered list of 113 feature column names |
| `race_encoder.joblib` | LabelEncoder for race categories |
| `evaluation_results.csv` | Full evaluation metrics table |
| `feature_importance.csv` | Feature importance rankings from XGB + LGB |
| `predictor.py` | Prediction interface class |
| `train_relationship_predictor.py` | Full training script (reproducible) |
| `figures/` | All visualizations (ROC, SHAP, confusion matrix, etc.) |

## License

Research use. Based on a publicly available academic dataset.

---

*Built with XGBoost, LightGBM, CatBoost, SHAP, and scikit-learn.*