| --- |
| license: cc-by-4.0 |
| library_name: scikit-learn |
| tags: |
| - hackathon |
| - tabular-classification |
| - archetype |
| - alphahack |
| pipeline_tag: tabular-classification |
| --- |
| |
| # AlphaHack Model 1 — Event Regime Classifier |
|
|
| A `GradientBoostingClassifier` (with a logistic-regression fallback, |
| LabelEncoder, and StandardScaler bundled in the same pickle) that |
| predicts which **winner archetype** dominates a given hackathon event. |
|
|
| Companion: [Model 2 — winner predictor](https://huggingface.co/xenosaac/alphahack-models/tree/main/model2-winner-predictor) |
|
|
| ## Model description |
|
|
| Given event-level features (host type, judge composition, prize pool, |
| duration, theme keywords, criteria text, etc.), the classifier predicts |
| which of 5 archetypes describes the event's prior winners: |
|
|
| | Archetype | Train rows | |
| |---|---| |
| | `tech_showoff` | 1,051 | |
| | `empathy_play` | 825 | |
| | `scrappy_utility` | 761 | |
| | `hype_surfer` | 717 | |
| | `narrative_master` | 658 | |
|
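The counts in the table sum to 4,012. As a rough reference point, the sketch below derives the class shares and the scores that two constant guessers would reach on this label distribution. These are computed on the full label set rather than per cross-validation fold, so treat them as approximate.

```python
# Train-row counts copied from the archetype table above.
counts = {
    "tech_showoff": 1051,
    "empathy_play": 825,
    "scrappy_utility": 761,
    "hype_surfer": 717,
    "narrative_master": 658,
}

total = sum(counts.values())
shares = {name: n / total for name, n in counts.items()}

# Score of always guessing the single most frequent archetype.
top1_prior = max(shares.values())

# Score of always guessing the three most frequent archetypes.
top3_prior = sum(sorted(shares.values(), reverse=True)[:3])

for name, share in shares.items():
    print(f"{name:<18} {share:.3f}")
print(f"top-1 prior: {top1_prior:.3f}")
print(f"top-3 prior: {top3_prior:.3f}")
```

The classes are only mildly imbalanced, so a constant guess is a weak but non-trivial baseline for both top-1 and top-3 scoring.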
|
| Use the model to inform **idea-generation strategy** before a hackathon |
| — not to rank individual project submissions (use Model 2 for that). |
|
|
| ## Training data |
|
|
| The model was fit on **4,012 event-level archetype labels** derived from |
| **23,785 winning-project rows** across **101,682 total projects** in |
| the [`xenosaac/alphahack-devpost`](https://huggingface.co/datasets/xenosaac/alphahack-devpost) |
| dataset. Source feature parquet: |
| `data/merged/alphahack_features_v7.parquet` (151 columns post-PII-scrub). |
|
|
| ## Metrics |
|
|
| Evaluated with 5-fold cross-validation using `GroupKFold` grouped by |
| `event_id`, so no event appears in both the train and test split of |
| the same fold. |
|
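A minimal sketch of the grouped split described above, using scikit-learn's `GroupKFold` on synthetic data. The array shapes, the random data, and the `event_id` array are illustrative assumptions, not the actual training set or pipeline code.

```python
import numpy as np
from sklearn.model_selection import GroupKFold

rng = np.random.default_rng(0)

# Synthetic stand-in data: 200 rows, 32 features, 5 classes, up to 50 groups.
X = rng.normal(size=(200, 32))
y = rng.integers(0, 5, size=200)
event_id = rng.integers(0, 50, size=200)  # hypothetical group key per row

cv = GroupKFold(n_splits=5)
for fold, (train_idx, test_idx) in enumerate(cv.split(X, y, groups=event_id)):
    train_events = set(event_id[train_idx])
    test_events = set(event_id[test_idx])
    # The guarantee described above: no event sits on both sides of a fold.
    assert train_events.isdisjoint(test_events)
    print(f"fold {fold}: {len(train_idx)} train rows, {len(test_idx)} test rows")
```

Grouping matters whenever several rows share an event: a plain `KFold` could place rows from one event on both sides of a split and inflate the test score.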
|
| | Metric | GBC | LR baseline | Majority baseline | |
| |---|---|---|---| |
| | Top-1 accuracy | **0.381 ± 0.012** | 0.275 | 0.262 | |
| | Top-3 accuracy | **0.804 ± 0.010** | — | — | |
| | Train accuracy | 0.783 | — | — | |
|
|
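| GBC delivers a **1.45× lift** over the majority baseline on top-1 |
| accuracy (0.381 vs. 0.262). Top-3 accuracy is the practically more |
| useful metric because it feeds the strategy engine's portfolio |
| prompting, but the table reports no top-3 baseline: the **2.92×** |
| figure matches 0.804 / 0.275, which sets GBC top-3 accuracy against |
| the LR top-1 accuracy and is not a like-for-like lift. Always |
| predicting the three most frequent archetypes would score about 0.657 |
| on the label counts above (2,637 / 4,012), a top-3 lift of roughly |
| 1.22×. |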
|
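For readers unfamiliar with the metric, here is a small sketch of how top-k accuracy can be scored from predicted probabilities. It assumes the usual definition (a hit when the true class is among the k highest-probability classes); the probabilities are made-up toy values, and the card does not show its own scoring code.

```python
def top_k_accuracy(proba_rows, true_labels, k):
    """Fraction of rows whose true label is among the k most probable classes."""
    hits = 0
    for proba, truth in zip(proba_rows, true_labels):
        # Class indices sorted by descending probability, truncated to k.
        ranked = sorted(range(len(proba)), key=lambda i: proba[i], reverse=True)[:k]
        hits += truth in ranked
    return hits / len(true_labels)

# Toy probabilities over 5 classes for 4 events (illustrative only).
proba_rows = [
    [0.40, 0.25, 0.15, 0.10, 0.10],  # truth 0: top-1 hit
    [0.10, 0.30, 0.35, 0.15, 0.10],  # truth 1: top-3 hit only
    [0.05, 0.10, 0.20, 0.25, 0.40],  # truth 2: top-3 hit only
    [0.45, 0.30, 0.15, 0.06, 0.04],  # truth 4: miss
]
true_labels = [0, 1, 2, 4]

print(top_k_accuracy(proba_rows, true_labels, k=1))  # 0.25
print(top_k_accuracy(proba_rows, true_labels, k=3))  # 0.75
```

scikit-learn ships an equivalent helper, `sklearn.metrics.top_k_accuracy_score`, which is the more natural choice inside a real evaluation loop.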
|
| ## Top features (GBC importance) |
|
|
| | Feature | Importance | |
| |---|---| |
| | `A06_total_submissions` | 0.291 | |
| | `A11_theme_keywords` | 0.075 | |
| | `A09_num_prize_categories` | 0.068 | |
| | `A05_prize_pool_usd` | 0.064 | |
|
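A table like this can be rebuilt by pairing the column names with the classifier's importances. The sketch below uses the four values from the table as stand-in input. With a loaded bundle, the inputs would presumably be `bundle["feature_cols"]` and `bundle["gbc"].feature_importances_`; the helper name `rank_features` is hypothetical.

```python
def rank_features(feature_cols, importances, top_n=4):
    """Pair each column with its importance and return the top_n, largest first."""
    pairs = sorted(zip(feature_cols, importances), key=lambda p: p[1], reverse=True)
    return pairs[:top_n]

# Values copied from the table above, deliberately out of order.
cols = [
    "A05_prize_pool_usd",
    "A06_total_submissions",
    "A09_num_prize_categories",
    "A11_theme_keywords",
]
imps = [0.064, 0.291, 0.068, 0.075]

top = rank_features(cols, imps)
for name, imp in top:
    print(f"{name:<26} {imp:.3f}")

# scikit-learn normalizes tree-ensemble importances to sum to 1, so the
# remainder is what the features not listed in the table share.
listed = sum(imp for _, imp in top)
print(f"listed share: {listed:.3f}, remaining: {1 - listed:.3f}")
```

Under that normalization, the four listed features account for about half of the total importance, with `A06_total_submissions` alone contributing more than the other three combined.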
|
| ## Loading the model |
|
|
| ```python |
| import joblib |
| import numpy as np |
| |
| bundle = joblib.load("regime_classifier.pkl") |
| gbc = bundle["gbc"] # primary classifier |
| lr = bundle["lr"] # logistic-regression baseline |
| le = bundle["le"] # LabelEncoder for archetype names |
| scaler = bundle["scaler"] # StandardScaler (event features) |
| feature_cols = bundle["feature_cols"] # list of 32 input column names |
| |
| # Score a new event. Replace the placeholder with a dict that maps every |
| # name in feature_cols to a numeric value built from your crawled event. |
| event_features_dict = {...} |
| X_raw = np.array([[event_features_dict[c] for c in feature_cols]]) |
| X = scaler.transform(X_raw) # scale with the bundled scaler |
| proba = gbc.predict_proba(X)[0] # one probability per archetype |
| |
| # Column i of predict_proba corresponds to gbc.classes_[i]; decoding the |
| # indices directly assumes that order matches the LabelEncoder's classes. |
| top3_idx = proba.argsort()[::-1][:3] |
| top3_archetypes = le.inverse_transform(top3_idx) |
| print(list(zip(top3_archetypes, proba[top3_idx]))) |
| ``` |
|
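Building the input row by hand is the step most likely to go wrong, since a missing or misordered column only surfaces later as a `KeyError` or a silently wrong prediction. The sketch below is a hypothetical helper (`build_feature_row` is not part of the bundle) that orders values by `feature_cols` and fails loudly on gaps. The two-column example is illustrative; the real list has 32 names.

```python
def build_feature_row(event_features, feature_cols):
    """Return one row of values ordered like feature_cols, failing loudly on gaps."""
    missing = [c for c in feature_cols if c not in event_features]
    if missing:
        raise KeyError(f"missing features: {missing}")
    # Nested list with shape (1, n_features), as scikit-learn transformers expect.
    return [[event_features[c] for c in feature_cols]]

# Hypothetical two-column example; extra keys in the dict are ignored.
feature_cols = ["A05_prize_pool_usd", "A06_total_submissions"]
event = {"A06_total_submissions": 120, "A05_prize_pool_usd": 5000.0, "extra": 1}

row = build_feature_row(event, feature_cols)
print(row)  # [[5000.0, 120]]
```

The returned row can be passed to `scaler.transform` in place of the hand-built `X_raw` array.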
|
| ## Reproducing this artifact |
|
|
| The full training pipeline is in the open-source companion repo: |
|
|
| ```bash |
| pip install hackalpha |
| hackalpha train-model1 \ |
|     --features data/merged/alphahack_features_v7.parquet \ |
|     --model-out data/models/regime_classifier.pkl \ |
|     --metrics-out data/research/model1_training_metrics.json \ |
|     --report-out data/research/model1_archetype_report.json |
| ``` |
|
|
| The training metrics (`model1_training_metrics.json`) and label |
| distribution (`model1_archetype_report.json`) are included in this |
| HF directory. |
|
|
| ## Known failure modes |
|
|
| - Top-1 accuracy of 38% is well below human-expert level. The |
| product-relevant metric is top-3 (80%), used to prompt a multi-idea |
| portfolio rather than commit to one bet. |
| - Only three test years (2024–2026) are available, which is too few |
| for tight confidence intervals on Model 1's archetype labels. |
| - The label assignment is **heuristic-based** (a project is "tech_showoff" |
| if its rubric scores match a tech-showoff signature), not adjudicated |
| by humans. Label noise is real. |
| |
| ## Limitations |
| |
| - Trained only on Devpost-hosted, English-language hackathons. |
| - Performance on in-person and non-Devpost events is unknown. |
| - Companion Model 2 had a **prospective trial in April 2026 that did |
| not produce a prize**. Use both models as research artifacts, not |
| as a guaranteed winning recipe. |
| |
| ## License |
| |
| CC BY 4.0. |
| |