---
license: cc-by-4.0
library_name: scikit-learn
tags:
- hackathon
- tabular-classification
- archetype
- alphahack
pipeline_tag: tabular-classification
---
AlphaHack Model 1 — Event Regime Classifier
A GradientBoostingClassifier (with a logistic-regression fallback,
LabelEncoder, and StandardScaler bundled in the same pickle) that
predicts which winner archetype dominates a given hackathon event.
Companion: Model 2 — winner predictor
Model description
Given event-level features (host type, judge composition, prize pool, duration, theme keywords, criteria text, etc.), the classifier predicts which of 5 archetypes describes the event's prior winners:
| Archetype | Train rows |
|---|---|
| tech_showoff | 1,051 |
| empathy_play | 825 |
| scrappy_utility | 761 |
| hype_surfer | 717 |
| narrative_master | 658 |
Use the model to inform idea-generation strategy before a hackathon — not to rank individual project submissions (use Model 2 for that).
Training data
The model was fit on 4,012 event-level archetype labels derived from
23,785 winning-project rows across 101,682 total projects in
the xenosaac/alphahack-devpost
dataset. Source feature parquet:
data/merged/alphahack_features_v7.parquet (151 columns post-PII-scrub).
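The card does not spell out how project-level archetypes are rolled up into one label per event. One plausible reading of "dominates" is a majority vote over each event's winning projects. The sketch below assumes exactly that; the `(event_id, archetype)` row layout, the `dominant_archetype` helper, and the alphabetical tie-break are all hypothetical and not taken from the training pipeline.

```python
from collections import Counter, defaultdict


def dominant_archetype(winner_rows):
    """Map each event_id to the most common archetype among its winners.

    winner_rows: iterable of (event_id, archetype) pairs (assumed layout).
    """
    per_event = defaultdict(Counter)
    for event_id, archetype in winner_rows:
        per_event[event_id][archetype] += 1

    labels = {}
    for event_id, counts in per_event.items():
        best = max(counts.values())
        # Assumed tie-break: pick the alphabetically first archetype.
        labels[event_id] = min(a for a, n in counts.items() if n == best)
    return labels


# Toy rows for illustration only; not drawn from the dataset.
rows = [
    ("e1", "tech_showoff"),
    ("e1", "tech_showoff"),
    ("e1", "empathy_play"),
    ("e2", "hype_surfer"),
    ("e2", "narrative_master"),
]
event_labels = dominant_archetype(rows)
print(event_labels)
```

Under this reading, events with evenly split winners are the ones most exposed to label noise, since the tie-break rather than the data decides the label.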
Metrics
5-fold cross-validation, GroupKFold by event_id (no event appears in
both train and test of the same fold).
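The evaluation protocol can be sketched with scikit-learn's `GroupKFold` and `top_k_accuracy_score`. This is a minimal illustration on synthetic data, not the training script: the array shapes, the `groups` stand-in for `event_id`, and the `n_estimators=20` setting are assumptions chosen to keep the example fast.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import top_k_accuracy_score
from sklearn.model_selection import GroupKFold

rng = np.random.default_rng(0)
n_rows, n_features, n_classes = 300, 8, 5
X = rng.normal(size=(n_rows, n_features))      # synthetic features
y = rng.integers(0, n_classes, size=n_rows)    # synthetic archetype ids
groups = rng.integers(0, 60, size=n_rows)      # stand-in for event_id

top1_scores, top3_scores = [], []
for train_idx, test_idx in GroupKFold(n_splits=5).split(X, y, groups):
    # No group (event) may appear on both sides of a fold.
    assert not set(groups[train_idx]) & set(groups[test_idx])

    clf = GradientBoostingClassifier(n_estimators=20, random_state=0)
    clf.fit(X[train_idx], y[train_idx])
    proba = clf.predict_proba(X[test_idx])

    top1_scores.append(
        top_k_accuracy_score(y[test_idx], proba, k=1, labels=clf.classes_)
    )
    top3_scores.append(
        top_k_accuracy_score(y[test_idx], proba, k=3, labels=clf.classes_)
    )

print(f"top-1: {np.mean(top1_scores):.3f} +/- {np.std(top1_scores):.3f}")
print(f"top-3: {np.mean(top3_scores):.3f} +/- {np.std(top3_scores):.3f}")
```

Because the labels here are random, the scores should sit near chance; the point is the fold construction and the two metrics, not the numbers.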
| Metric | GBC | LR baseline | Majority baseline |
|---|---|---|---|
| Top-1 accuracy | 0.381 ± 0.012 | 0.275 | 0.262 |
| Top-3 accuracy | 0.804 ± 0.010 | — | — |
| Train accuracy | 0.783 | — | — |
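GBC delivers a 1.45× lift over the majority baseline on top-1 accuracy (0.381 vs. 0.262). For top-3 accuracy, the more useful metric in practice because it feeds the strategy engine's portfolio prompting, the quoted 2.92× lift corresponds to 0.804 / 0.275, that is, GBC top-3 accuracy against the LR baseline's top-1 accuracy. The table reports no top-3 baseline, so that figure is not a like-for-like comparison.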
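The lift figures can be checked against the two tables above. The sketch below recomputes them and also derives a top-3 majority baseline, which the card does not report. That baseline rests on two assumptions: a constant predictor that always returns the three most frequent archetypes, and "Train rows" counts that reflect the evaluated label distribution. Per-fold frequencies would differ slightly.

```python
# Class counts from the "Train rows" table.
train_rows = {
    "tech_showoff": 1051,
    "empathy_play": 825,
    "scrappy_utility": 761,
    "hype_surfer": 717,
    "narrative_master": 658,
}
total = sum(train_rows.values())
shares = sorted((n / total for n in train_rows.values()), reverse=True)

majority_top1 = shares[0]        # always predict the largest class
majority_top3 = sum(shares[:3])  # assumed: always return the 3 largest classes

# Reported metrics from the table.
gbc_top1, gbc_top3, lr_top1 = 0.381, 0.804, 0.275

print(f"total events:            {total}")
print(f"majority top-1:          {majority_top1:.3f}")
print(f"majority top-3:          {majority_top3:.3f}")
print(f"top-1 lift vs majority:  {gbc_top1 / majority_top1:.2f}x")
print(f"top-3 / LR top-1:        {gbc_top3 / lr_top1:.2f}x")
print(f"top-3 lift vs majority:  {gbc_top3 / majority_top3:.2f}x")
```

On these assumptions, the like-for-like top-3 lift over a frequency-based baseline comes out near 1.22×, considerably smaller than the top-1 lift.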
Top features (GBC importance)
| Feature | Importance |
|---|---|
| A06_total_submissions | 0.291 |
| A11_theme_keywords | 0.075 |
| A09_num_prize_categories | 0.068 |
| A05_prize_pool_usd | 0.064 |
Loading the model
import joblib
bundle = joblib.load("regime_classifier.pkl")
gbc = bundle["gbc"] # primary classifier
lr = bundle["lr"] # logistic-regression baseline
le = bundle["le"] # LabelEncoder for archetype names
scaler = bundle["scaler"] # StandardScaler (event features)
feature_cols = bundle["feature_cols"] # list of 32 input column names
# Score a new event
import numpy as np
event_features_dict = {...} # build from your crawled event
X_raw = np.array([[event_features_dict[c] for c in feature_cols]])
X = scaler.transform(X_raw)
proba = gbc.predict_proba(X)[0]
top3_idx = proba.argsort()[::-1][:3]
top3_archetypes = le.inverse_transform(top3_idx)
print(list(zip(top3_archetypes, proba[top3_idx])))
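In the snippet above, `event_features_dict[c]` raises a bare `KeyError` on the first column that is missing. A small guard can report every missing feature at once and fix the column order before scaling. The `build_feature_row` helper below is hypothetical, the demo values are made up, and the `float` conversion assumes every feature has already been encoded numerically upstream.

```python
def build_feature_row(event_features, feature_cols):
    """Return a 1 x len(feature_cols) nested list in the model's column order.

    Hypothetical helper: raises ValueError naming every missing column.
    Extra keys in event_features are ignored.
    """
    missing = [c for c in feature_cols if c not in event_features]
    if missing:
        raise ValueError(f"missing features: {missing}")
    # Assumes all values are already numeric (text features encoded upstream).
    return [[float(event_features[c]) for c in feature_cols]]


# Demo with three column names from the importance table; values are invented.
demo_cols = [
    "A06_total_submissions",
    "A09_num_prize_categories",
    "A05_prize_pool_usd",
]
demo_event = {
    "A05_prize_pool_usd": 10000,
    "A06_total_submissions": 120,
    "A09_num_prize_categories": 4,
    "unused_key": 1,
}
row = build_feature_row(demo_event, demo_cols)
print(row)  # [[120.0, 4.0, 10000.0]]
```

The returned nested list can be passed to `np.array(...)` and then to `scaler.transform` in place of the inline list comprehension.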
Reproducing this artifact
The full training pipeline is in the open-source companion repo:
pip install hackalpha
hackalpha train-model1 \
--features data/merged/alphahack_features_v7.parquet \
--model-out data/models/regime_classifier.pkl \
--metrics-out data/research/model1_training_metrics.json \
--report-out data/research/model1_archetype_report.json
The training metrics (model1_training_metrics.json) and label
distribution (model1_archetype_report.json) are included in this
HF directory.
Known failure modes
- Top-1 accuracy of 38% is well below human-expert level. The product-relevant metric is top-3 (80%), used to prompt a multi-idea portfolio rather than commit to one bet.
- Only three test years (2024–2026) are available, which is not enough for tight confidence intervals on Model 1's archetype labels.
- The label assignment is heuristic-based (a project is "tech_showoff" if its rubric scores match a tech-showoff signature), not adjudicated by humans. Label noise is real.
Limitations
- Trained only on Devpost-hosted, English-language hackathons.
- In-person and non-Devpost events: performance unknown.
- Companion model 2 had a prospective trial in April 2026 that did not produce a prize. Use both models as research artifacts, not as a guaranteed winning recipe.
License
CC BY 4.0.