Football Match Result Predictor

A PyTorch MLP classifier that predicts the outcome of international football matches: Away Win, Draw, or Home Win.

Model Details

Field	Value
Author	Venus3009
Framework	PyTorch
Task	Multi-class classification (3 classes)
Test Accuracy	55.77%
Majority-class baseline	49.0%
Random baseline	33.3%
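
Both baselines in the table follow from the label distribution alone. The sketch below shows the arithmetic on a made-up toy label list, not the real test set; the helper names `majority_baseline` and `random_baseline` are hypothetical, and it assumes the majority-class figure is measured on the evaluation labels.

```python
from collections import Counter

def majority_baseline(labels):
    """Accuracy of always predicting the most frequent class in `labels`."""
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)

def random_baseline(n_classes):
    """Expected accuracy of uniform random guessing over `n_classes` classes."""
    return 1.0 / n_classes

# Toy labels for illustration only (0 = Away Win, 1 = Draw, 2 = Home Win).
toy_labels = [2, 2, 0, 1, 2, 0, 2, 1, 2, 0]
print(f"majority baseline: {majority_baseline(toy_labels):.1%}")
print(f"random baseline:   {random_baseline(3):.1%}")
```

A model is only useful to the extent that it beats the larger of the two numbers, which here is the majority-class baseline.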

Dataset

International football results 1872–2026

49,071 international matches
Date range: 1872-11-30 → 2026-01-26
Source: Kaggle (martj42)

Architecture

Input (11) → Linear(256) → BatchNorm → ReLU → Dropout(0.3)
           → Linear(128) → BatchNorm → ReLU → Dropout(0.2)
           → Linear(64)  → ReLU
           → Linear(3)   → logits (softmax is applied only at inference)

Parameters: 45,187
Optimizer: Adam (lr=1e-3, weight_decay=1e-4)
Loss: CrossEntropyLoss
Scheduler: ReduceLROnPlateau (patience=3)
Epochs: 30
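
The layer sizes above are enough to check the stated parameter count and to sketch the training setup. The snippet below rebuilds the network as a plain `nn.Sequential` and runs a single optimization step on random data. The batch size of 32, the random batch, and the choice to have the scheduler monitor a loss in `mode="min"` are assumptions for illustration, not details taken from the original training script.

```python
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Linear(11, 256), nn.BatchNorm1d(256), nn.ReLU(), nn.Dropout(0.3),
    nn.Linear(256, 128), nn.BatchNorm1d(128), nn.ReLU(), nn.Dropout(0.2),
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, 3),  # raw logits; CrossEntropyLoss applies log-softmax itself
)

# Count trainable parameters (BatchNorm running statistics are buffers, not parameters).
n_params = sum(p.numel() for p in net.parameters() if p.requires_grad)
print(n_params)  # 45187

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3, weight_decay=1e-4)
# Assumption: the scheduler watches a loss value, hence mode="min".
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min", patience=3)

# One illustrative step on random data (a stand-in for a real mini-batch).
x = torch.randn(32, 11)
y = torch.randint(0, 3, (32,))
net.train()
optimizer.zero_grad()
loss = criterion(net(x), y)
loss.backward()
optimizer.step()
scheduler.step(loss.item())  # in practice, pass the validation loss once per epoch
```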

Features (11 total)

Feature	Description
`home_enc`	Label-encoded home team name
`away_enc`	Label-encoded away team name
`tour_enc`	Label-encoded tournament name
`neutral_n`	1 if played at a neutral venue, 0 otherwise
`year`	Match year (captures long-term team evolution)
`home_winrate`	Home team win rate over last 20 games
`home_gf`	Home team avg goals scored over last 20 games
`home_ga`	Home team avg goals conceded over last 20 games
`away_winrate`	Away team win rate over last 20 games
`away_gf`	Away team avg goals scored over last 20 games
`away_ga`	Away team avg goals conceded over last 20 games

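All features are standardized (zero mean, unit variance) with StandardScaler before training. The same fitted scaler must be applied to new inputs at inference time.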
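
The six rolling features can be computed in a single chronological pass. The following is a minimal sketch, not the author's preprocessing code: the function name `rolling_form_features`, the input keys (`home_team`, `away_team`, `home_score`, `away_score`), and the choice to emit zeros for a team with no prior matches are all assumptions. The important detail is that each team's history is updated only after its features for the current match have been read, so a match never contributes to its own inputs.

```python
from collections import defaultdict, deque

WINDOW = 20  # "last 20 games", as stated in the feature table

def rolling_form_features(matches, window=WINDOW):
    """Return one feature dict per match, using only earlier matches.

    `matches` must already be sorted chronologically.
    """
    # team -> most recent (won, goals_for, goals_against) tuples
    history = defaultdict(lambda: deque(maxlen=window))

    def summarize(team):
        games = history[team]
        if not games:
            return 0.0, 0.0, 0.0  # assumption: zeros when a team has no history
        n = len(games)
        return (sum(g[0] for g in games) / n,
                sum(g[1] for g in games) / n,
                sum(g[2] for g in games) / n)

    rows = []
    for m in matches:
        home, away = m["home_team"], m["away_team"]
        hs, as_ = m["home_score"], m["away_score"]
        h_wr, h_gf, h_ga = summarize(home)
        a_wr, a_gf, a_ga = summarize(away)
        rows.append({
            "home_winrate": h_wr, "home_gf": h_gf, "home_ga": h_ga,
            "away_winrate": a_wr, "away_gf": a_gf, "away_ga": a_ga,
        })
        # Update histories only after the features were read (no leakage).
        history[home].append((int(hs > as_), hs, as_))
        history[away].append((int(as_ > hs), as_, hs))
    return rows

# Toy fixtures for illustration only.
toy = [
    {"home_team": "A", "away_team": "B", "home_score": 2, "away_score": 1},
    {"home_team": "A", "away_team": "C", "home_score": 0, "away_score": 0},
    {"home_team": "B", "away_team": "A", "home_score": 3, "away_score": 1},
]
features = rolling_form_features(toy)
print(features[2])
```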

Output Classes

Index	Label	Meaning
0	Away Win	Away team wins
1	Draw	Match ends in a draw
2	Home Win	Home team wins
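
For reference, a target label matching this table can be derived from the final score as follows. The function name `encode_result` is hypothetical; only the index-to-label mapping comes from the table above.

```python
LABELS = ["Away Win", "Draw", "Home Win"]  # index order from the table above

def encode_result(home_score: int, away_score: int) -> int:
    """Map a final score to the class index used by the model."""
    if home_score > away_score:
        return 2  # Home Win
    if home_score < away_score:
        return 0  # Away Win
    return 1      # Draw

print(LABELS[encode_result(3, 1)])
```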

Usage

import torch
import torch.nn as nn
import numpy as np
from sklearn.preprocessing import StandardScaler

class FootballPredictor(nn.Module):
    def __init__(self, n_features=11):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 256), nn.BatchNorm1d(256), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(256, 128),        nn.BatchNorm1d(128), nn.ReLU(), nn.Dropout(0.2),
            nn.Linear(128, 64),         nn.ReLU(),
            nn.Linear(64, 3)
        )
    def forward(self, x):
        return self.net(x)

# Load model
model = FootballPredictor()
model.load_state_dict(torch.load('football_predictor.pth', map_location='cpu'))
model.eval()

def predict_match(features_scaled):
    """
    features_scaled: numpy array of shape (1, 11), already StandardScaler-transformed
    with the scaler that was fitted on the training data.
    Prints the class probabilities and returns (predicted_label, probs).
    """
    labels = ['Away Win', 'Draw', 'Home Win']
    with torch.no_grad():
        x = torch.tensor(features_scaled, dtype=torch.float32)
        # The network outputs raw logits; softmax converts them to probabilities.
        probs = torch.softmax(model(x), dim=1)[0].numpy()

    for label, prob in zip(labels, probs):
        print(f"  {label:<12} {prob*100:.1f}%")
    predicted = labels[int(probs.argmax())]
    print(f"  → Prediction: {predicted}")
    return predicted, probs
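
`predict_match` expects an input that has already been scaled. The sketch below shows how such a row might be prepared. It is illustrative only: the training matrix is random placeholder data, the values in `raw_row` are made up, and in real use the scaler would have to be the one fitted on the actual training features, which this card does not state is distributed with the weights.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Placeholder for the real training features: 500 rows x 11 columns of random numbers.
rng = np.random.default_rng(0)
X_train_raw = rng.normal(size=(500, 11))

# Fit the scaler once on training features, then reuse it for every new match.
scaler = StandardScaler().fit(X_train_raw)

# One new match, in the feature order of the table above (all values made up):
# home_enc, away_enc, tour_enc, neutral_n, year,
# home_winrate, home_gf, home_ga, away_winrate, away_gf, away_ga
raw_row = np.array([[12, 47, 3, 0, 2024, 0.55, 1.6, 0.9, 0.40, 1.2, 1.3]])

features_scaled = scaler.transform(raw_row)
print(features_scaled.shape)  # (1, 11)
```

The resulting `features_scaled` array has the shape that `predict_match` documents as its input.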

Training Split

Method: Temporal (chronological order preserved — no random shuffle)
Train: first 80% of matches (up to ~2019)
Test: last 20% of matches (2019–2026)

Temporal splitting is critical here: a random split would leak future matches into training and inflate accuracy.
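
A chronological 80/20 split can be sketched in a few lines. The helper name `temporal_split` and the `date` key are assumptions for illustration. Note that cutting purely by position can place matches played on the same day on both sides of the boundary.

```python
from datetime import date

def temporal_split(rows, train_frac=0.8):
    """Sort rows by date and cut once; nothing is shuffled."""
    ordered = sorted(rows, key=lambda r: r["date"])
    cut = int(len(ordered) * train_frac)
    return ordered[:cut], ordered[cut:]

# Toy rows for illustration only, deliberately out of order.
toy_rows = [{"date": date(year, 6, 1)} for year in (2015, 2011, 2019, 2010, 2013,
                                                     2018, 2012, 2016, 2014, 2017)]
train, test = temporal_split(toy_rows)
print(len(train), len(test))
print(train[-1]["date"], "<", test[0]["date"])
```

Every training match precedes every test match, which is what keeps the rolling features and team encodings from seeing the future.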

Limitations

No player-level data (injuries, suspensions, squad strength)
No ELO / FIFA ranking features
Team encodings are static — the same team in 1950 and 2026 is treated identically
Class imbalance (~49% home wins) is not corrected during training
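
As an illustration of the last point, one common way to counter class imbalance is to pass inverse-frequency class weights to the loss. This is not part of the released model; the helper name `inverse_frequency_weights` and the toy labels are hypothetical, and the formula assumes every class occurs at least once (a zero count would yield an infinite weight).

```python
import torch
import torch.nn as nn

def inverse_frequency_weights(labels, n_classes=3):
    """Weight each class by total / (n_classes * class_count)."""
    counts = torch.bincount(torch.as_tensor(labels), minlength=n_classes).float()
    return counts.sum() / (n_classes * counts)

# Toy training labels for illustration only (0 = Away Win, 1 = Draw, 2 = Home Win).
toy_labels = [2, 2, 2, 2, 0, 0, 1, 1]
weights = inverse_frequency_weights(toy_labels)
print(weights)

# Rarer classes now contribute more to the loss than the majority class.
criterion = nn.CrossEntropyLoss(weight=weights)
```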
