BrandSpotter: Logo Detection and Brand Identification for Sports Broadcasting

BrandSpotter is a multi-stage computer vision pipeline built to detect and identify brand logos in broadcast video, with applications in sponsor visibility measurement and digital out-of-home (DOOH) advertising analytics.

Problem: Broadcasters and sponsors need to quantify how often, and how clearly, brand logos appear on screen during live sports. This requires detecting logos at broadcast speed, classifying them by brand, and handling real-world challenges like motion blur, partial occlusion, camera angle variation, and lighting washout.

Approach: A three-stage pipeline:

1. YOLO11m for single-class logo region detection (this repo)
2. ResNet50 for brand classification with open-set rejection (coming soon)
3. Frame-level analytics for dwell-time measurement and visibility scoring
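
The analytics stage is not published yet, so the following is only a minimal sketch of how dwell time could be derived from per-frame results. It assumes the earlier stages yield, for each frame, the set of brand labels visible in it. The function name `dwell_times` and the brand names are hypothetical, and visibility scoring (size, position, clarity) is left out.

```python
from collections import defaultdict


def dwell_times(frames, fps):
    """Return {brand: (total_seconds, longest_run_seconds)}.

    frames: list where each item is the set of brand labels visible in that frame.
    fps: frames per second of the source video.
    """
    total = defaultdict(int)    # number of frames in which each brand appears
    longest = defaultdict(int)  # longest consecutive run, in frames
    current = defaultdict(int)  # length of the run currently in progress

    for visible in frames:
        # A brand missing from this frame ends its current run.
        for brand in list(current):
            if brand not in visible:
                current[brand] = 0
        for brand in visible:
            total[brand] += 1
            current[brand] += 1
            longest[brand] = max(longest[brand], current[brand])

    return {b: (total[b] / fps, longest[b] / fps) for b in total}


# Hypothetical four-frame clip sampled at 2 frames per second.
frames = [{"acme"}, {"acme", "globex"}, set(), {"acme"}]
print(dwell_times(frames, fps=2.0))
```

Reporting the longest continuous run alongside the total separates a logo that stays on screen from one that only flickers in and out of view.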

Source code: github.com/daa2618/brandspotter

Models

YOLO11m: Logo Detection (`yolo/`)

Fine-tuned YOLO11m for single-class logo detection on LogoDet-3K.

Metric	Value
mAP@0.5	0.894
mAP@0.5:0.95	0.639
Precision	0.829
Recall	0.863
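
For a single summary number, precision and recall can be combined into an F1 score (their harmonic mean). This figure is derived from the two values in the table, not reported separately by the training run.

```python
precision = 0.829
recall = 0.863

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(f"F1 = {f1:.3f}")
```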

Training configuration:

Base model: yolo11m.pt (COCO-pretrained)
Epochs: 50 (best checkpoint at epoch 47)
Image size: 640x640
Optimizer: AdamW (auto-selected)
Learning rate: 0.001
Batch size: auto
Hardware: Google Colab T4 GPU (~2 hours)
Dataset: LogoDet-3K (158,652 images, 3,000 classes collapsed to a single "logo" class)
Augmentation: mosaic, RandAugment, erasing (0.4), horizontal flip (0.5)
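
As a rough sketch, the configuration above maps onto Ultralytics training arguments roughly as follows. The dataset YAML name is hypothetical and the mosaic strength is an assumed default; `args.yaml` in this repo remains the authoritative record. The learning rate is not set explicitly because Ultralytics chooses it together with the optimizer when `optimizer="auto"`.

```python
# Hypothetical dataset config path; see args.yaml for the values actually used.
DATA_YAML = "logodet3k_single_class.yaml"

TRAIN_ARGS = {
    "data": DATA_YAML,
    "epochs": 50,
    "imgsz": 640,
    "optimizer": "auto",            # the run reports AdamW as the auto-selected optimizer
    "batch": -1,                    # -1 asks Ultralytics to pick a batch size automatically
    "mosaic": 1.0,                  # assumed strength; the config only states mosaic was enabled
    "auto_augment": "randaugment",
    "erasing": 0.4,
    "fliplr": 0.5,
}


def train(args=TRAIN_ARGS):
    """Fine-tune the COCO-pretrained YOLO11m checkpoint with the arguments above."""
    # Imported lazily so the config can be inspected without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO("yolo11m.pt")
    return model.train(**args)
```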

Design rationale: A single-class detector is intended to maximise recall across all logo types, leaving brand-specific classification to the downstream ResNet stage. Because the detector learns what a logo region looks like rather than which brand it belongs to, it should generalise to unseen brands without retraining.
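
Collapsing the 3,000 brand classes to one amounts to rewriting the class id in every annotation. The sketch below assumes the annotations have already been converted to YOLO text format (`class x_center y_center width height`, normalised); the conversion actually used for this model is not documented here, and `collapse_to_single_class` is a hypothetical helper.

```python
def collapse_to_single_class(label_text):
    """Rewrite every YOLO label row so its class id is 0 ("logo")."""
    rows = []
    for line in label_text.splitlines():
        parts = line.split()
        if len(parts) < 5:  # skip blank or malformed rows
            continue
        rows.append(" ".join(["0"] + parts[1:]))
    return "\n".join(rows)


# Two boxes from different (made-up) brand ids become two "logo" boxes.
example = "1234 0.50 0.50 0.20 0.10\n87 0.25 0.75 0.10 0.05"
print(collapse_to_single_class(example))
```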

ResNet50: Brand Classification (`resnet/`)

Coming soon. A fine-tuned ResNet50 classifier trained on a curated brand dictionary with open-set rejection for unknown or novel logos.
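
The rejection method has not been published. One common baseline, shown here purely as an illustration, is to threshold the maximum softmax probability and label anything below it as unknown. The threshold value, function name, and brand labels are all assumptions, not the planned implementation.

```python
import math


def classify_with_rejection(logits, labels, threshold=0.7):
    """Return (label, confidence), or ("unknown", confidence) below the threshold."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "unknown", probs[best]
    return labels[best], probs[best]


labels = ["acme", "globex", "initech"]  # hypothetical brand dictionary
print(classify_with_rejection([4.0, 0.5, 0.2], labels))  # one clear winner
print(classify_with_rejection([1.0, 0.9, 1.1], labels))  # near-uniform scores
```

In practice the threshold would be calibrated on held-out known and unknown logos, which matches the calibration item in the roadmap.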

Quick Start

```python
from ultralytics import YOLO

# Load directly from HuggingFace
model = YOLO("hf://vectorized-dev/brandspotter/yolo/best.pt")

# Run inference
results = model("path/to/image.jpg")
results[0].show()
```
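
To hand detections to a downstream classifier, each box can be expanded slightly and clamped to the image before cropping. In Ultralytics, `results[0].boxes.xyxy` holds the boxes in corner format. The helper below is a hypothetical sketch that works on plain tuples, and the 10% padding is an assumed value.

```python
def padded_crop_box(box, image_size, pad=0.1):
    """Expand an (x1, y1, x2, y2) box by `pad` of its size, clamped to the image."""
    x1, y1, x2, y2 = box
    width, height = image_size
    dx = (x2 - x1) * pad
    dy = (y2 - y1) * pad
    return (
        max(0, int(x1 - dx)),
        max(0, int(y1 - dy)),
        min(width, int(x2 + dx)),
        min(height, int(y2 + dy)),
    )


# A 200x100 box in a 640x360 frame grows by 20 px horizontally and 10 px vertically.
print(padded_crop_box((100, 50, 300, 150), image_size=(640, 360)))
```

A little padding keeps logo edges that a tight box might clip, which can matter for the classifier.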

Repository Contents

```
yolo/
  best.pt       # Trained weights (best checkpoint, ~39 MB)
  args.yaml     # Full training arguments
  results.csv   # Per-epoch training metrics
```

Roadmap

ResNet50 brand classifier weights and evaluation
Open-set rejection threshold calibration
End-to-end inference script (detect + classify + dwell-time)
Sample results on sports broadcast footage
Dataset card for curated brand dictionary

Dataset

LogoDet-3K (Wang et al., ACM TOMM 2022). 158,652 images across 3,000 logo classes. The detection model treats all logos as a single class for region proposal; brand identification is handled downstream.

Citation

```bibtex
@article{wang2022logodet3k,
  title={LogoDet-3K: A Large-scale Image Dataset for Logo Detection},
  author={Wang, Jing and Min, Weiqing and Hou, Sujuan and Ma, Shengnan and Zheng, Yuanjie and Jiang, Shuqiang},
  journal={ACM Transactions on Multimedia Computing, Communications, and Applications},
  volume={18},
  number={3},
  year={2022},
  publisher={ACM}
}
```

License

MIT
