---
license: mit
pipeline_tag: object-detection
library_name: ultralytics
datasets:
- LogoDet-3K
tags:
- yolo
- yolo11
- logo-detection
- logodet-3k
- brandspotter
- brand-detection
- sports-broadcasting
- dooh
- computer-vision
---

# BrandSpotter: Logo Detection and Brand Identification for Sports Broadcasting

BrandSpotter is a multi-stage computer vision pipeline built to detect and
identify brand logos in broadcast video, with applications in sponsor visibility
measurement and digital out-of-home (DOOH) advertising analytics.

**Problem:** Broadcasters and sponsors need to quantify how often, and how
clearly, brand logos appear on screen during live sports. This requires
detecting logos at broadcast speed, classifying them by brand, and handling
real-world challenges like motion blur, partial occlusion, camera angle
variation, and lighting washout.

**Approach:** A three-stage pipeline:

1. **YOLO11m** for single-class logo region detection (this repo)
2. **ResNet50** for brand classification with open-set rejection (coming soon)
3. **Frame-level analytics** for dwell-time measurement and visibility scoring
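
As a rough illustration of stage 3, the sketch below merges the frame indices where a brand was detected into dwell-time segments. It is not the project's implementation: the function names, the `max_gap` tolerance for briefly missed detections, and the 25 fps default are all assumptions made for this example.

```python
from typing import Iterable, List, Tuple


def dwell_segments(
    frames_with_logo: Iterable[int],
    fps: float = 25.0,  # assumed frame rate, not taken from the project
    max_gap: int = 2,   # assumed number of missed frames tolerated inside a segment
) -> List[Tuple[float, float]]:
    """Merge frame indices with a detection into (start_s, end_s) segments."""
    frames = sorted(set(frames_with_logo))
    if not frames:
        return []
    segments = []
    start = prev = frames[0]
    for frame in frames[1:]:
        # More than max_gap frames missing between detections: close the segment.
        if frame - prev > max_gap + 1:
            segments.append((start / fps, (prev + 1) / fps))
            start = frame
        prev = frame
    segments.append((start / fps, (prev + 1) / fps))
    return segments


def total_dwell_seconds(segments: List[Tuple[float, float]]) -> float:
    """Sum the duration of all segments, in seconds."""
    return sum(end - start for start, end in segments)


segs = dwell_segments([0, 1, 2, 5, 6, 50, 51], fps=25.0, max_gap=2)
print(segs, total_dwell_seconds(segs))
```

Frames 3 and 4 are missing but fall within `max_gap`, so frames 0 to 6 form one segment, while the jump to frame 50 starts a second one.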

Source code: [github.com/daa2618/brandspotter](https://github.com/daa2618/brandspotter)

## Models

### YOLO11m: Logo Detection (`yolo/`)

Fine-tuned YOLO11m for single-class logo detection on
[LogoDet-3K](https://github.com/Wangjing1551/LogoDet-3K-Dataset).

| Metric         | Value     |
|----------------|-----------|
| mAP@0.5        | **0.894** |
| mAP@0.5:0.95   | **0.639** |
| Precision      | 0.829     |
| Recall         | 0.863     |
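
Precision and recall can be folded into a single F1 score, which the table does not report. The snippet below derives it from the two values above; the result is arithmetic on the published numbers, not a separately measured metric.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


f1 = f1_score(0.829, 0.863)  # precision and recall from the table above
print(f"F1 = {f1:.3f}")      # F1 = 0.846
```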

**Training configuration:**

- Base model: `yolo11m.pt` (COCO-pretrained)
- Epochs: 50 (best checkpoint at epoch 47)
- Image size: 640x640
- Optimizer: AdamW (auto-selected)
- Learning rate: 0.001
- Batch size: auto
- Hardware: Google Colab T4 GPU (~2 hours)
- Dataset: LogoDet-3K (158,652 images, 3,000 classes collapsed to single "logo" class)
- Augmentation: mosaic, RandAugment, erasing (0.4), horizontal flip (0.5)
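
The settings above map roughly onto an Ultralytics training call like the one sketched below. This is a reconstruction rather than the author's script: the authoritative values are in `yolo/args.yaml`, the dataset file name `logodet3k.yaml` is a hypothetical placeholder, and `single_cls=True` is only one way the classes might have been collapsed (the labels could equally have been rewritten before training).

```python
def build_train_args(data_yaml: str = "logodet3k.yaml") -> dict:
    """Training arguments mirroring the configuration list (a sketch, not the original)."""
    return {
        "data": data_yaml,      # hypothetical dataset config path
        "epochs": 50,
        "imgsz": 640,
        "batch": -1,            # -1 asks Ultralytics to pick a batch size automatically
        "optimizer": "auto",    # lets Ultralytics choose the optimizer
        "lr0": 0.001,
        "single_cls": True,     # assumption: train all classes as one "logo" class
        "mosaic": 1.0,          # assumed strength; the list only says mosaic was used
        "auto_augment": "randaugment",
        "erasing": 0.4,
        "fliplr": 0.5,
    }


def train(data_yaml: str = "logodet3k.yaml"):
    """Fine-tune the COCO-pretrained YOLO11m weights with the arguments above."""
    # Imported lazily so build_train_args can be inspected without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO("yolo11m.pt")
    return model.train(**build_train_args(data_yaml))
```

Calling `train()` needs a GPU-backed environment and a dataset config that points at YOLO-format images and labels.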

**Design rationale:** A single-class detector is trained to find anything that
looks like a logo, which favours recall across all logo types and leaves
brand-specific classification to the downstream ResNet stage. Because the
detector never learns brand identities, it can localise logos of brands absent
from its training data, so new brands can be handled without retraining it.
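
Collapsing the 3,000 brand classes into one amounts to rewriting the class index in each annotation. The sketch below assumes labels already in YOLO text format (`class x_center y_center width height` per line); converting the dataset's native annotations is not shown, and the helper name `collapse_to_single_class` is made up for illustration.

```python
def collapse_to_single_class(label_text: str, target_class: int = 0) -> str:
    """Rewrite every YOLO label line so its class index becomes target_class."""
    out_lines = []
    for line in label_text.splitlines():
        parts = line.split()
        if len(parts) < 5:
            continue  # skip blank or malformed lines
        out_lines.append(" ".join([str(target_class)] + parts[1:]))
    return "\n".join(out_lines)


# Two boxes with different brand indices become two boxes of class 0.
example = "1742 0.50 0.40 0.20 0.10\n12 0.25 0.75 0.10 0.05"
print(collapse_to_single_class(example))
```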

### ResNet50: Brand Classification (`resnet/`)

*Coming soon.* A fine-tuned ResNet50 classifier trained on a curated
brand dictionary with open-set rejection for unknown or novel logos.
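
The classifier is not yet released, so its rejection rule is unknown. One common baseline for open-set rejection is to threshold the maximum softmax probability. The sketch below shows that baseline only, with a made-up threshold of 0.7 and placeholder brand names, and should not be read as the planned method.

```python
import math
from typing import List, Optional, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores to probabilities."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_or_reject(
    logits: Sequence[float],
    labels: Sequence[str],
    threshold: float = 0.7,  # assumed value; would need calibration on held-out data
) -> Optional[str]:
    """Return the top label, or None when the classifier is not confident enough."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best] if probs[best] >= threshold else None


labels = ["brand_a", "brand_b", "brand_c"]  # placeholder names
print(classify_or_reject([4.0, 0.5, 0.2], labels))  # confident: a label is returned
print(classify_or_reject([1.0, 0.9, 0.8], labels))  # near-uniform: rejected as unknown
```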

## Quick Start

```python
from ultralytics import YOLO

# Load the weights directly from the Hugging Face Hub
model = YOLO("hf://vectorized-dev/brandspotter/yolo/best.pt")

# Run inference on a single image and display the detections
results = model("path/to/image.jpg")
results[0].show()
```
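
To hand detections to a downstream classifier, the boxes first have to be turned into crop regions. The sketch below assumes boxes as `[x1, y1, x2, y2]` pixel lists, which is what `results[0].boxes.xyxy.tolist()` should give in Ultralytics. The 10% padding and the helper name `padded_crop_box` are illustrative choices, not part of this repository.

```python
from typing import Sequence, Tuple


def padded_crop_box(
    box: Sequence[float],
    image_size: Tuple[int, int],  # (width, height) in pixels
    pad_frac: float = 0.1,        # assumed margin around the logo
) -> Tuple[int, int, int, int]:
    """Expand a box by pad_frac on each side and clamp it to the image bounds."""
    x1, y1, x2, y2 = box
    width, height = image_size
    pad_x = (x2 - x1) * pad_frac
    pad_y = (y2 - y1) * pad_frac
    left = max(0, int(round(x1 - pad_x)))
    top = max(0, int(round(y1 - pad_y)))
    right = min(width, int(round(x2 + pad_x)))
    bottom = min(height, int(round(y2 + pad_y)))
    return left, top, right, bottom


print(padded_crop_box([100.0, 50.0, 200.0, 100.0], (640, 360)))  # (90, 45, 210, 105)
```

The returned tuple follows the `(left, upper, right, lower)` order that `PIL.Image.Image.crop` expects.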

## Repository Contents

```
yolo/
  best.pt        # Trained weights (best checkpoint, ~39 MB)
  args.yaml      # Full training arguments
  results.csv    # Per-epoch training metrics
```
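
`results.csv` can be used to check which epoch scored best. The sketch below assumes the usual Ultralytics column names (`epoch`, `metrics/mAP50(B)`, `metrics/mAP50-95(B)`) and ranks epochs by mAP@0.5:0.95 alone. Ultralytics selects its best checkpoint with its own fitness score, which may weight the metrics differently, and column names can vary between versions.

```python
import csv
import io


def best_epoch(csv_text: str, metric: str = "metrics/mAP50-95(B)"):
    """Return (epoch, value) for the row with the highest value of metric."""
    reader = csv.DictReader(io.StringIO(csv_text))
    best = None
    for row in reader:
        # Some Ultralytics versions pad headers and cells with spaces.
        clean = {key.strip(): value.strip() for key, value in row.items()}
        value = float(clean[metric])
        if best is None or value > best[1]:
            best = (int(float(clean["epoch"])), value)
    return best


# Synthetic rows for illustration only; these are not the repository's metrics.
sample = (
    "epoch, metrics/mAP50(B), metrics/mAP50-95(B)\n"
    "1, 0.61, 0.40\n"
    "2, 0.70, 0.48\n"
    "3, 0.69, 0.47\n"
)
print(best_epoch(sample))  # (2, 0.48)
```

For the real file, pass the contents of `yolo/results.csv`, for example `best_epoch(open("yolo/results.csv").read())`.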

## Roadmap

- [ ] ResNet50 brand classifier weights and evaluation
- [ ] Open-set rejection threshold calibration
- [ ] End-to-end inference script (detect + classify + dwell-time)
- [ ] Sample results on sports broadcast footage
- [ ] Dataset card for curated brand dictionary

## Dataset

[LogoDet-3K](https://github.com/Wangjing1551/LogoDet-3K-Dataset) (Wang et al.,
ACM TOMM 2022) contains 158,652 images across 3,000 logo classes. The detection
model treats all logos as a single class for region proposal; brand
identification is handled downstream.

## Citation

```bibtex
@article{wang2022logodet3k,
  title={LogoDet-3K: A Large-scale Image Dataset for Logo Detection},
  author={Wang, Jing and Min, Weiqing and Hou, Sujuan and Ma, Shengnan and Zheng, Yuanjie and Jiang, Shuqiang},
  journal={ACM Transactions on Multimedia Computing, Communications, and Applications},
  volume={18},
  number={3},
  year={2022},
  publisher={ACM}
}
```

## License

MIT