---
license: mit
pipeline_tag: object-detection
library_name: ultralytics
datasets:
- LogoDet-3K
tags:
- yolo
- yolo11
- logo-detection
- logodet-3k
- brandspotter
- brand-detection
- sports-broadcasting
- dooh
- computer-vision
---

# BrandSpotter: Logo Detection and Brand Identification for Sports Broadcasting

BrandSpotter is a multi-stage computer vision pipeline built to detect and identify brand logos in broadcast video, with applications in sponsor visibility measurement and digital out-of-home (DOOH) advertising analytics.

**Problem:** Broadcasters and sponsors need to quantify how often, and how clearly, brand logos appear on screen during live sports. This requires detecting logos at broadcast speed, classifying them by brand, and handling real-world challenges like motion blur, partial occlusion, camera angle variation, and lighting washout.

**Approach:** A three-stage pipeline:

1. **YOLO11m** for single-class logo region detection (this repo)
2. **ResNet50** for brand classification with open-set rejection (coming soon)
3. **Frame-level analytics** for dwell-time measurement and visibility scoring

Source code: [github.com/daa2618/brandspotter](https://github.com/daa2618/brandspotter)

## Models

### YOLO11m: Logo Detection (`yolo/`)

Fine-tuned YOLO11m for single-class logo detection on [LogoDet-3K](https://github.com/Wangjing1551/LogoDet-3K-Dataset).

| Metric         | Value     |
|----------------|-----------|
| mAP@0.5        | **0.894** |
| mAP@0.5:0.95   | **0.639** |
| Precision      | 0.829     |
| Recall         | 0.863     |

**Training configuration:**

- Base model: `yolo11m.pt` (COCO-pretrained)
- Epochs: 50 (best checkpoint at epoch 47)
- Image size: 640x640
- Optimizer: AdamW (auto-selected)
- Learning rate: 0.001
- Batch size: auto
- Hardware: Google Colab T4 GPU (~2 hours)
- Dataset: LogoDet-3K (158,652 images, 3,000 classes collapsed to single "logo" class)
- Augmentation: mosaic, RandAugment, erasing (0.4), horizontal flip (0.5)

**Design rationale:** A single-class detector maximises recall across all logo types, delegating brand-specific classification to the downstream ResNet stage. This separation allows the detector to generalise to unseen brands without retraining.

### ResNet50: Brand Classification (`resnet/`)

*Coming soon.* A fine-tuned ResNet50 classifier trained on a curated brand dictionary with open-set rejection for unknown or novel logos.

## Quick Start

```python
from ultralytics import YOLO

# Load directly from HuggingFace
model = YOLO("hf://vectorized-dev/brandspotter/yolo/best.pt")

# Run inference
results = model("path/to/image.jpg")
results[0].show()
```

## Repository Contents

```
yolo/
  best.pt        # Trained weights (best checkpoint, ~39 MB)
  args.yaml      # Full training arguments
  results.csv    # Per-epoch training metrics
```

## Roadmap

- [ ] ResNet50 brand classifier weights and evaluation
- [ ] Open-set rejection threshold calibration
- [ ] End-to-end inference script (detect + classify + dwell-time)
- [ ] Sample results on sports broadcast footage
- [ ] Dataset card for curated brand dictionary

## Dataset

[LogoDet-3K](https://github.com/Wangjing1551/LogoDet-3K-Dataset) (Wang et al., ACM TOMM 2022). 158,652 images across 3,000 logo classes. The detection model treats all logos as a single class for region proposal; brand identification is handled downstream.
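
As a rough illustration of the single-class collapse described above, the sketch below rewrites YOLO-format label rows so that every class index becomes `0`. It assumes the annotations have already been converted to YOLO text format (one `class x_center y_center width height` row per box, normalised coordinates). The helper name `collapse_to_single_class` and the sample rows are hypothetical and are not taken from the BrandSpotter source, which may handle this step differently.

```python
def collapse_to_single_class(label_text: str, target_class: int = 0) -> str:
    """Rewrite every YOLO-format label row so its class index is `target_class`.

    Hypothetical helper for illustration; assumes rows of the form
    `class x_center y_center width height`.
    """
    out_lines = []
    for line in label_text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        # Keep the box coordinates untouched, replace only the class index.
        out_lines.append(" ".join([str(target_class)] + parts[1:]))
    return "\n".join(out_lines)


# Made-up sample: two boxes with different brand class indices.
sample = "1742 0.512 0.430 0.210 0.095\n87 0.250 0.700 0.100 0.050\n"
print(collapse_to_single_class(sample))
```

Applied to each label file, a remapping like this leaves the box geometry unchanged and pairs with a dataset config that declares a single `logo` class, so brand identity is left entirely to the downstream classifier.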

## Citation

```bibtex
@article{wang2022logodet3k,
  title={LogoDet-3K: A Large-scale Image Dataset for Logo Detection},
  author={Wang, Jing and Min, Weiqing and Hou, Sujuan and Ma, Shengnan and Zheng, Yuanjie and Jiang, Shuqiang},
  journal={ACM Transactions on Multimedia Computing, Communications, and Applications},
  volume={18},
  number={3},
  year={2022},
  publisher={ACM}
}
```

## License

MIT