Vehicle Detection using Ensemble YOLO + Weighted Boxes Fusion
M.Tech Thesis Project: optimized vehicle detection using a Self-Adaptive 3-Tier WBF ensemble of YOLO11m + YOLO26m.
Dataset
Source: Vehicle Dataset for YOLO (Kaggle)
| Property | Value |
|---|---|
| Total Images | 3,000 |
| Classes | 6 (balanced ~500 each) |
| Format | YOLO .txt (ready to use) |
| Original Split | 2,100 train / 900 valid |
| Pipeline Split | 60% train / 20% val / 20% test |
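The 60/20/20 re-split in the last row can be sketched as below. This is a minimal illustration, not the notebook's actual code: it assumes the original train and valid images are pooled first, the function name `split_60_20_20`, the seed, and the `img_0000`-style stems are all hypothetical, and the shuffle is not stratified by class.

```python
import random


def split_60_20_20(items, seed=42):
    """Deterministically shuffle items and cut them into train/val/test."""
    items = sorted(items)            # fixed starting order, independent of input order
    random.Random(seed).shuffle(items)
    n_train = len(items) * 60 // 100  # integer arithmetic avoids float rounding
    n_val = len(items) * 20 // 100
    return {
        "train": items[:n_train],
        "val": items[n_train:n_train + n_val],
        "test": items[n_train + n_val:],   # remainder goes to test
    }


# Hypothetical file stems standing in for the 3,000 pooled images.
stems = [f"img_{i:04d}" for i in range(3000)]
splits = split_60_20_20(stems)
print({name: len(part) for name, part in splits.items()})
```

Splitting on file stems rather than full paths keeps each image paired with its YOLO `.txt` label when the files are copied into the new folders.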
Classes
| ID | Class | Description |
|---|---|---|
| 0 | Car | Standard passenger car |
| 1 | Threewheel | Three-wheeler / tuk-tuk / auto-rickshaw |
| 2 | Bus | |
| 3 | Truck | |
| 4 | Motorbike | |
| 5 | Van | |
This dataset is balanced (~500 images per class), so no minority-class augmentation is needed (unlike DAWN).
Repository Contents
| File | Description |
|---|---|
| `kaggle_notebook.py` | Complete Kaggle-ready pipeline; paste into a notebook and run (~4-6 hrs) |
Architecture
Models
- YOLO11m (20M params): Ultralytics YOLO11 medium, COCO pretrained
- YOLO26m (20.4M params): Ultralytics YOLO26 medium, COCO pretrained
Self-Adaptive 3-Tier WBF Ensemble
Input Image
  ├─── YOLO11m ──→ Detections₁
  ├─── YOLO11m_ft ──→ Detections₂
  └─── YOLO26m_ft ──→ Detections₃
        │
  ┌─────┴────┐
  │  Tier 1  │ Per-class F1-based weights (static, from val set)
  │  Tier 2  │ Per-image confidence modulation (dynamic, α=0.1)
  │  Tier 3  │ Log-dampened count normalization
  └─────┬────┘
        │
  Weighted Boxes Fusion (iou=0.55, conf_type=box_and_model_avg)
        │
  Fused Detections
Tier 1: $w^{base}_{i,j} = \frac{F1_{i,j}}{\sum_{k} F1_{k,j}}$
Tier 2: $\phi_i(I) = 1 + \alpha \cdot (\bar{s}_i - 0.5)$
Tier 3: $\hat{s}_d = \frac{s_d \cdot w^{base}_{i,j} \cdot \phi_i(I)}{\log_2(\max(n_{i,j}, 2))}$
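The three tiers can be traced numerically with a short sketch. The symbols are not defined in this card, so the reading here is an assumption: $i$ indexes models, $j$ indexes classes, $\bar{s}_i$ is the mean confidence of model $i$ on image $I$, and $n_{i,j}$ is the number of class-$j$ detections model $i$ produced on that image. The function names and the F1 values are hypothetical, as are the fallbacks for empty inputs.

```python
import math

ALPHA = 0.1  # modulation strength quoted for Tier 2


def tier1_base_weights(f1):
    """f1[i][j] is the validation F1 of model i on class j; normalize per class."""
    n_models, n_classes = len(f1), len(f1[0])
    weights = [[0.0] * n_classes for _ in range(n_models)]
    for j in range(n_classes):
        total = sum(f1[k][j] for k in range(n_models))
        for i in range(n_models):
            # Assumption: fall back to uniform weights if every model has F1 = 0.
            weights[i][j] = f1[i][j] / total if total > 0 else 1.0 / n_models
    return weights


def tier2_modulation(scores, alpha=ALPHA):
    """phi_i(I) = 1 + alpha * (mean confidence - 0.5)."""
    if not scores:
        return 1.0  # assumption: no detections gives a neutral factor
    return 1.0 + alpha * (sum(scores) / len(scores) - 0.5)


def tier3_adjust(scores, labels, base_weights_i, alpha=ALPHA):
    """Rescale one model's detection scores on one image."""
    phi = tier2_modulation(scores, alpha)
    counts = {}
    for j in labels:
        counts[j] = counts.get(j, 0) + 1
    return [
        s * base_weights_i[j] * phi / math.log2(max(counts[j], 2))
        for s, j in zip(scores, labels)
    ]


# Illustrative F1 table: 2 models x 2 classes (made-up numbers).
weights = tier1_base_weights([[0.8, 0.6], [0.8, 0.9]])
adjusted = tier3_adjust([0.9, 0.7, 0.5, 0.7], [0, 0, 0, 0], weights[0])
print(weights, adjusted)
```

Because of the $\max(n_{i,j}, 2)$ floor, the denominator is 1 for one or two detections of a class, so only models that emit many boxes of the same class on an image are dampened.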
Pipeline Phases
| Phase | Description | Est. Time |
|---|---|---|
| 1 | Download from Kaggle + re-split (60/20/20) | ~5 min |
| 2 | HP Search (4 trials × 15 epochs × 2 models) | ~45 min |
| 3 | Train YOLO11m (100 epochs) + fine-tune (30 epochs) | ~90 min |
| 4 | Train YOLO26m (100 epochs) + fine-tune (30 epochs) | ~90 min |
| 5 | WBF Ensemble calibration | ~15 min |
| 6 | Full evaluation (individual + ensemble) | ~15 min |
| 7 | Save all results | ~1 min |
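Phase 2 could look roughly like the following bounded random search. This is a sketch under stated assumptions: the search space, its ranges, and the `train_fn` callback are hypothetical stand-ins for a 15-epoch training run, and the card does not say which search strategy the notebook uses. The parameter names `lr0`, `momentum`, and `weight_decay` follow Ultralytics training arguments.

```python
import random

# Hypothetical bounded search space; the ranges are illustrative assumptions.
SEARCH_SPACE = {
    "lr0": (1e-4, 1e-2),
    "momentum": (0.85, 0.98),
    "weight_decay": (1e-5, 1e-3),
}


def sample_params(rng):
    """Draw one configuration, each value inside its declared bounds."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in SEARCH_SPACE.items()}


def random_search(train_fn, n_trials=4, seed=0):
    """Run n_trials short trainings and keep the best-scoring configuration.

    train_fn(params) is a hypothetical callback that returns a validation
    score (higher is better) or None when the trial failed.
    """
    rng = random.Random(seed)
    best = None
    for trial in range(n_trials):
        params = sample_params(rng)
        score = train_fn(params)
        if score is None:
            continue  # skip failed trials instead of comparing against None
        if best is None or score > best["score"]:
            best = {"trial": trial, "params": params, "score": score}
    return best  # None if every trial failed
```

Running this once per model with `n_trials=4` would match the "4 trials × 2 models" budget in the table.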
Quick Start (Kaggle)
# 1. Create Kaggle Notebook → GPU T4 x2
# 2. Paste kaggle_notebook.py contents into a cell
# 3. Run → results in /kaggle/working/results/
Output structure:
/kaggle/working/results/
├── all_results.json # Complete metrics
├── ensemble_config.json # WBF weights (loadable)
├── hp_search/
│   ├── yolo11m.json # HP search results
│   └── yolo26m.json
└── weights/
    ├── yolo11m_ft_best.pt # Fine-tuned YOLO11m
    ├── yolo11m_best.pt # Base YOLO11m
    └── yolo26m_ft_best.pt # Fine-tuned YOLO26m
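After a run, a quick check that every file in the listing above was written can save a wasted download. The helper below is a hypothetical convenience, not part of the notebook; the relative paths are copied from the listing.

```python
from pathlib import Path

# Relative paths taken from the output structure listed above.
EXPECTED_FILES = [
    "all_results.json",
    "ensemble_config.json",
    "hp_search/yolo11m.json",
    "hp_search/yolo26m.json",
    "weights/yolo11m_ft_best.pt",
    "weights/yolo11m_best.pt",
    "weights/yolo26m_ft_best.pt",
]


def missing_outputs(results_dir):
    """Return the expected files that are absent under results_dir."""
    root = Path(results_dir)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]
```

Calling `missing_outputs("/kaggle/working/results")` returns an empty list when the pipeline finished all seven phases.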
Bug Fixes (from DAWN pipeline)
All fixes from the previous DAWN pipeline are pre-applied:
| Bug | Fix |
|---|---|
| `NoneType` HP search crash | Null-check + bounded HP + cache clearing |
| `_thread.lock` pickle error | JSON save/load (no YOLO serialization) |
| Ensemble metric drop | log2 normalization, no max-renorm, α=0.1 |
| No test split | Auto-creates 60/20/20 from combined train+valid |
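The JSON save/load fix can be illustrated with a small round trip. The idea is to persist only plain numbers and strings, never a model object, so nothing unpicklable such as a thread lock is involved. The field names and the uniform placeholder weights below are assumptions for illustration; the real schema of `ensemble_config.json` is not shown in this card.

```python
import json
import os
import tempfile


def save_config(config, path):
    """Write the ensemble settings as plain JSON (no model objects)."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(config, fh, indent=2)


def load_config(path):
    """Read the ensemble settings back into ordinary dicts and lists."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


# Hypothetical schema; the weights are uniform placeholders, not real results.
config = {
    "models": ["yolo11m", "yolo11m_ft", "yolo26m_ft"],
    "iou_thr": 0.55,
    "conf_type": "box_and_model_avg",
    "alpha": 0.1,
    "base_weights": {name: [0.25] * 6 for name in ("yolo11m", "yolo11m_ft", "yolo26m_ft")},
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "ensemble_config.json")
    save_config(config, path)
    restored = load_config(path)

print(restored == config)
```

Model weights stay in their own `.pt` files and are reloaded by path, so the JSON file remains small and human-readable.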
Citation
@misc{vehicle-yolo-wbf-2026,
title={Optimized Vehicle Detection using Ensemble YOLO and Weighted Boxes Fusion},
author={AmeenAktharT},
year={2026},
note={YOLO11m + YOLO26m with Self-Adaptive 3-Tier WBF Ensemble}
}
@article{solovyev2021wbf,
title={Weighted boxes fusion: Ensembling boxes from different object detection models},
author={Solovyev, Roman and Wang, Weimin and Gabruseva, Tatiana},
journal={Image and Vision Computing},
year={2021}
}