
Models trained on BMD-45 deliver up to 2.5x performance improvement over UA-DETRAC baselines

BMD-45 Vehicle Detection Models — AIM@IISc

High-quality object detection models built for Indian road traffic — where vehicle appearance, traffic density, and scene complexity differ significantly from Western datasets like COCO.

These models are trained on the BMD-45 dataset, featuring:

13 road-relevant vehicle categories
Real urban environments across India
Diverse viewpoints, lighting, occlusion & density variations
Multi-user labeled data with consensus filtering (MV / ST variants)

We currently release five state-of-the-art detector variants trained on the dataset:

Model Family	Sizes	Strengths
YOLOv12	S, X	Fast + lightweight deployment
RT-DETRv2	X	High-accuracy, transformer-based real-time detection
RF-DETR	X	Region-focused DETR with strong small-object detection
D-FINE	X	Fine-grained detection with iterative refinement

Designed for Indian mobility — adaptable to real city surveillance, roadside cameras, safety monitoring, and ITS applications.

Dataset: https://huggingface.co/datasets/iisc-aim/BMD-45

Attribution

to be added

Repository Structure

README.md – This file
bmd_classes.txt – 13 object classes (one per line)
configs/ – Model configuration files
- YOLOv12-S/
  - config.yaml – Training hyperparameters
  - data.yaml – Dataset paths and class names
- YOLOv12-X/
  - config.yaml – Training hyperparameters
  - data.yaml – Dataset paths and class names
- RT-DETRv2/
  - bmd-45-dataset.yaml – Dataset configuration
  - rtdetrv2_r101vd_6x_bmd-45.yaml – Model + training configuration
- RF-DETR/
  - config.yaml – Training hyperparameters
- D-FINE/
  - bmd-45-dataset.yaml – Dataset configuration
  - dfine_hgnetv2_x_bmd-45.yaml – Model + training configuration
weights/ – Trained model weights
- YOLOv12-S/ – best.pt
- YOLOv12-X/ – best.pt
- RT-DETRv2/ – best.pth
- RF-DETR/ – checkpoint_best_total.pth
- D-FINE/ – best_stg1.pth
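
As a quick sanity check after cloning, the checkpoint locations listed above can be collected in one place. The following is a minimal sketch, assuming the repository has been downloaded to a local directory; the helper names `weight_path` and `missing_weights` are hypothetical and not part of this repository. It only resolves and checks file paths. Loading a checkpoint is left to the respective framework.

```python
from pathlib import Path
from typing import List

# Relative checkpoint paths, copied from the repository structure listing.
WEIGHTS = {
    "YOLOv12-S": "weights/YOLOv12-S/best.pt",
    "YOLOv12-X": "weights/YOLOv12-X/best.pt",
    "RT-DETRv2": "weights/RT-DETRv2/best.pth",
    "RF-DETR": "weights/RF-DETR/checkpoint_best_total.pth",
    "D-FINE": "weights/D-FINE/best_stg1.pth",
}


def weight_path(model: str, repo_root: str = ".") -> Path:
    """Return the checkpoint path for `model` under `repo_root` (hypothetical helper)."""
    if model not in WEIGHTS:
        raise KeyError(f"unknown model {model!r}; expected one of {sorted(WEIGHTS)}")
    return Path(repo_root) / WEIGHTS[model]


def missing_weights(repo_root: str = ".") -> List[str]:
    """List the models whose checkpoint file is absent under `repo_root`."""
    return [name for name in WEIGHTS if not weight_path(name, repo_root).is_file()]
```

Calling `missing_weights("path/to/clone")` returns an empty list when all five checkpoints are present.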

Classes

The file bmd_classes.txt lists all 13 object categories, one per line:

ID	Class Name	Description
1	Hatchback	Small passenger cars without a protruding rear boot (“dickey”).
2	Sedan	Passenger cars with a low-slung design and a separate protruding rear boot (“dickey”).
3	SUV	Car-like vehicles with high ground clearance, a sturdy body, and no protruding boot.
4	MUV	Large vehicles with three seating rows, combining passenger and cargo functionality.
5	Bus	Large passenger vehicles used for public or private transport, including office shuttles and intercity buses.
6	Truck	Heavy goods carriers with a front cabin and a rear cargo compartment.
7	Three-wheeler	Compact vehicles with one front wheel and two rear wheels, featuring a covered passenger cabin.
8	Two-wheeler	Motorbikes and scooters for single or double riders. Bounding boxes include both vehicle and rider.
9	LCV	Lightweight goods carriers used for short- to medium-distance transport.
10	Mini-bus	Shorter, compact buses with fewer seats; larger than a Tempo Traveller, often featuring a flat front.
11	Tempo-traveller	Medium-sized passenger vans with tall roofs and side windows; larger than vans but smaller than minibuses, with a protruding front.
12	Bicycle	Non-motorized, manually pedalled vehicles including geared, non-geared, women’s, and children’s cycles. Bounding boxes include both vehicle and rider.
13	Van	Medium-sized vehicles for transporting goods or people, typically with a flat front and sliding side doors; smaller than Tempo Travellers.
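
The ID column above can be rebuilt from a one-name-per-line class list. The sketch below is illustrative only. It assumes the class file lists names in the same order as the table, and it embeds the table's names directly as a string, since the exact spelling inside the shipped file is not shown here. The function name `parse_class_names` is hypothetical.

```python
def parse_class_names(text: str) -> dict:
    """Map 1-based class IDs to names, given one class name per line."""
    names = [line.strip() for line in text.splitlines() if line.strip()]
    return {class_id: name for class_id, name in enumerate(names, start=1)}


# Class names in table order (assumed to match the order in the class file).
TABLE_NAMES = """
Hatchback
Sedan
SUV
MUV
Bus
Truck
Three-wheeler
Two-wheeler
LCV
Mini-bus
Tempo-traveller
Bicycle
Van
"""

id_to_name = parse_class_names(TABLE_NAMES)
name_to_id = {name: class_id for class_id, name in id_to_name.items()}
```

Many detection frameworks index classes from 0. If that applies to a given checkpoint, the label index would be the table ID minus one, which is worth verifying against the model's own config before use.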

Training Hyperparameters and Architecture

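All models were trained on the BMD-45 dataset with the same batch size (16), epoch count (100), and optimizer (AdamW) for fair comparison. Learning rates, schedules, warmup, and augmentation differ per model family, as listed below.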

Setting	YOLOv12-S	YOLOv12-X	RT-DETRv2-X	D-FINE-X	RF-DETR-X
Batch Size	16	16	16	16	16
Epochs	100	100	100	100	100
Learning Rate	0.01	0.01	1×10⁻⁴	2.5×10⁻⁴	1×10⁻⁴
Optimizer	AdamW	AdamW	AdamW	AdamW	AdamW
Weight Decay	5×10⁻⁴	5×10⁻⁴	1×10⁻⁴	1.25×10⁻⁴	1×10⁻⁴
AdamW Betas	(0.937, 0.999)	(0.937, 0.999)	(0.9, 0.999)	(0.9, 0.999)	(0.9, 0.999)
LR Policy	Cosine	Cosine	MultiStep	MultiStep	Step LR
Warmup	3 epochs	3 epochs	2000-iteration linear warmup	500-step linear warmup	None
Warmup Details	momentum=0.8; bias LR=0.1	momentum=0.8; bias LR=0.1	momentum untouched; uniform LR ramp	no bias/momentum overrides	warmup disabled
Augmentation Summary	HSV, translate=0.1, scale=0.5, flip=0.5, erase=0.4; no mosaic/mixup	HSV, translate=0.1, scale=0.5, flip=0.5, erase=0.4; no mosaic/mixup	Photometric, ZoomOut, IoU crop; ops disabled after epoch 151	Photometric, ZoomOut, IoU crop, flip, sanitize, resize	Flip + multi-scale RandomResize/Crop + normalize
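
To make the "Cosine" policy with a 3-epoch warmup concrete, here is a generic per-epoch sketch using the YOLOv12 values from the table (base learning rate 0.01, 100 epochs). It is not the training code used for these models. The linear warmup shape, the final learning rate of 0, and the per-epoch granularity are assumptions, and the training framework may apply warmup per iteration and use a different cosine formula.

```python
import math


def lr_at_epoch(epoch, base_lr=0.01, total_epochs=100, warmup_epochs=3, final_lr=0.0):
    """Learning rate for a 0-based `epoch`: linear warmup, then cosine decay.

    `final_lr` and the linear warmup shape are assumptions for illustration.
    """
    if epoch < warmup_epochs:
        # Ramp up linearly so the last warmup epoch reaches base_lr.
        return base_lr * (epoch + 1) / warmup_epochs
    # Fraction of the post-warmup schedule that has elapsed, in [0, 1].
    progress = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
    return final_lr + 0.5 * (base_lr - final_lr) * (1 + math.cos(math.pi * progress))


# Print a few sample points across the schedule.
for sample_epoch in (0, 2, 3, 50, 99):
    print(sample_epoch, lr_at_epoch(sample_epoch))
```

The learning rate rises over the first three epochs, peaks at the base value, and then decays smoothly toward `final_lr` by the end of training.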

License

This repository (models, weights, configs) is released under the Apache License 2.0.
Note: The underlying YOLO-family models (e.g., YOLOv12) from Ultralytics are distributed under the GNU AGPL v3.0 (or newer) license.


Paper for iisc-aim/BMD-45

The Urban Vision Hackathon Dataset and Models: Towards Image Annotations and Accurate Vision Models for Indian Traffic

arXiv:2511.02563 • Published Nov 4, 2025