🧾 Model Card — PotholeNet-YOLO11m-v1

🧠 Model Overview

PotholeNet-YOLO11m-v1 is an object detection model fine-tuned from the Ultralytics YOLO11m architecture to detect potholes, road damage, and garbage in street-level imagery. YOLO11m includes the C2PSA (Cross-Stage Partial with Spatial Attention) block, which is intended to help the model attend to irregularly shaped urban defects such as potholes.

Trained on a curated civic infrastructure dataset of 23,179 street-level images from Indian urban environments, the model is designed to power real-time civic issue detection systems that support automated reporting and faster municipal response.

It serves as the Detection Layer (Layer 1) of the Aamchi City AI Civic System — an end-to-end intelligent dashboard for urban infrastructure monitoring.

🏗️ Training Details

Parameter	Value
Base Model	`yolo11m.pt` (COCO pretrained)
Architecture	YOLO11m (C3k2 + C2PSA Spatial Attention)
Framework	Ultralytics v8.x
Training Hardware	Kaggle — NVIDIA T4 ×2 (Dual GPU)
Epochs	50
Input Resolution	768×768
Batch Size	Auto (`batch=-1`)
Optimizer	AdamW
Learning Rate	`lr0=0.001`, cosine decay to a final rate of `lr0 × lrf` (`lrf=0.01`)
Warmup	3 epochs
Weight Decay	0.0005
AMP	Enabled (FP16 mixed precision)
Early Stopping	`patience=10` (did not trigger — model was still improving)
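
Assuming `lrf` is the final-rate fraction as in Ultralytics, the schedule in the table ends near 1e-5 rather than 0.01. The sketch below reproduces the shape of such a cosine decay in plain Python. It is an illustration only: it assumes one scheduler step per epoch and ignores the 3 warmup epochs, so it is not the exact Ultralytics scheduler.

```python
import math

def cosine_lr(epoch: int, epochs: int = 50, lr0: float = 0.001, lrf: float = 0.01) -> float:
    """Learning rate at a given epoch for a cosine decay from lr0 towards lr0 * lrf."""
    # The factor moves from 1.0 at epoch 0 to lrf at epoch == epochs.
    factor = (1 - math.cos(epoch * math.pi / epochs)) / 2 * (lrf - 1.0) + 1.0
    return lr0 * factor

for e in (0, 25, 50):
    print(f"epoch {e:2d}: lr = {cosine_lr(e):.6f}")
```

Because the last training epoch has index `epochs - 1`, the rate only approaches `lr0 * lrf` in practice.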

Loss Weights

Loss	Weight
Box Loss	7.5
Classification Loss	1.0
DFL Loss	1.5
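
These weights scale the three loss components before they are combined. A minimal sketch of that weighted sum follows; the component values are made up for illustration, and any further scaling the framework applies is omitted.

```python
LOSS_WEIGHTS = {"box": 7.5, "cls": 1.0, "dfl": 1.5}  # values from the table above

def weighted_loss(components: dict[str, float]) -> float:
    """Weighted sum of the unweighted box, classification, and DFL loss terms."""
    return sum(LOSS_WEIGHTS[name] * value for name, value in components.items())

# Hypothetical unweighted loss values, for illustration only
example = {"box": 0.2, "cls": 0.5, "dfl": 0.4}
print(f"total loss = {weighted_loss(example):.2f}")
```

With a box weight of 7.5, localization errors dominate the total unless the classification term is several times larger.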

Augmentation Pipeline

Augmentation	Value
Mosaic	1.0
MixUp	0.15
Copy-Paste	0.1
HSV (H/S/V)	0.015 / 0.7 / 0.4
Rotation	±10°
Scale	0.5
Shear	2.0
Horizontal Flip	0.5
Erasing	0.3
Label Smoothing	0.05
Close Mosaic	Last 8 epochs
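
Taken together, the training, loss, and augmentation tables map onto Ultralytics training arguments roughly as shown below. This is a reconstruction from the reported values, not the author's actual training script. The dataset file name `civic.yaml` is a placeholder, `cos_lr=True` is assumed from the "cosine decay" note, and some arguments (for example `label_smoothing`) may be deprecated or ignored in newer Ultralytics releases.

```python
# Reconstructed from the tables above; not the author's actual script.
train_args = {
    "data": "civic.yaml",        # placeholder dataset config (hypothetical name)
    "epochs": 50,
    "imgsz": 768,
    "batch": -1,                 # auto batch size
    "optimizer": "AdamW",
    "lr0": 0.001,
    "lrf": 0.01,
    "cos_lr": True,              # assumed from "cosine decay"
    "warmup_epochs": 3,
    "weight_decay": 0.0005,
    "amp": True,
    "patience": 10,
    # Loss weights
    "box": 7.5, "cls": 1.0, "dfl": 1.5,
    # Augmentation
    "mosaic": 1.0, "mixup": 0.15, "copy_paste": 0.1,
    "hsv_h": 0.015, "hsv_s": 0.7, "hsv_v": 0.4,
    "degrees": 10, "scale": 0.5, "shear": 2.0, "fliplr": 0.5,
    "erasing": 0.3, "label_smoothing": 0.05,
    "close_mosaic": 8,           # disable mosaic for the last 8 epochs
}

# With the ultralytics package installed, these could be passed as:
#   from ultralytics import YOLO
#   YOLO("yolo11m.pt").train(**train_args)
print(f"{len(train_args)} training arguments")
```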

📊 Dataset Description

The model was trained on a curated subset of 23,179 street-level images collected from Indian urban environments. The dataset underwent extensive preprocessing:

Perceptual Hash (pHash) Deduplication — Removed near-duplicate images using hamming distance ≤ 4
Corrupt Image Removal — Verified all images via PIL
Intelligent Negative Sampling — Trimmed empty-label (background) images to 2,000 hard negatives
Stratified Split — 80% Train / 15% Val / 5% Test, stratified by dominant class
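
The deduplication step can be sketched as a greedy filter over 64-bit perceptual hashes: an image is kept only if its hash differs from every already-kept hash by more than 4 bits. The hash values below are made up, and computing the pHash itself (for example with an image-hashing library) is left out. The author's actual implementation may differ.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")

def deduplicate(hashes: dict[str, int], max_distance: int = 4) -> list[str]:
    """Keep the first image of each near-duplicate group (greedy, O(n^2))."""
    kept: list[str] = []
    for name, h in hashes.items():
        if all(hamming(h, hashes[k]) > max_distance for k in kept):
            kept.append(name)
    return kept

# Hypothetical 64-bit pHash values for three images
hashes = {
    "a.jpg": 0xF0F0F0F0F0F0F0F0,
    "b.jpg": 0xF0F0F0F0F0F0F0F3,  # 2 bits away from a.jpg, so a near-duplicate
    "c.jpg": 0x0F0F0F0F0F0F0F0F,  # 64 bits away from a.jpg, so distinct
}
print(deduplicate(hashes))
```

The pairwise comparison is quadratic in the number of images; for tens of thousands of images an index structure or bucketing by hash prefix would likely be needed.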

Label Classes

Class ID	Class Name	Description
🔴 0	Pothole	Road surface cavities and depressions
🟡 1	Road Damage	Cracks, surface wear, and structural deterioration
🟢 2	Garbage	Street-level waste and debris accumulation

Priority: Pothole (primary) > Garbage > Road Damage
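
One way to realize a split "stratified by dominant class" is to assign each image a single class and split each class bucket 80/15/5. The sketch below is an illustration built on assumptions the source does not confirm: it takes the dominant class to be the most frequent class in an image's labels, breaks ties with the priority order above, and treats background images as their own bucket.

```python
import random
from collections import Counter, defaultdict

PRIORITY = [0, 2, 1]  # Pothole > Garbage > Road Damage (class IDs from the table)

def dominant_class(class_ids: list[int]) -> int:
    """Most frequent class in an image; ties go to the higher-priority class; -1 means background."""
    if not class_ids:
        return -1
    counts = Counter(class_ids)
    return max(counts, key=lambda c: (counts[c], -PRIORITY.index(c)))

def stratified_split(labels: dict[str, list[int]], seed: int = 0) -> dict[str, list[str]]:
    """Split image names 80/15/5 within each dominant-class bucket."""
    buckets = defaultdict(list)
    for name, class_ids in labels.items():
        buckets[dominant_class(class_ids)].append(name)
    rng = random.Random(seed)
    splits = {"train": [], "val": [], "test": []}
    for names in buckets.values():
        rng.shuffle(names)
        n_train = len(names) * 80 // 100
        n_val = len(names) * 15 // 100
        splits["train"] += names[:n_train]
        splits["val"] += names[n_train:n_train + n_val]
        splits["test"] += names[n_train + n_val:]
    return splits

# Synthetic labels: 100 single-class images per class
labels = {f"img_{c}_{i}.jpg": [c] for c in (0, 1, 2) for i in range(100)}
splits = stratified_split(labels)
print({name: len(items) for name, items in splits.items()})
```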

🎯 Evaluation Metrics

Metric	Score
mAP50	0.86
mAP50-95	—
Parameters	~20M
Model Size	~39 MB
Inference Speed	Real-time on GPU
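
For context, mAP50 counts a prediction as correct when its box overlaps a same-class ground-truth box with an intersection-over-union (IoU) of at least 0.5; average precision is then computed per class and averaged. A small sketch of the IoU check alone, using made-up boxes in `[x1, y1, x2, y2]` pixel format:

```python
def iou(a: list[float], b: list[float]) -> float:
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

ground_truth = [100, 100, 200, 200]  # hypothetical pothole annotation
prediction = [110, 110, 210, 210]    # hypothetical model output
score = iou(ground_truth, prediction)
print(f"IoU = {score:.3f}, counts as a match at mAP50: {score >= 0.5}")
```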

⚡ Early stopping did not trigger within the 50-epoch budget, which suggests that further training could yield additional performance gains.

💬 Example Usage

Python (Ultralytics)

from ultralytics import YOLO

# Load model
model = YOLO("best.pt")

# Run inference
results = model("street_image.jpg", imgsz=768, conf=0.25)

# Display results
results[0].show()

# Access detections
class_names = {0: "pothole", 1: "road_damage", 2: "garbage"}  # matches the Label Classes table
for box in results[0].boxes:
    cls = int(box.cls)            # class ID
    conf = float(box.conf)        # confidence score
    xyxy = box.xyxy[0].tolist()   # [x1, y1, x2, y2] in pixels
    print(f"{class_names[cls]}: {conf:.2f} at {xyxy}")

With Test-Time Augmentation (TTA)

# TTA may improve mAP by an estimated 1-3%, at the cost of slower inference
results = model("street_image.jpg", imgsz=768, conf=0.25, augment=True)

Filter Pothole-Only Detections

# Reuses `model` from the example above; imgsz matches the training resolution
results = model("street_image.jpg", imgsz=768, conf=0.25)
boxes = results[0].boxes
pothole_mask = boxes.cls == 0
pothole_boxes = boxes[pothole_mask]
print(f"Found {len(pothole_boxes)} potholes")

🧩 Intended Use

Real-time pothole detection from dashcam, mobile phone, or street-view imagery
Automated civic issue reporting — GPS-tagged detection for municipal dashboards
Infrastructure health monitoring — Severity scoring and trend analysis for road maintenance
Smart city integration — Layer 1 detection input for AI-driven civic action systems
Mobile deployment — Exportable to ONNX for edge inference on mobile devices
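
For the mobile deployment path, the Ultralytics `export` method can write an ONNX file from the trained weights. The sketch below wraps it in a helper; the helper name and the `best.pt` default are illustrative. It assumes the `ultralytics` package and the checkpoint are available, and it imports the package lazily so that defining the helper has no dependencies.

```python
def export_to_onnx(weights: str = "best.pt", imgsz: int = 768) -> str:
    """Export a trained checkpoint to ONNX and return the output path."""
    from ultralytics import YOLO  # lazy import; requires `pip install ultralytics`

    model = YOLO(weights)
    # format="onnx" selects the ONNX exporter; imgsz matches the training resolution
    return str(model.export(format="onnx", imgsz=imgsz))

# Usage (requires the checkpoint on disk):
#   onnx_path = export_to_onnx("best.pt")
```

Accuracy and latency of the exported model on a given device are not reported in this card and would need to be measured separately.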

⚠️ Limitations

The model is optimized for Indian urban road conditions; performance may degrade on highways, rural roads, or non-Indian geographies.
The road damage class visually overlaps with potholes, which may cause occasional confusion between the two classes.
Performance is best on daytime, clear-weather imagery — low-light and rain-occluded scenes may reduce accuracy.
The model was trained for 50 epochs without triggering early stopping, which suggests the checkpoint is not fully converged and that further fine-tuning could improve results.
Small potholes (< 32px at 768px resolution) may be missed in wide-angle shots.
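
Whether a pothole falls below the 32px threshold depends on how much the source image is downscaled to 768px. The rough check below assumes the longer image side is resized to 768 (as letterbox-style resizing does) and measures the shorter side of the box; the box and frame size are made up.

```python
def size_at_inference(box_xyxy: list[float], img_w: int, img_h: int, imgsz: int = 768) -> float:
    """Shorter side of a box, in pixels, after the image's longer side is scaled to imgsz."""
    scale = imgsz / max(img_w, img_h)
    width = (box_xyxy[2] - box_xyxy[0]) * scale
    height = (box_xyxy[3] - box_xyxy[1]) * scale
    return min(width, height)

# Hypothetical 1920x1080 dashcam frame containing a 60x50 px pothole
side = size_at_inference([900, 700, 960, 750], img_w=1920, img_h=1080)
print(f"{side:.1f}px at 768 -> {'may be missed' if side < 32 else 'ok'}")
```

In cases like this, cropping or tiling the frame before inference keeps small objects larger relative to the 768px input.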

🧑‍💻 Developer


Author	Vansh Momaya
Institution	D. J. Sanghvi College of Engineering
Focus Area	Computer Vision, Object Detection, AI for Civic Infrastructure
Email	vanshmomaya9@gmail.com

🌍 Citation

If you use PotholeNet-YOLO11m-v1 in your research or project:

@online{momaya2026potholenet,
  author       = {Vansh Momaya},
  title        = {PotholeNet-YOLO11m-v1: Real-Time Pothole and Civic Issue Detection for Indian Urban Roads},
  year         = {2026},
  version      = {v1},
  url          = {https://huggingface.co/Vansh180/PotholeNet-YOLO11m-v1},
  institution  = {D. J. Sanghvi College of Engineering},
  note         = {Fine-tuned YOLO11m model for detecting potholes, road damage, and garbage in Indian street imagery},
  license      = {MIT}
}

🚀 Acknowledgements

Ultralytics YOLO11 — Base architecture and training framework
Kaggle — Training infrastructure (Dual T4 GPU)
Aamchi City — Datahack 4 — Hackathon context and dataset

Built for the Aamchi City AI Civic System — Datahack 4, PS2 Core ML
