🧾 Model Card: CivicAi-YOLO11m-v1

🧠 Model Overview

PotholeNet-YOLO11m-v1 is a fine-tuned object detection model built on the Ultralytics YOLO11m architecture and trained to detect potholes, road damage, and garbage in street-level imagery. YOLO11m includes the C2PSA (Cross-Stage Partial with Spatial Attention) block, which helps the model attend to irregularly shaped urban defects such as potholes.

Trained on a large-scale, curated civic infrastructure dataset of 23,000+ street-level images from Indian urban environments, this model is designed to power real-time civic issue detection systems, enabling automated reporting and faster municipal response.

It serves as the Detection Layer (Layer 1) of the Aamchi City AI Civic System, an end-to-end intelligent dashboard for urban infrastructure monitoring.


🏗️ Training Details

| Parameter | Value |
| --- | --- |
| Base Model | yolo11m.pt (COCO pretrained) |
| Architecture | YOLO11m (C3k2 + C2PSA Spatial Attention) |
| Framework | Ultralytics v8.x |
| Training Hardware | Kaggle, NVIDIA T4 ×2 (dual GPU) |
| Epochs | 50 |
| Input Resolution | 768×768 |
| Batch Size | Auto (batch=-1) |
| Optimizer | AdamW |
| Learning Rate | lr0=0.001, cosine decay to a final rate of lr0 × lrf with lrf=0.01 |
| Warmup | 3 epochs |
| Weight Decay | 0.0005 |
| AMP | Enabled (FP16 mixed precision) |
| Early Stopping | patience=10 (did not trigger; model was still improving) |
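
To make the learning-rate row concrete, here is a minimal sketch of a cosine schedule that decays from lr0 to lr0 × lrf over the 50 epochs. It assumes the usual Ultralytics convention that `lrf` is a fraction of `lr0` rather than an absolute rate, and it ignores the 3 warmup epochs; the function name `cosine_lr` is illustrative, not part of any library.

```python
import math


def cosine_lr(epoch: int, epochs: int = 50, lr0: float = 0.001, lrf: float = 0.01) -> float:
    """Learning rate at a given epoch under cosine decay from lr0 to lr0 * lrf."""
    # The multiplier starts at 1.0 (epoch 0) and ends at lrf (final epoch).
    factor = (1 - math.cos(math.pi * epoch / epochs)) / 2 * (lrf - 1) + 1
    return lr0 * factor


for epoch in (0, 25, 50):
    print(f"epoch {epoch:2d}: lr = {cosine_lr(epoch):.6f}")
```

Under these assumptions the rate starts at 0.001 and ends at 0.00001, so the final learning rate is one hundredth of the initial one.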

Loss Weights

| Loss | Weight |
| --- | --- |
| Box Loss | 7.5 |
| Classification Loss | 1.0 |
| DFL Loss | 1.5 |
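
As a rough illustration of what these weights do, the sketch below combines three per-component loss values into one scalar as a weighted sum. The component values are hypothetical, and the helper `weighted_loss` is only an illustration; the actual training loss is computed inside Ultralytics and may apply additional scaling.

```python
# Loss gains taken from the Loss Weights table.
LOSS_GAINS = {"box": 7.5, "cls": 1.0, "dfl": 1.5}


def weighted_loss(components: dict, gains: dict = LOSS_GAINS) -> float:
    """Combine per-component losses into one scalar using the configured gains."""
    return sum(gains[name] * value for name, value in components.items())


# Hypothetical component values, for illustration only.
example = {"box": 1.0, "cls": 1.0, "dfl": 1.0}
print(weighted_loss(example))  # 7.5 + 1.0 + 1.5 = 10.0
```

With equal raw components, the box term contributes 75% of the total, so localization errors dominate the gradient signal.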

Augmentation Pipeline

| Augmentation | Value |
| --- | --- |
| Mosaic | 1.0 |
| MixUp | 0.15 |
| Copy-Paste | 0.1 |
| HSV (H/S/V) | 0.015 / 0.7 / 0.4 |
| Rotation | ±10° |
| Scale | 0.5 |
| Shear | 2.0 |
| Horizontal Flip | 0.5 |
| Erasing | 0.3 |
| Label Smoothing | 0.05 |
| Close Mosaic | Last 8 epochs |
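
The training, loss, and augmentation settings listed so far map onto Ultralytics training arguments roughly as follows. This is a sketch of how the run might be reproduced, not the author's actual script: the dataset file `data.yaml` and the `device` list are assumptions, and `label_smoothing` may not be accepted by every Ultralytics release.

```python
# Keyword arguments mirroring the tables above, using Ultralytics argument names.
TRAIN_KWARGS = {
    # Training details
    "epochs": 50,
    "imgsz": 768,
    "batch": -1,            # auto batch size
    "optimizer": "AdamW",
    "lr0": 0.001,
    "lrf": 0.01,            # final LR as a fraction of lr0
    "cos_lr": True,         # cosine decay
    "warmup_epochs": 3,
    "weight_decay": 0.0005,
    "amp": True,
    "patience": 10,
    "device": [0, 1],       # assumption: the two Kaggle T4 GPUs
    # Loss weights
    "box": 7.5,
    "cls": 1.0,
    "dfl": 1.5,
    # Augmentation
    "mosaic": 1.0,
    "mixup": 0.15,
    "copy_paste": 0.1,
    "hsv_h": 0.015,
    "hsv_s": 0.7,
    "hsv_v": 0.4,
    "degrees": 10,
    "scale": 0.5,
    "shear": 2.0,
    "fliplr": 0.5,
    "erasing": 0.3,
    "label_smoothing": 0.05,  # assumption: supported by the installed version
    "close_mosaic": 8,
}

# Usage (requires Ultralytics and a dataset definition; "data.yaml" is hypothetical):
#   from ultralytics import YOLO
#   YOLO("yolo11m.pt").train(data="data.yaml", **TRAIN_KWARGS)
```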

📊 Dataset Description

The model was trained on a curated subset of 23,179 street-level images collected from Indian urban environments. The dataset underwent extensive preprocessing:

  • Perceptual Hash (pHash) Deduplication: removed near-duplicate images using a Hamming distance threshold of ≤ 4
  • Corrupt Image Removal: verified all images via PIL
  • Intelligent Negative Sampling: trimmed empty-label (background) images to 2,000 hard negatives
  • Stratified Split: 80% Train / 15% Val / 5% Test, stratified by dominant class
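
The deduplication and split steps above can be sketched in plain Python. This is an illustrative reconstruction, not the author's pipeline: it assumes each image already has an integer perceptual hash (for example, produced by a library such as `imagehash`) and a known dominant class, and the helper names `dedupe` and `stratified_split` are hypothetical.

```python
import random
from collections import defaultdict


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


def dedupe(hashes: dict, max_distance: int = 4) -> list:
    """Keep an image only if its hash differs by more than max_distance bits from every kept image."""
    kept = []
    for name, h in hashes.items():
        if all(hamming(h, hashes[k]) > max_distance for k in kept):
            kept.append(name)
    return kept


def stratified_split(dominant_class: dict, seed: int = 0):
    """Split image names 80/15/5 within each dominant-class group."""
    groups = defaultdict(list)
    for name, cls in dominant_class.items():
        groups[cls].append(name)
    rng = random.Random(seed)
    train, val, test = [], [], []
    for cls in sorted(groups):
        names = sorted(groups[cls])
        rng.shuffle(names)
        n_train = len(names) * 80 // 100
        n_val = len(names) * 15 // 100
        train += names[:n_train]
        val += names[n_train:n_train + n_val]
        test += names[n_train + n_val:]  # remainder, about 5%
    return train, val, test


# Toy hashes: b.jpg is 1 bit away from a.jpg, c.jpg is 8 bits away.
print(dedupe({"a.jpg": 0b0000, "b.jpg": 0b0001, "c.jpg": 0xFF}))  # ['a.jpg', 'c.jpg']
```

The pairwise comparison is quadratic in the number of kept images, which is workable at this dataset's scale but would call for an index structure on much larger collections.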

Label Classes

| Class ID | Class Name | Description |
| --- | --- | --- |
| 🔴 0 | Pothole | Road surface cavities and depressions |
| 🟡 1 | Road Damage | Cracks, surface wear, and structural deterioration |
| 🟢 2 | Garbage | Street-level waste and debris accumulation |

Priority: Pothole (primary) > Garbage > Road Damage
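
The card does not say how this priority is applied. One possible reading is that detections are ordered for reporting, which the sketch below illustrates with hypothetical `(class_id, confidence)` pairs; the names `PRIORITY_RANK` and `sort_by_priority` are assumptions made for the example.

```python
CLASS_NAMES = {0: "pothole", 1: "road_damage", 2: "garbage"}
# Lower rank means higher priority: Pothole > Garbage > Road Damage.
PRIORITY_RANK = {0: 0, 2: 1, 1: 2}


def sort_by_priority(detections):
    """Order (class_id, confidence) pairs by class priority, then by descending confidence."""
    return sorted(detections, key=lambda d: (PRIORITY_RANK[d[0]], -d[1]))


# Hypothetical detections, for illustration only.
dets = [(1, 0.90), (2, 0.60), (0, 0.40), (0, 0.75)]
for cls_id, conf in sort_by_priority(dets):
    print(CLASS_NAMES[cls_id], conf)
```

Here both potholes come first despite the road-damage detection having the highest confidence.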


🎯 Evaluation Metrics

| Metric | Score |
| --- | --- |
| mAP50 | 0.86 |
| mAP50-95 | not reported |
| Parameters | ~20M |
| Model Size | ~39 MB |
| Inference Speed | Real-time on GPU |
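
For readers unfamiliar with the metric, mAP50 counts a predicted box as a match when its intersection over union (IoU) with a ground-truth box of the same class is at least 0.5. A minimal IoU sketch, using made-up boxes in `(x1, y1, x2, y2)` pixel format:

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical prediction and ground truth, offset by 20 px in each direction.
pred = (100, 100, 200, 200)
truth = (120, 120, 220, 220)
score = iou(pred, truth)
print(f"IoU = {score:.3f}, match at 0.5: {score >= 0.5}")
```

In this example the overlap is 6400 px² against a union of 13600 px², giving an IoU of about 0.471, so a 20 px offset on a 100 px box already fails the 0.5 threshold.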

⚑ The model did not trigger early stopping at 50 epochs, indicating further training could yield additional performance gains.


πŸ’¬ Example Usage

Python (Ultralytics)

from ultralytics import YOLO

# Load model
model = YOLO("best.pt")

# Run inference
results = model("street_image.jpg", imgsz=768, conf=0.25)

# Display results
results[0].show()

# Map class IDs to readable names (matches the Label Classes table)
class_names = {0: "pothole", 1: "road_damage", 2: "garbage"}

# Access detections
for box in results[0].boxes:
    cls = int(box.cls)
    conf = float(box.conf)
    xyxy = box.xyxy[0].tolist()
    print(f"{class_names[cls]}: {conf:.2f} at {xyxy}")

With Test-Time Augmentation (TTA)

# TTA can improve mAP (an estimated +1-3%) at the cost of slower inference
results = model("street_image.jpg", imgsz=768, conf=0.25, augment=True)

Filter Pothole-Only Detections

# Run inference at the training resolution
results = model("street_image.jpg", imgsz=768, conf=0.25)
boxes = results[0].boxes
# Keep only class 0 (pothole) detections
pothole_mask = boxes.cls == 0
pothole_boxes = boxes[pothole_mask]
print(f"Found {len(pothole_boxes)} potholes")

🧩 Intended Use

  • Real-time pothole detection from dashcam, mobile phone, or street-view imagery
  • Automated civic issue reporting: GPS-tagged detections for municipal dashboards
  • Infrastructure health monitoring: severity scoring and trend analysis for road maintenance
  • Smart city integration: Layer 1 detection input for AI-driven civic action systems
  • Mobile deployment: exportable to ONNX for edge inference on mobile devices
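
For the mobile deployment case, a minimal export sketch using the Ultralytics `export` method is shown below. The wrapper `export_to_onnx` is a hypothetical helper, the `best.pt` path follows the usage examples above, and exporting at 768 px is an assumption based on the training resolution.

```python
def export_to_onnx(weights: str = "best.pt", imgsz: int = 768) -> str:
    """Export a trained checkpoint to ONNX and return the exported file path."""
    # Imported lazily so the sketch can be read without Ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.export(format="onnx", imgsz=imgsz)


# Usage (requires Ultralytics and the trained weights):
#   onnx_path = export_to_onnx("best.pt")
```

The exported file can then be run with an ONNX-compatible runtime on the target device; accuracy and latency after export should be checked rather than assumed to match the PyTorch checkpoint.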

⚠️ Limitations

  • The model is optimized for Indian urban road conditions; performance may degrade on highways, rural roads, or non-Indian geographies.
  • Road damage class has visual overlap with potholes, which may cause occasional misclassification between the two.
  • Performance is best on daytime, clear-weather imagery; low-light and rain-occluded scenes may reduce accuracy.
  • Training ran the full 50 epochs without triggering early stopping, which suggests the checkpoint has not fully converged and that further fine-tuning could improve results.
  • Small potholes (< 32px at 768px resolution) may be missed in wide-angle shots.

🧑‍💻 Developer

| Field | Details |
| --- | --- |
| Author | Vansh Momaya |
| Institution | D. J. Sanghvi College of Engineering |
| Focus Area | Computer Vision, Object Detection, AI for Civic Infrastructure |
| Email | vanshmomaya9@gmail.com |

🌍 Citation

If you use PotholeNet-YOLO11m-v1 in your research or project:

@online{momaya2026potholenet,
  author       = {Vansh Momaya},
  title        = {PotholeNet-YOLO11m-v1: Real-Time Pothole and Civic Issue Detection for Indian Urban Roads},
  year         = {2026},
  version      = {v1},
  url          = {https://huggingface.co/Vansh180/PotholeNet-YOLO11m-v1},
  institution  = {D. J. Sanghvi College of Engineering},
  note         = {Fine-tuned YOLO11m model for detecting potholes, road damage, and garbage in Indian street imagery},
  license      = {MIT}
}

🚀 Acknowledgements

  • Ultralytics YOLO11: base architecture and training framework
  • Kaggle: training infrastructure (dual T4 GPU)
  • Aamchi City, Datahack 4: hackathon context and dataset

Built for the Aamchi City AI Civic System (Datahack 4, PS2 Core ML)
