Vietnamese ID Document Detection Model

Abstract

This repository contains a Detectron2-based object detection model trained to localize Vietnamese identity documents in images. The detector predicts a single class, document, and is intended as a document-localization stage before downstream processing such as perspective correction, background removal, or OCR.

Model

  • Architecture: Faster R-CNN with a ResNet-50 FPN backbone (R50-FPN 3x configuration)
  • Framework: Detectron2
  • Detection class: document
  • Input: natural images containing Vietnamese identity documents
  • Output: bounding box coordinates with confidence scores

Training Data

The model was trained on a normalized COCO-format dataset merged from six local sources:

  • archive
  • behind cccd.v6i.coco
  • CCCD Project.v2i.coco
  • cccd.v2i.coco
  • detect cccd.v2i.coco
  • Project2.v2i.coco

Merged dataset summary:

  • Train split: 3354 images / 3347 annotations
  • Empty train images filtered by the trainer: 21
  • Effective annotated train images used for optimization: 3333
  • Validation split: 458 images / 452 annotations
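
The card does not describe how the six sources were normalized and merged. The sketch below shows one plausible approach, assuming "normalized" means reassigning image and annotation IDs and collapsing every source category into the single document class. The function names `merge_coco` and `count_empty_images` are hypothetical and do not come from the source project.

```python
from typing import Dict, List


def merge_coco(datasets: List[Dict]) -> Dict:
    """Merge COCO-style dicts into one dataset with a single 'document' class.

    Assumption: every annotation in every source marks a document, so all
    source category IDs are collapsed to category 1.
    """
    merged = {
        "images": [],
        "annotations": [],
        "categories": [{"id": 1, "name": "document"}],
    }
    next_image_id, next_ann_id = 1, 1
    for ds in datasets:
        id_map = {}  # source image ID -> merged image ID (IDs clash across sources)
        for img in ds.get("images", []):
            id_map[img["id"]] = next_image_id
            merged["images"].append({**img, "id": next_image_id})
            next_image_id += 1
        for ann in ds.get("annotations", []):
            if ann["image_id"] not in id_map:
                continue  # skip annotations that point at a missing image
            merged["annotations"].append({
                **ann,
                "id": next_ann_id,
                "image_id": id_map[ann["image_id"]],
                "category_id": 1,
            })
            next_ann_id += 1
    return merged


def count_empty_images(coco: Dict) -> int:
    """Count images that have no annotation at all."""
    annotated = {ann["image_id"] for ann in coco["annotations"]}
    return sum(1 for img in coco["images"] if img["id"] not in annotated)
```

The empty-image count corresponds to the filtering step in the summary above: 3354 train images minus 21 empty images leaves the 3333 images used for optimization.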

Training Setup

  • Base learning rate: 0.00025
  • Maximum iterations: 18000
  • Learning-rate decay milestones: 14400, 16200
  • Batch size: 2
  • Multi-scale training resize (shortest edge, sampled per image): 640, 672, 704, 736, 768, 800
  • Maximum image size (longest edge): 1333
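
To make these settings concrete, the following minimal sketch reproduces the step learning-rate schedule and the shortest-edge resize arithmetic in plain Python. It assumes a decay factor of 0.1 per milestone, which the card does not state, and it ignores warmup. The helper names `lr_at` and `resized_shape` are illustrative only.

```python
from typing import Tuple

BASE_LR = 0.00025
MAX_ITER = 18000
MILESTONES = (14400, 16200)  # 80% and 90% of MAX_ITER
GAMMA = 0.1                  # assumption: decay factor is not given in the card
TRAIN_SHORT_EDGES = (640, 672, 704, 736, 768, 800)
MAX_SIZE = 1333


def lr_at(iteration: int) -> float:
    """Step decay: scale the base LR by GAMMA once per milestone reached (no warmup)."""
    reached = sum(1 for m in MILESTONES if iteration >= m)
    return BASE_LR * GAMMA ** reached


def resized_shape(height: int, width: int, short_edge: int,
                  max_size: int = MAX_SIZE) -> Tuple[int, int]:
    """Scale so the shortest edge equals short_edge, then cap the longest edge."""
    scale = short_edge / min(height, width)
    new_h, new_w = height * scale, width * scale
    if max(new_h, new_w) > max_size:
        shrink = max_size / max(new_h, new_w)
        new_h, new_w = new_h * shrink, new_w * shrink
    return int(new_h + 0.5), int(new_w + 0.5)
```

For example, a 1080×1920 frame resized with a target shortest edge of 800 would exceed the 1333 cap on its long side, so the cap determines the final scale.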

Evaluation

The final validation metrics below are from the last Detectron2 evaluation at iteration 18000.

| Metric       | Value  |
|--------------|--------|
| AP (bbox)    | 95.151 |
| AP50 (bbox)  | 98.101 |
| AP75 (bbox)  | 98.086 |
| APl (bbox)   | 95.151 |

Raw training and evaluation logs are included in metrics.json.
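
A short sketch for pulling the final evaluation record out of the log follows. It assumes metrics.json uses the JSON Lines layout that Detectron2's JSON writer normally produces (one JSON object per line) and that evaluation keys follow the `bbox/AP` naming convention. Neither assumption has been checked against the file in this repository, and `last_eval_record` is a hypothetical helper.

```python
import json
from typing import Dict, Iterable, Optional


def last_eval_record(lines: Iterable[str], key: str = "bbox/AP") -> Optional[Dict]:
    """Return the last log record that contains the given evaluation key."""
    last = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        if key in record:
            last = record
    return last


# Usage (assumes metrics.json has been downloaded next to this script):
# with open("metrics.json", encoding="utf-8") as f:
#     record = last_eval_record(f)
#     print(record["bbox/AP"] if record else "no evaluation record found")
```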

Intended Use

This model is designed for document localization, especially as a preprocessing step before cropping or OCR. It is optimized for finding the document region, not for classifying document side, document type, or extracting text content.
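
As an illustration of the preprocessing role described above, the sketch below turns detector output into a crop rectangle. It assumes boxes arrive as absolute-pixel `(x1, y1, x2, y2)` tuples with one score each, as Detectron2 predictions are usually laid out. The 0.5 score threshold, the 2% margin, and the name `best_crop_box` are illustrative choices, not values from the source project.

```python
import math
from typing import Optional, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def best_crop_box(
    boxes: Sequence[Box],
    scores: Sequence[float],
    image_size: Tuple[int, int],   # (width, height)
    score_threshold: float = 0.5,  # assumption: not specified in the card
    margin: float = 0.02,          # assumption: extra border as a fraction of box size
) -> Optional[Tuple[int, int, int, int]]:
    """Pick the highest-scoring box above the threshold, pad it, clamp to the image."""
    width, height = image_size
    candidates = [(s, b) for b, s in zip(boxes, scores) if s >= score_threshold]
    if not candidates:
        return None  # no confident document detection
    _, (x1, y1, x2, y2) = max(candidates, key=lambda c: c[0])
    pad_x = (x2 - x1) * margin
    pad_y = (y2 - y1) * margin
    left = max(0, math.floor(x1 - pad_x))
    top = max(0, math.floor(y1 - pad_y))
    right = min(width, math.ceil(x2 + pad_x))
    bottom = min(height, math.ceil(y2 + pad_y))
    return left, top, right, bottom
```

The returned tuple uses the `(left, top, right, bottom)` order expected by common image-cropping calls, so it can be passed to a later cropping or OCR stage. A `None` result signals that no detection cleared the threshold.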

Notes

This repository stores the model artifacts and model card. Training and inference scripts are maintained in the source project repository used to train the model. A TorchScript export artifact may also be included alongside the PyTorch checkpoint.
