Bengali OCR: Word-Level Detection (v3)
Architecture: YOLOv8n | Task: Word-level bounding box detection
Results
| mAP@0.5 | Precision | Recall |
|---|---|---|
| 0.9223 | 0.9533 | 0.8722 |
Training data
- ICDAR 2019 MLT Bengali (real word boxes)
- 6,000 synthetic printed pages (NID/form/paragraph style)
Usage
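from ultralytics import YOLO
from huggingface_hub import hf_hub_download

# Download the detector weights from the Hugging Face Hub
path = hf_hub_download("Sarjinkhan2003/bengali-ocr-detection", "bengali_det.pt")
model = YOLO(path)

# Run detection; conf is the minimum confidence for a box to be kept
results = model.predict("doc.jpg", conf=0.25)
for box in results[0].boxes:
    print(box.xyxy[0].tolist())  # [x1, y1, x2, y2] in pixels, one word per box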
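
The detector returns boxes in no particular reading order, so a downstream recognizer usually needs them grouped into lines first. The snippet below is a minimal sketch of one way to do that, not part of this model or of `ultralytics`: the helper `sort_reading_order` and its `line_tol` parameter are hypothetical names, and the grouping rule (compare each box's vertical centre with the running centre of the current line) is an assumption that may need tuning for skewed scans or multi-column pages.

```python
from typing import List, Sequence

Box = Sequence[float]  # [x1, y1, x2, y2] in pixels, as given by box.xyxy[0].tolist()


def sort_reading_order(boxes: List[Box], line_tol: float = 0.5) -> List[List[Box]]:
    """Group word boxes into text lines (top to bottom), each sorted left to right.

    line_tol is a hypothetical tuning knob: a box joins the current line when
    its vertical centre lies within line_tol * box height of the line's centre.
    """
    lines: List[List[Box]] = []
    # Visit boxes from top to bottom by vertical centre.
    for box in sorted(boxes, key=lambda b: (b[1] + b[3]) / 2):
        y_center = (box[1] + box[3]) / 2
        height = box[3] - box[1]
        if lines:
            current = lines[-1]
            # Mean vertical centre of the boxes already assigned to this line.
            line_y = sum((b[1] + b[3]) / 2 for b in current) / len(current)
            if abs(y_center - line_y) <= line_tol * height:
                current.append(box)
                continue
        lines.append([box])
    # Bengali is written left to right, so order each line by its left edge.
    return [sorted(line, key=lambda b: b[0]) for line in lines]


# Made-up boxes standing in for [b.xyxy[0].tolist() for b in results[0].boxes]
example = [
    [120.0, 12.0, 200.0, 40.0],
    [10.0, 60.0, 90.0, 90.0],
    [10.0, 10.0, 100.0, 38.0],
    [100.0, 62.0, 180.0, 92.0],
]
ordered = sort_reading_order(example)
for line in ordered:
    print(line)
```

With real predictions, the `example` list would be replaced by the boxes collected from `results[0].boxes`; each inner list of `ordered` then corresponds to one text line that can be cropped and passed to a recognizer word by word.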