punnerud/ainm-object-detection

@@ -1,9 +1,71 @@
 ---
 license: mit
 ---
-# NM i AI 2026 - NorgesGruppen Object Detection
-YOLOv8x multi-class detector (356 grocery product categories) with multi-scale TTA.
-Best leaderboard score: **0.9230**

 ---
 license: mit
+tags:
+  - object-detection
+  - yolov8
+  - grocery
+  - retail
+  - onnx
+datasets:
+  - custom
+pipeline_tag: object-detection
 ---
+# NM i AI 2026 — NorgesGruppen Object Detection
+Multi-class YOLOv8x detector for 356 grocery product categories on store shelf images.
+## Performance
+| Method | Leaderboard Score |
+|--------|------------------|
+| Multi-scale TTA (640+960+1280 + flip) | **0.9230** |
+| Single inference | 0.8922 |
+Competition scoring:
+## Model Details
+- **Architecture:** YOLOv8x (68.5M parameters)
+- **Classes:** 356 grocery product categories
+- **Training data:** 248 shelf images with 22,731 annotations in COCO format
+- **Training resolution:** 1280px
+- **Export format:** ONNX (dynamic input, 262 MB)
+- **Inference:** Multi-scale TTA at 640/960/1280px with horizontal flip, merged with Weighted Boxes Fusion (WBF)
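The box bookkeeping behind flip TTA can be sketched as follows. This is a minimal illustration, not the submission's actual pipeline: the helper names `unflip_boxes` and `to_normalized` are hypothetical, boxes are assumed to be `[x1, y1, x2, y2]` in pixels, and the example image size is made up. Boxes predicted on a mirrored image are mapped back to the original frame, then normalized to the 0..1 range, which is the input convention of `weighted_boxes_fusion` in `ensemble-boxes`.

```python
def unflip_boxes(boxes, width):
    """Map [x1, y1, x2, y2] boxes predicted on a horizontally
    flipped image back to the un-flipped image frame."""
    return [[width - x2, y1, width - x1, y2] for x1, y1, x2, y2 in boxes]


def to_normalized(boxes, width, height):
    """Scale pixel boxes to the 0..1 range so predictions from
    different input resolutions share one coordinate frame."""
    return [
        [x1 / width, y1 / height, x2 / width, y2 / height]
        for x1, y1, x2, y2 in boxes
    ]


# Assumed example: one box predicted on the flipped view of a
# 960x640 (width x height) resize of the image.
flipped_pred = [[96.0, 64.0, 288.0, 320.0]]
restored = unflip_boxes(flipped_pred, width=960)
normalized = to_normalized(restored, width=960, height=640)
print(restored)
print(normalized)
```

Once every scale and flip variant is in the same normalized frame, the per-variant box lists can be passed together to a fusion step such as WBF.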
+## Training
+- Pretrained on COCO (YOLOv8x), fine-tuned on competition data
+- Optimizer: AdamW (lr=0.01, weight_decay=0.0005, cosine LR schedule)
+- Augmentation: mosaic, mixup (0.2), copy-paste (0.15), perspective, rotation (±15°)
+- 300 epochs at 1280px, batch=2 on NVIDIA A100 40GB
+- Model soup: averaged the weights of checkpoints from epochs 240-290 for better generalization
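Uniform weight averaging, the idea behind a model soup, can be sketched in a few lines. This is a toy illustration under stated assumptions: `average_checkpoints` is a hypothetical helper, plain Python lists stand in for weight tensors, and the three checkpoints are made up. Which checkpoints in the 240-290 range were actually averaged, and how, is not specified here.

```python
def average_checkpoints(checkpoints):
    """Uniformly average parameters that share a name across checkpoints.

    Each checkpoint is a dict mapping a parameter name to a flat list
    of floats (a stand-in for a real weight tensor).
    """
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    n = len(checkpoints)
    soup = {}
    for name in checkpoints[0]:
        # Group the i-th value of this parameter across all checkpoints.
        columns = zip(*(ckpt[name] for ckpt in checkpoints))
        soup[name] = [sum(values) / n for values in columns]
    return soup


# Made-up checkpoints with identical parameter names and shapes.
toy_checkpoints = [
    {"conv.weight": [1.0, 2.0], "conv.bias": [0.0]},
    {"conv.weight": [2.0, 4.0], "conv.bias": [3.0]},
    {"conv.weight": [3.0, 6.0], "conv.bias": [6.0]},
]
soup = average_checkpoints(toy_checkpoints)
print(soup)
```

The averaged dictionary has the same keys and shapes as any single checkpoint, so it can be loaded and exported like one; averaging only makes sense for checkpoints from the same training run or a shared initialization.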
+## Submission Contents
+ contains:
+-  — YOLOv8x model soup, dynamic input (262 MB)
+-  — YOLO class → COCO category_id mapping
+-  — Multi-scale TTA inference pipeline
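How a class mapping of this kind is typically applied can be sketched as below. The sketch assumes the output follows the COCO detection results layout, where `bbox` is `[x, y, width, height]`; the helper `to_coco_result` and the mapping values are hypothetical and do not come from the submission files.

```python
def to_coco_result(image_id, box_xyxy, score, yolo_class, class_to_category):
    """Convert one [x1, y1, x2, y2] detection into a COCO-style result dict."""
    x1, y1, x2, y2 = box_xyxy
    return {
        "image_id": image_id,
        # Translate the model's class index into the dataset's category id.
        "category_id": class_to_category[yolo_class],
        # COCO results store boxes as [x, y, width, height].
        "bbox": [x1, y1, x2 - x1, y2 - y1],
        "score": score,
    }


# Made-up mapping: YOLO class index -> COCO category_id.
class_to_category = {0: 101, 1: 205}
result = to_coco_result(7, [10.0, 20.0, 50.0, 80.0], 0.9, 1, class_to_category)
print(result)
```

Keeping the mapping in a separate file means the model can use contiguous class indices during training while the output still carries the dataset's own category ids.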
+## Usage
+## Sandbox Environment
+- GPU: NVIDIA L4, 24 GB VRAM
+- Runtime: ~113s for test set (300s timeout)
+- Dependencies: onnxruntime-gpu, opencv, numpy, ensemble-boxes
+## Key Learnings
+1. Multi-class YOLO (detection and classification in one step) substantially outperformed a two-stage pipeline (detector + kNN classifier)
+2. Multi-scale TTA raised the leaderboard score by about 0.031 (0.8922 to 0.9230) by better detecting small products
+3. Model soup (weight averaging) improves generalization
+4. When training on all data, higher validation mAP does NOT predict a better leaderboard score, since the validation images are then part of the training set
+5. Dynamic ONNX export required for multi-scale inference
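Why multi-scale inference needs a dynamic input can be seen from the input shapes involved. The sketch below is an assumption-laden illustration, not the submission's preprocessing: `tta_input_shape` is a hypothetical helper, the 4000x3000 source image is made up, and it assumes the longer side is resized to the target and each side is then padded up to a multiple of the model's maximum stride (32 for YOLOv8).

```python
def round_up(value, multiple):
    """Round a positive integer up to the nearest multiple."""
    return -(-value // multiple) * multiple


def tta_input_shape(height, width, target, stride=32):
    """Input (height, width) when the longer side is resized to `target`
    and both sides are padded up to a multiple of `stride`."""
    longer = max(height, width)
    resized_h = round(height * target / longer)
    resized_w = round(width * target / longer)
    return round_up(resized_h, stride), round_up(resized_w, stride)


# Assumed 4000x3000 (width x height) shelf photo at the three TTA scales.
shapes = [tta_input_shape(3000, 4000, t) for t in (640, 960, 1280)]
print(shapes)
```

Each scale yields a different tensor shape, so a model exported with one fixed input size could serve only one of them; a dynamic-axis export accepts all three.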
+## License
+MIT