punnerud committed
Commit 991fed2 · verified · 1 Parent(s): 0fda9a5

Upload README.md with huggingface_hub

  1. README.md +65 -3
README.md CHANGED
@@ -1,9 +1,71 @@
  ---
  license: mit
  ---

- # NM i AI 2026 - NorgesGruppen Object Detection

- YOLOv8x multi-class detector (356 grocery product categories) with multi-scale TTA.

- Best leaderboard score: **0.9230**
  ---
  license: mit
+ tags:
+ - object-detection
+ - yolov8
+ - grocery
+ - retail
+ - onnx
+ datasets:
+ - custom
+ pipeline_tag: object-detection
  ---

+ # NM i AI 2026 NorgesGruppen Object Detection

+ Multi-class YOLOv8x detector for 356 grocery product categories on store shelf images.

+ ## Performance
+
+ | Method | Leaderboard Score |
+ |--------|------------------|
+ | Multi-scale TTA (640+960+1280 + flip) | **0.9230** |
+ | Single inference | 0.8922 |
+
+ Competition scoring:
+
+ ## Model Details
+
+ - **Architecture:** YOLOv8x (68.5M parameters)
+ - **Classes:** 356 grocery product categories
+ - **Training data:** 248 shelf images, 22,731 COCO annotations
+ - **Training resolution:** 1280px
+ - **Export format:** ONNX (dynamic input, 262 MB)
+ - **Inference:** Multi-scale TTA at 640/960/1280px with horizontal flip + WBF fusion
+
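The inference bullet above combines predictions from several passes (three scales, each with and without a horizontal flip) using weighted box fusion (WBF). The sketch below illustrates the idea in plain Python for a single class. It is a simplified stand-in, not the pipeline from this commit: the function names, the normalized `(x1, y1, x2, y2)` box layout, and the `iou_thr=0.55` default are assumptions, and the dependency list suggests the real pipeline relies on the `ensemble-boxes` package instead.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), normalized to [0, 1]


def unflip_horizontal(box: Box) -> Box:
    """Map a box predicted on a horizontally flipped image back to the original."""
    x1, y1, x2, y2 = box
    return (1.0 - x2, y1, 1.0 - x1, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def fuse_boxes(preds: List[Tuple[Box, float]], n_passes: int,
               iou_thr: float = 0.55) -> List[Tuple[Box, float]]:
    """Simplified single-class weighted box fusion.

    preds holds (box, score) pairs pooled from all TTA passes.
    """
    clusters: List[List[Tuple[Box, float]]] = []
    fused: List[Tuple[Box, float]] = []
    for box, score in sorted(preds, key=lambda p: p[1], reverse=True):
        # Find the fused box that overlaps this prediction the most.
        best, best_iou = -1, iou_thr
        for i, (fbox, _) in enumerate(fused):
            overlap = iou(box, fbox)
            if overlap > best_iou:
                best, best_iou = i, overlap
        if best < 0:
            clusters.append([(box, score)])
            fused.append((box, score))
            continue
        # Recompute the fused box as the score-weighted mean of its members.
        clusters[best].append((box, score))
        members = clusters[best]
        total = sum(s for _, s in members)
        avg_box = tuple(sum(b[k] * s for b, s in members) / total for k in range(4))
        fused[best] = (avg_box, total / len(members))
    # Down-weight boxes that only a few passes agreed on.
    return [(b, s * min(len(c), n_passes) / n_passes)
            for (b, s), c in zip(fused, clusters)]
```

Unlike non-maximum suppression, which keeps one box per overlap group and discards the rest, this fusion averages the overlapping boxes, so a product found at several scales contributes all of its localizations to the final box.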
+ ## Training
+
+ - Pretrained on COCO (YOLOv8x), fine-tuned on competition data
+ - Optimizer: AdamW (lr=0.01, weight_decay=0.0005, cosine LR)
+ - Augmentation: mosaic, mixup (0.2), copy-paste (0.15), perspective, rotation (±15°)
+ - 300 epochs at 1280px, batch=2 on NVIDIA A100 40GB
+ - Model soup: weight averaging of epochs 240-290 for better generalization
+
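The model soup step in the list above can be pictured as an element-wise mean over several checkpoints of the same architecture. The following is a minimal sketch under stated assumptions: checkpoints are represented as plain dictionaries mapping parameter names to flat lists of floats, and `average_checkpoints` is a hypothetical helper. The README does not say which checkpoints between epochs 240 and 290 were averaged or how they were loaded, so none of that is reproduced here.

```python
from typing import Dict, List

Checkpoint = Dict[str, List[float]]  # parameter name -> flattened weights


def average_checkpoints(checkpoints: List[Checkpoint]) -> Checkpoint:
    """Return the element-wise mean of several checkpoints (a uniform soup)."""
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    keys = checkpoints[0].keys()
    for ckpt in checkpoints:
        # All checkpoints must come from the same architecture.
        if ckpt.keys() != keys:
            raise ValueError("checkpoints have different parameter names")
    n = len(checkpoints)
    return {
        name: [sum(values) / n for values in zip(*(c[name] for c in checkpoints))]
        for name in keys
    }


# Two toy checkpoints with a single two-element parameter each.
soup = average_checkpoints([{"w": [1.0, 3.0]}, {"w": [3.0, 5.0]}])
```

With real weights the same mean would be taken over floating-point tensors, and integer buffers such as batch-norm step counters would normally be copied rather than averaged.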
+ ## Submission Contents
+
+ The submission contains:
+ - YOLOv8x model soup, dynamic input (262 MB)
+ - YOLO class → COCO category_id mapping
+ - Multi-scale TTA inference pipeline
+
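The class mapping listed above is needed because YOLO predicts contiguous class indices, while COCO-style annotations use their own `category_id` values. A minimal sketch of applying such a mapping is shown below. The function name, the example ids, and the assumption that results follow the standard COCO detection-results layout (`image_id`, `category_id`, `bbox` as `[x, y, width, height]`, `score`) are illustrative and not taken from the submission.

```python
from typing import Dict, List, Tuple

# (class index, score, (x1, y1, x2, y2)) in pixel coordinates
Detection = Tuple[int, float, Tuple[float, float, float, float]]


def to_coco_results(image_id: int, detections: List[Detection],
                    class_to_category: Dict[int, int]) -> List[dict]:
    """Convert corner-format detections to COCO-style result records."""
    results = []
    for cls, score, (x1, y1, x2, y2) in detections:
        results.append({
            "image_id": image_id,
            "category_id": class_to_category[cls],  # YOLO index -> COCO id
            "bbox": [x1, y1, x2 - x1, y2 - y1],     # COCO uses [x, y, w, h]
            "score": score,
        })
    return results


# Hypothetical mapping: YOLO class 0 -> category 17, class 1 -> category 42.
records = to_coco_results(7, [(1, 0.9, (10.0, 20.0, 50.0, 80.0))], {0: 17, 1: 42})
```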
+ ## Usage
+
+
+
+ ## Sandbox Environment
+
+ - GPU: NVIDIA L4, 24 GB VRAM
+ - Runtime: ~113s for test set (300s timeout)
+ - Dependencies: onnxruntime-gpu, opencv, numpy, ensemble-boxes
+
+ ## Key Learnings
+
+ 1. Multi-class YOLO (detect + classify in one step) massively outperformed a two-stage pipeline (detector + kNN classifier)
+ 2. Multi-scale TTA improved the leaderboard score by 0.031 (0.8922 → 0.9230) by better detecting small products
+ 3. Model soup (weight averaging) improves generalization
+ 4. Higher validation mAP does NOT predict better leaderboard score when training on all data
+ 5. Dynamic ONNX export is required for multi-scale inference
+
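The last point follows from tensor shapes: a fixed-shape ONNX graph accepts exactly one input size, whereas the three TTA scales each produce a different one. The sketch below computes the input shape per scale, assuming the longest image side is resized to the target and both sides are padded up to a multiple of 32 (YOLOv8's largest stride). Whether the actual pipeline letterboxes this way is not stated in the README, and `letterbox_shape` is a hypothetical helper.

```python
import math
from typing import Tuple


def letterbox_shape(height: int, width: int, target: int,
                    stride: int = 32) -> Tuple[int, int, int, int]:
    """Return the NCHW input shape for one image at one TTA scale."""
    scale = target / max(height, width)
    new_h, new_w = round(height * scale), round(width * scale)
    # Pad each side up to the next multiple of the network stride.
    pad_h = math.ceil(new_h / stride) * stride
    pad_w = math.ceil(new_w / stride) * stride
    return (1, 3, pad_h, pad_w)


# A 3000x4000 (height x width) shelf photo at the three TTA scales.
shapes = [letterbox_shape(3000, 4000, t) for t in (640, 960, 1280)]
```

Here the three scales yield three distinct height and width pairs, which is why a single exported model needs dynamic spatial axes to serve all of them.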
+ ## License
+
+ MIT