flyingbertman committed on
Commit eafa15a · verified · 1 Parent(s): b7c4af3

Update README.md

Files changed (1)
  1. README.md +83 -0
README.md CHANGED
@@ -1,3 +1,86 @@
  ---
  license: apache-2.0
+ library_name: onnx
+ tags:
+ - face-detection
+ - face-landmark
+ - mediapipe
+ - blazeface
+ - onnx
+ base_model: google/mediapipe
+ pipeline_tag: object-detection
  ---
+
+ # MediaPipe Face: Detection + 6-point Landmarks (ONNX)
+
+ A commercially usable face pipeline that bundles both precision variants of Google's MediaPipe Face Detection and Face Landmark models. The entire chain is Apache-2.0: Google MediaPipe → zmurez/MediaPipePyTorch port → Qualcomm AI Hub ONNX export.
+
+ Two-stage pipeline:
+
+ 1. **Face detector** (BlazeFace-derived): finds face bounding boxes.
+ 2. **Face landmark detector**: for each detected face, returns 6 keypoints (left eye, right eye, nose tip, mouth, left ear tragion, right ear tragion).
+
+ Re-hosted under Heliosoph as a single bundled repo for convenience; the float and int8 variants live in separate subfolders.
+
+ Credit: Google MediaPipe team (original models), Zak Murez (PyTorch port), Qualcomm AI Hub (ONNX export).
+
+ ## What this repo contains
+
+ ```
+ float/                            # fp32, recommended default
+   face_detector.onnx              # 78 KB graph
+   face_detector.data              # 517 KB external weights
+   face_landmark_detector.onnx     # 58 KB graph
+   face_landmark_detector.data     # 2.4 MB external weights
+   metadata.json
+ int8/                             # W8A8, quantized, smaller/faster
+   face_detector.onnx
+   face_detector.data
+   face_landmark_detector.onnx
+   face_landmark_detector.data
+   metadata.json
+ LICENSE
+ README.md
+ ```
+
+ **Important: external weights pattern.** Each `.onnx` file is paired with a `.data` file holding the actual tensor weights. Both files must be in the same directory at load time, because ONNX Runtime resolves the `.data` file by a path relative to the `.onnx` file.
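
Because a missing `.data` file only surfaces as an error when the session is created, it can help to check for it up front. The following is a minimal sketch, not part of the upstream repo: the helper name `missing_external_data` is hypothetical, and it assumes the weights file shares the graph's file stem, as the listing above suggests.

```python
from pathlib import Path


def missing_external_data(onnx_path):
    """Return the expected `.data` path if it is absent, otherwise None.

    ASSUMPTION: the external weights share the graph's stem, as in the
    listing above (face_detector.onnx -> face_detector.data).
    """
    sidecar = Path(onnx_path).with_suffix(".data")
    return None if sidecar.is_file() else sidecar


# Report the state of both float models relative to the current directory.
for name in ("face_detector", "face_landmark_detector"):
    missing = missing_external_data(Path("float") / f"{name}.onnx")
    print(name, "ok" if missing is None else f"missing {missing}")
```

Running the check before `ort.InferenceSession(...)` turns a partially downloaded variant into a readable message instead of a loader error.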
+
+ ## How to use
+
+ ```python
+ import onnxruntime as ort
+ import numpy as np
+
+ # Stage 1: detect faces
+ detector = ort.InferenceSession("float/face_detector.onnx")
+ # Input: 128×128 RGB, normalized to [-1, 1].
+ # `preprocessed_128x128_image` is a NumPy array you prepare yourself.
+ detections = detector.run(None, {"image": preprocessed_128x128_image})
+
+ # Stage 2: landmark each detected face
+ landmarker = ort.InferenceSession("float/face_landmark_detector.onnx")
+ # Input: 192×192 RGB crop around each detected face.
+ # `face_crop_192x192` is likewise a NumPy array you prepare yourself.
+ landmarks = landmarker.run(None, {"image": face_crop_192x192})
+ ```
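
The snippet above leaves the input arrays undefined. As a rough sketch of the "RGB, normalized to [-1, 1]" step, the hypothetical helper `to_model_input` below resizes a `uint8` image and rescales it. The nearest-neighbour resize and the NCHW layout with a batch axis are assumptions made for illustration; confirm the layout the model actually expects with `session.get_inputs()[0].shape`.

```python
import numpy as np


def to_model_input(image_rgb, size):
    """Resize an HxWx3 uint8 RGB image to size x size and scale it to [-1, 1].

    Nearest-neighbour resizing keeps the sketch dependency-free; a real
    pipeline would more likely resize with OpenCV or Pillow.
    """
    height, width, _ = image_rgb.shape
    rows = np.arange(size) * height // size   # source row for each output row
    cols = np.arange(size) * width // size    # source column for each output column
    resized = image_rgb[rows[:, None], cols[None, :]]   # shape (size, size, 3)
    scaled = resized.astype(np.float32) / 127.5 - 1.0   # 0..255 -> -1..1
    # ASSUMPTION: NCHW layout with a leading batch axis.
    return scaled.transpose(2, 0, 1)[np.newaxis]


# Stand-in for a decoded camera frame (random pixels, 480 rows x 640 columns).
frame = np.random.default_rng(0).integers(0, 256, size=(480, 640, 3), dtype=np.uint8)
detector_input = to_model_input(frame, 128)
print(detector_input.shape, detector_input.dtype)
```

The same helper with `size=192` would prepare a face crop for the landmark stage, under the same layout assumption.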
+
+ For reference preprocessing and decoding, see [zmurez/MediaPipePyTorch](https://github.com/zmurez/MediaPipePyTorch), which has the canonical Python implementation.
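
Between the two stages, each detection has to become a crop for the landmark model. The sketch below shows one plausible way to do that. It is not the reference decoding: the function name `square_crop_box`, the normalized `(xmin, ymin, xmax, ymax)` box format, and the 25% margin are all assumptions for illustration.

```python
def square_crop_box(box, image_width, image_height, margin=0.25):
    """Turn a normalized face box into a square pixel crop, clamped to the image.

    ASSUMPTION: `box` is (xmin, ymin, xmax, ymax) with values in [0, 1].
    The margin widens the square so the crop keeps some context around the face.
    """
    xmin, ymin, xmax, ymax = box
    center_x = (xmin + xmax) / 2 * image_width
    center_y = (ymin + ymax) / 2 * image_height
    # Use the longer side so the whole face fits inside the square.
    side = max((xmax - xmin) * image_width, (ymax - ymin) * image_height)
    half = side * (1 + margin) / 2
    left = max(0, round(center_x - half))
    top = max(0, round(center_y - half))
    right = min(image_width, round(center_x + half))
    bottom = min(image_height, round(center_y + half))
    return left, top, right, bottom


left, top, right, bottom = square_crop_box((0.25, 0.25, 0.75, 0.75), 400, 400)
print(left, top, right, bottom)
```

Clamping can make the crop non-square near the image border; padding the crop instead of clamping is a common alternative when the aspect ratio matters.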
+
+ ## float vs int8: which to pick
+
+ | Variant | Size | Best for |
+ |---|---|---|
+ | `float/` (default) | ~3 MB | GPU, maximum accuracy. Recommended for general use. |
+ | `int8/` | ~1.5 MB | CPU, NPU (OpenVINO), mobile. Some accuracy loss on small or distant faces; near-identical on close portraits. |
+
+ Catalog entries: `mediapipe-face` (float) and `mediapipe-face-int8` (int8). Both reference this single repo with different `include` patterns.
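
The catalog's `include` syntax is not shown in this README, so the following is only a sketch of how glob-style patterns could split the one repo into two entries. The pattern values in `INCLUDE` are hypothetical; the file list is taken from the repo listing.

```python
from fnmatch import fnmatch

# Files in this repo, as listed under "What this repo contains".
MODEL_FILES = [
    "face_detector.onnx",
    "face_detector.data",
    "face_landmark_detector.onnx",
    "face_landmark_detector.data",
    "metadata.json",
]
REPO_FILES = (
    [f"float/{name}" for name in MODEL_FILES]
    + [f"int8/{name}" for name in MODEL_FILES]
    + ["LICENSE", "README.md"]
)

# ASSUMPTION: hypothetical include patterns; the real catalog entries may differ.
INCLUDE = {
    "mediapipe-face": ["float/*", "LICENSE"],
    "mediapipe-face-int8": ["int8/*", "LICENSE"],
}


def select_files(entry):
    """Return the repo files matched by the entry's include patterns."""
    patterns = INCLUDE[entry]
    return [path for path in REPO_FILES if any(fnmatch(path, p) for p in patterns)]


for entry in INCLUDE:
    print(entry, select_files(entry))
```

If you download with `huggingface_hub`, the `allow_patterns` argument of `snapshot_download` accepts similar glob patterns, so a single variant can be fetched without the other.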
+
+ ## Why MediaPipe over alternatives
+
+ - **vs InsightFace SCRFD**: SCRFD's released weights are limited to non-commercial research use (WIDER FACE dataset terms). MediaPipe was trained by Google on commercial-friendly data and released under permissive terms.
+ - **vs YuNet**: YuNet is also encumbered by the WIDER FACE terms, although upstream does not surface that. MediaPipe's licensing is unambiguous.
+ - **vs YOLOv8-Face**: YOLOv8 is released by Ultralytics under AGPL-3.0, while MediaPipe is Apache-2.0.
+
+ If you need higher accuracy on small faces in dense scenes (crowd photos, surveillance angles), MediaPipe will underperform RetinaFace-class detectors. For the common case (close portraits, video conferencing, photo tagging), MediaPipe is the right default.
+
+ ## License
+
+ **Apache-2.0**, the same as upstream (Google MediaPipe). A `LICENSE` file is included, and the chain of attribution is documented above.