Update README.md
README.md
CHANGED
@@ -1,3 +1,86 @@
---
license: apache-2.0
library_name: onnx
tags:
- face-detection
- face-landmark
- mediapipe
- blazeface
- onnx
base_model: google/mediapipe
pipeline_tag: object-detection
---

# MediaPipe Face: Detection + 6-point Landmarks (ONNX)

Commercially usable face pipeline bundling both precision variants (float and int8) of Google's MediaPipe Face Detection and Face Landmark models. The entire chain is Apache-2.0: Google MediaPipe → zmurez/MediaPipePyTorch port → Qualcomm AI Hub ONNX export.

Two-stage pipeline:

1. **Face detector** (BlazeFace-derived): finds face bounding boxes
2. **Face landmark detector**: for each detected face, returns 6 keypoints (left eye, right eye, nose tip, mouth, left ear tragion, right ear tragion)

Re-hosted under Heliosoph as a single bundled repo for convenience; the float and int8 variants live in separate subfolders.

Credit: Google MediaPipe team (original models), Zak Murez (PyTorch port), Qualcomm AI Hub (ONNX export).

## What this repo contains

```
float/                            # fp32, recommended default
  face_detector.onnx              # 78 KB graph
  face_detector.data              # 517 KB external weights
  face_landmark_detector.onnx     # 58 KB graph
  face_landmark_detector.data     # 2.4 MB external weights
  metadata.json
int8/                             # W8A8 quantized, smaller/faster
  face_detector.onnx
  face_detector.data
  face_landmark_detector.onnx
  face_landmark_detector.data
  metadata.json
LICENSE
README.md
```

**Important: external weights pattern.** Each `.onnx` file is paired with a `.data` file holding the actual tensor weights. Both files must be in the same directory at load time, because ONNX Runtime resolves the `.data` file by relative path from the `.onnx`.

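Because a missing `.data` file only shows up as an error at load time, it can help to check for it first. The following is a minimal sketch, assuming the `.data` file shares its base name with the `.onnx` file as in the listing above; `missing_external_weights` is a hypothetical helper, not part of ONNX Runtime.

```python
from pathlib import Path


def missing_external_weights(onnx_path):
    """Return the expected sibling .data path if it is absent, else None.

    Assumes the naming shown in the repo listing:
    face_detector.onnx <-> face_detector.data in the same directory.
    """
    onnx_path = Path(onnx_path)
    data_path = onnx_path.with_suffix(".data")
    return None if data_path.is_file() else data_path
```

Calling `missing_external_weights("float/face_detector.onnx")` before creating the session gives a clear path to report when only the graph file was copied.
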
## How to use

```python
import onnxruntime as ort
import numpy as np

# Stage 1: detect faces
detector = ort.InferenceSession("float/face_detector.onnx")
# Input: 128×128 RGB, normalized to [-1, 1].
# `preprocessed_128x128_image` is a float32 array you prepare beforehand;
# check detector.get_inputs()[0].shape for the expected layout.
detections = detector.run(None, {"image": preprocessed_128x128_image})

# Stage 2: landmark each detected face
landmarker = ort.InferenceSession("float/face_landmark_detector.onnx")
# Input: 192×192 RGB crop around each detected face.
# `face_crop_192x192` is likewise an array you crop and preprocess yourself.
landmarks = landmarker.run(None, {"image": face_crop_192x192})
```

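The `[-1, 1]` normalization mentioned in the snippet can be sketched as follows. This is an illustration only: `to_model_input` is a hypothetical helper, the channels-first layout is an assumption that should be checked against `detector.get_inputs()[0].shape`, and resizing to 128×128 is left to the image library of your choice.

```python
import numpy as np


def to_model_input(rgb_uint8, channels_first=True):
    """Scale an HxWx3 uint8 RGB image to float32 in [-1, 1] and add a batch axis.

    channels_first=True yields shape (1, 3, H, W); this layout is an
    assumption, so verify it against the model's declared input shape.
    """
    x = rgb_uint8.astype(np.float32) / 127.5 - 1.0
    if channels_first:
        x = np.transpose(x, (2, 0, 1))  # HWC -> CHW
    return x[np.newaxis, ...]
```
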
Reference preprocessing and decoding: the canonical Python implementation is in [zmurez/MediaPipePyTorch](https://github.com/zmurez/MediaPipePyTorch).

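For the stage 2 crop, one simple approach is to expand each detected box into a square region before resizing it to 192×192. The sketch below assumes boxes are already decoded to pixel coordinates; `square_crop_box` and its `margin` default are hypothetical choices for illustration, and the reference implementation linked above uses its own region-of-interest logic.

```python
def square_crop_box(x0, y0, x1, y1, img_w, img_h, margin=0.25):
    """Return an integer (left, top, right, bottom) crop around a face box.

    The box is made square around its center, enlarged by `margin`,
    then clamped to the image bounds (so it may be non-square at edges).
    """
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    half = max(x1 - x0, y1 - y0) * (1.0 + margin) / 2.0
    left = max(0, int(round(cx - half)))
    top = max(0, int(round(cy - half)))
    right = min(img_w, int(round(cx + half)))
    bottom = min(img_h, int(round(cy + half)))
    return left, top, right, bottom
```

The returned box can be used to slice an image array as `image[top:bottom, left:right]` before resizing.
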
## float vs int8: which to pick

| Variant | Size | Best for |
|---|---|---|
| `float/` (default) | ~3 MB | GPU, max accuracy. Recommended general default. |
| `int8/` | ~1.5 MB | CPU, NPU (OpenVINO), mobile. Some accuracy loss on small/distant faces; near-identical on close portraits. |

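The hardware targets in the table map naturally onto ONNX Runtime execution providers. As a rough sketch, assuming CUDA for the float variant and OpenVINO for the int8 variant (`pick_providers` is a hypothetical helper; the provider names are the standard ONNX Runtime ones):

```python
def pick_providers(variant, available):
    """Order execution providers for a variant, keeping CPU as the fallback."""
    preferred = {
        "float": ["CUDAExecutionProvider"],
        "int8": ["OpenVINOExecutionProvider"],
    }[variant]
    # Keep only accelerators that this ONNX Runtime build actually offers.
    chosen = [p for p in preferred if p in available]
    return chosen + ["CPUExecutionProvider"]
```

In practice the second argument would come from `ort.get_available_providers()`, and the result would be passed as `providers=` to `ort.InferenceSession`.
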
Catalog entries: `mediapipe-face` (float) and `mediapipe-face-int8` (int8). Both reference this single repo with different `include` patterns.

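Outside the catalog, a similar per-variant selection can be approximated with `huggingface_hub`, whose `snapshot_download` accepts `allow_patterns`. This is a sketch under assumptions: `variant_patterns` and `download_variant` are hypothetical helpers, the patterns are a guess at what each catalog entry includes, and the `repo_id` must be supplied by the caller.

```python
def variant_patterns(variant):
    """Glob patterns selecting one precision variant plus the shared files."""
    if variant not in ("float", "int8"):
        raise ValueError(f"unknown variant: {variant!r}")
    return [f"{variant}/*", "LICENSE", "README.md"]


def download_variant(repo_id, variant):
    """Download only one variant's files; returns the local snapshot path."""
    # Imported lazily so the pattern helper works without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, allow_patterns=variant_patterns(variant))
```
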
## Why MediaPipe over alternatives

- **vs InsightFace SCRFD**: SCRFD's released weights are restricted to non-commercial research use (WIDER FACE dataset terms). MediaPipe was trained by Google on commercial-friendly data and released under permissive terms.
- **vs YuNet**: YuNet is technically also encumbered by WIDER FACE; upstream just doesn't surface that. MediaPipe's licensing is unambiguous.
- **vs YOLOv8-Face**: Ultralytics licenses it under AGPL-3.0. MediaPipe is Apache-2.0.

If you need higher accuracy on small faces in dense scenes (crowd photos, surveillance angles), MediaPipe will underperform RetinaFace-class detectors. For the common case (close portraits, video conferencing, photo tagging), MediaPipe is the right default.

## License

**Apache-2.0**, same as upstream (Google MediaPipe). A `LICENSE` file is included; the chain of attribution is documented above.