	---
	license: other
	tags:
	- onnx
	- sapiens2
	- cpu
	- segmentation
	- pose-estimation
	- normal-estimation
	- depth-estimation
	---

	# Sapiens2 ONNX

	CPU-friendly ONNX exports of Meta's `facebook/sapiens2-*` models: 15 models covering 4 tasks and 4 sizes (pose-5b is not shipped).

	## Folder layout

	Each task has its own folder. Each model is split into a small `.onnx` graph file plus a `.onnx.data` external-data sidecar. Both files must sit in the same directory when the model is loaded, so download them side by side.

	| Task | 0.4b | 0.8b | 1b | 5b |
	|---|---|---|---|---|
	| seg | `seg/seg_0.4b_fp16.onnx` (777 MB, fp16) | `seg/seg_0.8b_fp32.onnx` (3.3 GB) | `seg/seg_1b_fp32.onnx` (5.9 GB) | `seg/seg_5b_int8.onnx` (5.2 GB) |
	| normal | `normal/normal_0.4b_fp32.onnx` (1.7 GB) | `normal/normal_0.8b_fp32.onnx` (3.5 GB) | `normal/normal_1b_fp32.onnx` (6.2 GB) | `normal/normal_5b_int8.onnx` (6.1 GB) |
	| pointmap | `pointmap/pointmap_0.4b_fp32.onnx` (2.0 GB) | `pointmap/pointmap_0.8b_fp32.onnx` (3.9 GB) | `pointmap/pointmap_1b_fp32.onnx` (6.5 GB) | `pointmap/pointmap_5b_int8.onnx` (6.2 GB) |
	| pose | `pose/pose_0.4b_fp32.onnx` (1.6 GB) | `pose/pose_0.8b_fp32.onnx` (3.4 GB) | `pose/pose_1b_fp32.onnx` (6.1 GB) | not shipped |
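
The file names in the table follow a `{task}/{task}_{size}_{precision}.onnx` pattern, so the graph and sidecar paths can be derived rather than typed by hand. The sketch below is only an illustration: `PRECISION` and `model_files` are hypothetical names, and the precision map is transcribed from the table above.

```python
# Precision suffix per shipped model, transcribed from the table above.
TASKS = ("seg", "normal", "pointmap", "pose")
PRECISION = {}
for task in TASKS:
    for size in ("0.4b", "0.8b", "1b"):
        PRECISION[(task, size)] = "fp32"
    PRECISION[(task, "5b")] = "int8"
PRECISION[("seg", "0.4b")] = "fp16"   # the only fp16 export
del PRECISION[("pose", "5b")]         # not shipped


def model_files(task, size):
    """Return (graph_path, sidecar_path) inside the repo for a shipped model.

    Raises KeyError for combinations that are not shipped, such as pose 5b.
    """
    precision = PRECISION[(task, size)]
    graph = f"{task}/{task}_{size}_{precision}.onnx"
    return graph, graph + ".data"
```

Both returned paths can then be passed to `hf_hub_download` with the same `local_dir`, which keeps the graph and its sidecar together.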

	Cosine similarity vs the PyTorch reference is 0.999 or better on every shipped file.
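
This README does not say exactly how the similarity was measured. One common approach, assumed here, is to flatten both output tensors and compare them as single vectors. `cosine_similarity` is a hypothetical helper, not part of this repo.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two arrays, flattened to 1-D.

    Accumulates in float64 so the comparison itself adds no fp16/fp32 error.
    """
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```

To run the check, pass the ONNX output and the PyTorch output for the same input, for example `cosine_similarity(onnx_out, torch_out.numpy())`.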

	## Precision notes

	* seg-0.4b is fp16: about 50 percent smaller than fp32, with a verified cosine similarity of 0.99999.
	* The 0.4b, 0.8b, and 1b models for normal, pointmap, and pose are fp32. A naive fp16 cast either produces NaN (normal: the L2-normalize step divides by a near-zero norm) or drops cosine similarity to around 0.7 (pointmap: metric scale; pose: sigmoid heatmaps saturate).
	* The 5b variants are INT8 (per-channel symmetric, MatMulIntegerToFloat).
	* pose-5b is not shipped: the int8 quantization attempt did not complete on the available hardware.
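
The NaN failure mode for normal can be reproduced in isolation with NumPy. This is a minimal sketch of the mechanism only, assuming an L2 normalization with no epsilon guard; the exported graph may differ in detail, and `l2_normalize` is a hypothetical stand-in for the head's final step.

```python
import numpy as np


def l2_normalize(v):
    """Divide v by its L2 norm in v's own dtype, with no epsilon guard."""
    norm = np.sqrt(np.sum(v * v))
    with np.errstate(divide="ignore", invalid="ignore"):
        return v / norm


tiny32 = np.full(3, 1e-8, dtype=np.float32)
# 1e-8 lies below fp16's smallest subnormal (about 6e-8), so the cast rounds to 0.
tiny16 = tiny32.astype(np.float16)

out32 = l2_normalize(tiny32)  # finite unit vector
out16 = l2_normalize(tiny16)  # 0 / 0 in every component

print(out32, out16)
```

In fp32 the small vector still normalizes to a unit vector, while in fp16 the input underflows to zero and the division yields NaN.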

	## Inference

	```python
	import numpy as np
	import onnxruntime as ort
	from huggingface_hub import hf_hub_download

	# Download the .onnx graph and its .onnx.data sidecar into the same directory
	for fn in ("seg/seg_0.4b_fp16.onnx", "seg/seg_0.4b_fp16.onnx.data"):
	    hf_hub_download(repo_id="WeReCooking/sapiens2-onnx", filename=fn, local_dir=".")

	sess = ort.InferenceSession("seg/seg_0.4b_fp16.onnx", providers=["CPUExecutionProvider"])
	# The input is an (N, 3, 1024, 768) fp32 tensor: BGR, mean-subtracted.
	# Placeholder zeros keep the snippet runnable; substitute a real preprocessed image.
	preprocessed = np.zeros((1, 3, 1024, 768), dtype=np.float32)
	out = sess.run(None, {"input": preprocessed})
	```
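
The snippet above expects an already preprocessed tensor. A rough sketch of that step follows, with several assumptions that this README does not confirm: the layout is NCHW with height 1024 and width 768, the image has already been resized to that size, no division by a standard deviation is applied, and the channel means are the common ImageNet values in BGR order. `ASSUMED_BGR_MEAN` and `to_model_input` are hypothetical names; check the source Space for the actual values.

```python
import numpy as np

# ASSUMPTION: ImageNet channel means in BGR order. The README does not list
# the actual values, so treat these as placeholders.
ASSUMED_BGR_MEAN = (103.53, 116.28, 123.675)


def to_model_input(image_bgr, mean=ASSUMED_BGR_MEAN):
    """Convert a (1024, 768, 3) uint8 BGR image to a (1, 3, 1024, 768) fp32 tensor."""
    if image_bgr.shape != (1024, 768, 3):
        raise ValueError(f"expected shape (1024, 768, 3), got {image_bgr.shape}")
    # Subtract the per-channel mean; broadcasting applies it along the last axis.
    x = image_bgr.astype(np.float32) - np.asarray(mean, dtype=np.float32)
    # HWC -> CHW, then add the batch dimension.
    return np.ascontiguousarray(x.transpose(2, 0, 1)[None])
```

The result can be passed directly as the `"input"` value in `sess.run`.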

	For a standalone CLI that needs neither the sapiens2 package nor PyTorch, see `app.py onnx ...` in the source Space `WeReCooking/sapiens2-cpu`.

	## License

	Same as upstream `facebook/sapiens2-*`.