rawalkhirodkar committed · Commit 05e6313 (verified) · 1 Parent(s): 15cbd76

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +93 -0 (ADDED)
@@ -0,0 +1,93 @@
+ ---
+ license: other
+ license_name: sapiens2-license
+ license_link: https://github.com/facebookresearch/sapiens2/blob/main/LICENSE.md
+ pipeline_tag: depth-estimation
+ library_name: sapiens
+ base_model: facebook/sapiens2-pretrain-0.4b
+ tags:
+ - sapiens
+ - sapiens2
+ - human-centric
+ - normal
+ ---
+
+ # Sapiens2-0.4B-Surface
+
+ Per-pixel surface-normal estimation (3-channel unit vectors in camera frame).
+
+ This repository contains the **0.4B Surface Normal Estimation** checkpoint, finetuned from the [Sapiens2-0.4B pretrained backbone](https://huggingface.co/facebook/sapiens2-pretrain-0.4b).
+
+ - 📄 **Paper:** [OpenReview (ICLR 2026)](https://openreview.net/pdf?id=IVAlYCqdvW)
+ - 🌐 **Project Page:** [rawalkhirodkar.github.io/sapiens2](https://rawalkhirodkar.github.io/sapiens2)
+ - 💻 **Code:** [github.com/facebookresearch/sapiens2](https://github.com/facebookresearch/sapiens2)
+
+ ## Model Details
+
+ - **Developed by:** Meta
+ - **Model type:** Vision Transformer + Surface Normal Estimation head
+ - **License:** [Sapiens2 License](https://github.com/facebookresearch/sapiens2/blob/main/LICENSE.md)
+ - **Task:** normal
+ - **Base model:** [facebook/sapiens2-pretrain-0.4b](https://huggingface.co/facebook/sapiens2-pretrain-0.4b)
+ - **Format:** safetensors
+ - **File:** `sapiens2_0.4b_normal.safetensors`
+
+ ## Quick Start
+
+ Install the [Sapiens2 repo](https://github.com/facebookresearch/sapiens2) (`pip install -e .`), download the checkpoint, and run the demo:
+
+ ```bash
+ # 1. Download the checkpoint to $SAPIENS_CHECKPOINT_ROOT/normal/
+ hf download facebook/sapiens2-normal-0.4b sapiens2_0.4b_normal.safetensors \
+     --local-dir ~/sapiens2_host/normal
+
+ # 2. Run the demo (edit INPUT, OUTPUT, and MODEL_NAME inside the script)
+ cd $SAPIENS_ROOT/sapiens/dense
+ ./scripts/demo/normal.sh
+ ```
+
+ See the [Surface Normal Estimation guide](https://github.com/facebookresearch/sapiens2/blob/main/docs/NORMAL.md) for details on inputs, outputs, and visualization options.
+
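The README describes the output as 3-channel unit vectors in the camera frame. As a rough illustration of what that implies for post-processing, the sketch below renormalizes a per-pixel normal map and maps it to an RGB image. This is not the repository's visualization code: the function names, the `(H, W, 3)` array layout, the `(n + 1) / 2` color mapping, and the sign of the z axis are all assumptions made for this example, so check the linked guide for the conventions the demo actually uses.

```python
import numpy as np


def normalize_normals(normals: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Rescale each pixel's (x, y, z) vector to unit length.

    `normals` is assumed to have shape (H, W, 3). Zero vectors stay zero
    because the divisor is clamped to `eps`.
    """
    length = np.linalg.norm(normals, axis=-1, keepdims=True)
    return normals / np.maximum(length, eps)


def normals_to_rgb(normals: np.ndarray) -> np.ndarray:
    """Map unit normals from [-1, 1] to uint8 RGB in [0, 255].

    This (n + 1) / 2 mapping is a common convention, assumed here.
    """
    unit = normalize_normals(normals)
    return np.round((unit + 1.0) * 0.5 * 255.0).astype(np.uint8)


# Hypothetical stand-in for a model prediction: every pixel points along +z.
# The vectors are deliberately not unit length to exercise the normalization.
pred = np.zeros((4, 3, 3), dtype=np.float32)
pred[..., 2] = 2.0

rgb = normals_to_rgb(pred)
print(rgb.shape, rgb.dtype)
```

Renormalizing before visualization is a defensive choice: a network output is only approximately unit length, and resizing a normal map by interpolation shortens the vectors further.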
+ ## Model Card
+
+ | Field | Value |
+ |-------|-------|
+ | Architecture | Sapiens2 ViT backbone + Surface Normal Estimation head |
+ | Backbone parameters | 0.398 B |
+ | Backbone FLOPs | 1.260 T |
+ | Embedding dim | 1024 |
+ | Layers | 24 |
+ | Attention heads | 16 |
+ | Inference resolution | 1024 × 768 (H × W) |
+ | Patch size | 16 |
+
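The inference resolution and patch size in the table above determine how many patch tokens the backbone processes per image. The sketch below works that out, assuming a standard ViT patch embedding with non-overlapping patches and ignoring any extra tokens (class or register tokens) the model may add. The README does not state these details, and `patch_grid` is a hypothetical helper, not part of the Sapiens2 code.

```python
def patch_grid(height: int, width: int, patch: int) -> tuple[int, int]:
    """Return (rows, cols) of non-overlapping patches covering the image."""
    if height % patch or width % patch:
        raise ValueError("image size must be a multiple of the patch size")
    return height // patch, width // patch


# Values taken from the Model Card table: 1024 x 768 (H x W), patch size 16.
rows, cols = patch_grid(1024, 768, 16)
num_tokens = rows * cols
print(f"{rows} x {cols} patches = {num_tokens} tokens")
```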
+ ### Sapiens2-Surface Family
+
+ | Model | Params | FLOPs | Embed dim | Layers | Heads |
+ |-------|--------|-------|-----------|--------|-------|
+ | **Sapiens2-0.4B** *(this)* | 0.398 B | 1.260 T | 1024 | 24 | 16 |
+ | [Sapiens2-0.8B](https://huggingface.co/facebook/sapiens2-normal-0.8b) | 0.818 B | 2.592 T | 1280 | 32 | 16 |
+ | [Sapiens2-1B](https://huggingface.co/facebook/sapiens2-normal-1b) | 1.462 B | 4.715 T | 1536 | 40 | 24 |
+ | [Sapiens2-5B](https://huggingface.co/facebook/sapiens2-normal-5b) | 5.071 B | 15.722 T | 2432 | 56 | 32 |
+
+ See the [Sapiens2 Collection](https://huggingface.co/collections/facebook/sapiens2) for all variants and other downstream task checkpoints.
+
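To compare the variants in the family table, it can help to derive a few quantities from the listed columns. The sketch below computes the per-head dimension and the parameter and FLOP counts relative to the 0.4B model. The `Variant` class is hypothetical, and the per-head dimension assumes the usual multi-head attention layout where the embedding is split evenly across heads, which the README does not confirm. All numbers are copied from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    params_b: float  # backbone parameters, in billions
    flops_t: float   # backbone FLOPs, in trillions
    embed_dim: int
    layers: int
    heads: int

    @property
    def head_dim(self) -> int:
        # Assumes the embedding is split evenly across attention heads.
        return self.embed_dim // self.heads


FAMILY = [
    Variant("Sapiens2-0.4B", 0.398, 1.260, 1024, 24, 16),
    Variant("Sapiens2-0.8B", 0.818, 2.592, 1280, 32, 16),
    Variant("Sapiens2-1B", 1.462, 4.715, 1536, 40, 24),
    Variant("Sapiens2-5B", 5.071, 15.722, 2432, 56, 32),
]

baseline = FAMILY[0]
for v in FAMILY:
    print(
        f"{v.name}: head_dim={v.head_dim}, "
        f"params x{v.params_b / baseline.params_b:.2f}, "
        f"FLOPs x{v.flops_t / baseline.flops_t:.2f}"
    )
```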
+ ## Intended Use
+
+ - Surface Normal Estimation on human-centric imagery
+ - Research on human-centric vision
+
+ ## License
+
+ Released under the [Sapiens2 License](https://github.com/facebookresearch/sapiens2/blob/main/LICENSE.md).
+
+ ## Citation
+
+ ```bibtex
+ @inproceedings{khirodkar2026sapiens2,
+ title={Sapiens2},
+ author={Khirodkar, Rawal and Wen, He and Martinez, Julieta and Dong, Yuan and Zhaoen, Su and Saito, Shunsuke},
+ booktitle={International Conference on Learning Representations (ICLR)},
+ year={2026}
+ }
+ ```