---
license: other
license_name: sapiens2-license
license_link: https://github.com/facebookresearch/sapiens2/blob/main/LICENSE.md
library_name: mlx
tags:
- mlx
- sapiens2
- vision
- human-centric
- seg
pipeline_tag: image-to-image
base_model:
- facebook/sapiens2-seg-0.4b
---
# mlx-community/sapiens2-seg-0.4b-4bit

MLX port of [`facebook/sapiens2-seg-0.4b`](https://huggingface.co/facebook/sapiens2-seg-0.4b) at **4-bit affine (group_size=64)** precision, converted with [mlx-vlm](https://github.com/Blaizzy/mlx-vlm).

Sapiens2 is a family of human-centric ViTs pretrained on 1B human images. This
repo contains the **seg** head paired with the Sapiens2-0.4b backbone.
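
To make "4-bit affine (group_size=64)" concrete: every run of 64 weights shares one scale and one bias, and each weight is stored as an integer from 0 to 15. The sketch below follows the usual min/max affine formulation in plain NumPy. It is an illustration only, not the kernel MLX runs, and the function names are hypothetical. The bits-per-weight figure assumes a 16-bit scale and a 16-bit bias per group.

```python
import numpy as np

def quantize_affine(w, bits=4, group_size=64):
    """Quantize a 1-D float array group-wise so that w ~= scale * q + bias."""
    levels = 2**bits - 1                      # 15 integer steps for 4-bit
    groups = w.reshape(-1, group_size)        # one row per group of weights
    w_min = groups.min(axis=1, keepdims=True)
    w_max = groups.max(axis=1, keepdims=True)
    scale = (w_max - w_min) / levels
    scale = np.where(scale == 0, 1.0, scale)  # guard groups with constant values
    q = np.clip(np.round((groups - w_min) / scale), 0, levels).astype(np.uint8)
    return q, scale, w_min                    # the group minimum acts as the bias

def dequantize_affine(q, scale, bias):
    """Reconstruct approximate float weights from integers, scales and biases."""
    return (q * scale + bias).reshape(-1)

rng = np.random.default_rng(0)
w = rng.normal(size=256).astype(np.float32)   # stand-in for a weight row
q, scale, bias = quantize_affine(w)
w_hat = dequantize_affine(q, scale, bias)

max_err = float(np.abs(w - w_hat).max())
# Assumption: scale and bias are each stored in 16 bits per group of 64
bits_per_weight = 4 + (16 + 16) / 64
print(q.shape, max_err, bits_per_weight)
```

The rounding error of each weight is bounded by half of its group's scale, so smaller groups track the weights more closely at the cost of more scale and bias overhead.
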

## Install

```bash
pip install -U mlx-vlm
```

## Usage: body-part segmentation (29 classes)

```python
from pathlib import Path

import numpy as np
from PIL import Image

from mlx_vlm.utils import load_model
from mlx_vlm.models.sapiens2.processing_sapiens2 import Sapiens2Processor
from mlx_vlm.models.sapiens2.generate import Sapiens2Predictor

# Load the quantized weights and the matching image processor
model = load_model(Path("mlx-community/sapiens2-seg-0.4b-4bit"))
processor = Sapiens2Processor.from_pretrained("mlx-community/sapiens2-seg-0.4b-4bit")
predictor = Sapiens2Predictor(model, processor)

result = predictor.predict(Image.open("person.jpg"))
# result.mask        (orig_h, orig_w)     int32 class indices
# result.seg_logits  (29, H_out, W_out)   raw logits

print("active classes:", np.unique(result.mask).tolist())
# Class indices 0..28 fit in uint8, so the mask is saved as an 8-bit PNG
Image.fromarray(result.mask.astype(np.uint8)).save("mask.png")
```

Output: a dense 29-class body-part segmentation map (DOME 29-class scheme: face,
hair, torso, arms and legs split left/right, etc.).
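
Because the saved mask holds raw class indices from 0 to 28, it looks almost black in an image viewer. One way to inspect it is to map each index to a distinct color. The sketch below is a minimal example with an arbitrary palette: `make_palette` and `colorize_mask` are hypothetical helpers, not part of mlx-vlm, and treating index 0 as background is an assumption to check against the label map.

```python
import colorsys

import numpy as np

NUM_CLASSES = 29  # class count from this model card; the colors are arbitrary

def make_palette(num_classes=NUM_CLASSES):
    """Return a (num_classes, 3) uint8 table of distinct RGB colors."""
    palette = np.zeros((num_classes, 3), dtype=np.uint8)
    # Index 0 stays black (assumed to be background; verify against the label map)
    for idx in range(1, num_classes):
        hue = (idx - 1) / (num_classes - 1)
        r, g, b = colorsys.hsv_to_rgb(hue, 0.85, 1.0)
        palette[idx] = (int(r * 255), int(g * 255), int(b * 255))
    return palette

def colorize_mask(mask, palette):
    """Map an (H, W) integer class mask to an (H, W, 3) uint8 RGB image."""
    if mask.min() < 0 or mask.max() >= len(palette):
        raise ValueError("mask contains a class index outside the palette")
    return palette[mask]  # integer-array indexing looks up one color per pixel

# Stand-in for result.mask so the sketch runs without the model
demo_mask = np.array([[0, 1, 1], [2, 2, 28]], dtype=np.int32)
color = colorize_mask(demo_mask, make_palette())
print(color.shape, color.dtype)
```

With a real prediction, `Image.fromarray(colorize_mask(result.mask, make_palette())).save("mask_color.png")` would write the colored map, assuming `result.mask` is a NumPy array as the comments in the usage example indicate.
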

## Convert your own checkpoint

```bash
# 1. Stage a float32 MLX directory from the Facebook checkpoint
python -m mlx_vlm.models.sapiens2.convert \
    --hf-repo facebook/sapiens2-seg-0.4b \
    --out ./sapiens2-seg-0.4b-fp32-mlx \
    --dtype float32

# 2. Quantize + upload via the main mlx_vlm.convert CLI
python -m mlx_vlm.convert \
    --hf-path ./sapiens2-seg-0.4b-fp32-mlx \
    --mlx-path ./sapiens2-seg-0.4b-4bit \
    --quantize --q-bits 4 --q-group-size 64 --q-mode affine \
    --upload-repo mlx-community/sapiens2-seg-0.4b-4bit
```

## Architecture

Sapiens2 backbone: 2-D RoPE ViT (bf16 rope), partial GQA (full MHA in the
first/last 8 blocks, KV-half for the middle), SwiGLU FFN, cls + 8 storage
tokens. Default input: **1024 × 768 (H × W)**, patch size 16, ImageNet
normalization on the [0, 255] scale.
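
The input numbers above pin down the sequence length the backbone sees, and "ImageNet normalization on the [0, 255] scale" means the mean and standard deviation are pre-multiplied by 255 so pixels are not divided by 255 first. The sketch below works through both. It assumes the cls and storage tokens are concatenated with the patch tokens, and that the standard ImageNet statistics are the ones used; check the processor config for the actual values.

```python
# Values stated in the architecture paragraph
HEIGHT, WIDTH = 1024, 768
PATCH = 16
NUM_CLS, NUM_STORAGE = 1, 8

grid_h, grid_w = HEIGHT // PATCH, WIDTH // PATCH  # 64 x 48 patch grid
num_patches = grid_h * grid_w                     # 3072 patch tokens
# Assumption: cls and storage tokens join the patch tokens in one sequence
seq_len = NUM_CLS + NUM_STORAGE + num_patches     # 3081 tokens

# Standard ImageNet mean/std on the 0..1 scale, multiplied by 255 (assumed)
MEAN_255 = [255 * m for m in (0.485, 0.456, 0.406)]
STD_255 = [255 * s for s in (0.229, 0.224, 0.225)]

def normalize_pixel(rgb):
    """Normalize one RGB pixel given as 0..255 values, with no /255 step."""
    return [(c - m) / s for c, m, s in zip(rgb, MEAN_255, STD_255)]

print(grid_h, grid_w, num_patches, seq_len)
print([round(m, 3) for m in MEAN_255])
print([round(v, 3) for v in normalize_pixel((128, 128, 128))])
```
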

See the [mlx-vlm sapiens2 port](https://github.com/Blaizzy/mlx-vlm/tree/main/mlx_vlm/models/sapiens2) for implementation details.
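
The partial GQA layout from the architecture paragraph can be pictured as a per-block schedule of KV head counts. The sketch below reads "KV-half" as half as many KV heads as query heads, which is an interpretation rather than something this card states outright. The function name, `depth=24`, and `num_heads=16` are hypothetical placeholders, not the 0.4b model's real configuration.

```python
def kv_heads_per_block(depth, num_heads, full_blocks=8):
    """Return the number of KV heads used by each transformer block.

    The first and last `full_blocks` blocks use full MHA (one KV head per
    query head); the blocks in between use half as many KV heads, so each
    KV head is shared by two query heads.
    """
    if num_heads % 2:
        raise ValueError("num_heads must be even to halve the KV heads")
    layout = []
    for i in range(depth):
        is_edge = i < full_blocks or i >= depth - full_blocks
        layout.append(num_heads if is_edge else num_heads // 2)
    return layout

# Hypothetical sizes for illustration only; read the real ones from config.json
layout = kv_heads_per_block(depth=24, num_heads=16)
print(layout)
```

Halving the KV heads in the middle blocks shrinks the key and value projections there while the outer blocks keep full attention capacity.
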

## License

Weights released under the [Sapiens2 License](https://github.com/facebookresearch/sapiens2/blob/main/LICENSE.md); this MLX repackaging inherits that license.

## Citation

```bibtex
@article{khirodkarsapiens2,
  title   = {Sapiens2},
  author  = {Khirodkar, Rawal and Wen, He and Martinez, Julieta and Dong, Yuan
             and Su, Zhaoen and Saito, Shunsuke},
  journal = {arXiv preprint arXiv:2604.21681},
  year    = {2026}
}
```