# mlx-community/sapiens2-seg-0.4b-6bit
MLX port of facebook/sapiens2-seg-0.4b, quantized to 6-bit affine precision (group_size=64) and converted with mlx-vlm.

Sapiens2 is a family of human-centric vision transformers pretrained on 1B human images. This repo pairs the segmentation head with the Sapiens2-0.4b backbone.
## Install

```shell
pip install -U mlx-vlm
```
## Usage — body-part segmentation (29 classes)
```python
from pathlib import Path

import numpy as np
from PIL import Image

from mlx_vlm.utils import load_model
from mlx_vlm.models.sapiens2.processing_sapiens2 import Sapiens2Processor
from mlx_vlm.models.sapiens2.generate import Sapiens2Predictor

model = load_model(Path("mlx-community/sapiens2-seg-0.4b-6bit"))
processor = Sapiens2Processor.from_pretrained("mlx-community/sapiens2-seg-0.4b-6bit")
predictor = Sapiens2Predictor(model, processor)

result = predictor.predict(Image.open("person.jpg"))
# result.mask:       (orig_h, orig_w) int32 class indices
# result.seg_logits: (29, H_out, W_out) raw logits

print("active classes:", np.unique(result.mask).tolist())
Image.fromarray(result.mask.astype(np.uint8)).save("mask.png")
```
Output: a dense segmentation over the 29-class DOME body-part scheme (face, hair, torso, arms and legs split left/right, etc.).
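The raw `mask.png` above stores class indices, which look nearly black when viewed directly. A minimal sketch for visualizing the result as a color overlay; the palette and blend factor are my own choices, not part of this repo:

```python
import numpy as np
from PIL import Image

def colorize_mask(mask: np.ndarray, num_classes: int = 29, seed: int = 0) -> Image.Image:
    """Map integer class indices to a fixed random RGB palette (class 0 stays black)."""
    rng = np.random.default_rng(seed)
    palette = rng.integers(0, 256, size=(num_classes, 3), dtype=np.uint8)
    palette[0] = 0  # background
    return Image.fromarray(palette[mask])

def overlay(image: Image.Image, mask: np.ndarray, alpha: float = 0.5) -> Image.Image:
    """Alpha-blend the colorized mask over the source image."""
    color = colorize_mask(mask).resize(image.size, Image.NEAREST)
    return Image.blend(image.convert("RGB"), color, alpha)
```

For example, `overlay(Image.open("person.jpg"), result.mask).save("overlay.png")` writes a half-transparent overlay at the original resolution.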
## Convert your own checkpoint
```shell
# 1. Stage a float32 MLX directory from the Facebook checkpoint
python -m mlx_vlm.models.sapiens2.convert \
    --hf-repo facebook/sapiens2-seg-0.4b \
    --out ./sapiens2-seg-0.4b-fp32-mlx \
    --dtype float32

# 2. Quantize + upload via the main mlx_vlm.convert CLI
python -m mlx_vlm.convert \
    --hf-path ./sapiens2-seg-0.4b-fp32-mlx \
    --mlx-path ./sapiens2-seg-0.4b-6bit \
    --quantize --q-bits 6 --q-group-size 64 --q-mode affine \
    --upload-repo mlx-community/sapiens2-seg-0.4b-6bit
```
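As a back-of-envelope check of what `--q-bits 6 --q-group-size 64` buys you, assuming one fp16 scale and one fp16 bias per 64-weight group (the usual affine layout; the exact per-group overhead is an assumption, not read from this repo):

```python
def effective_bits(q_bits: int = 6, group_size: int = 64,
                   scale_bits: int = 16, bias_bits: int = 16) -> float:
    """Bits per weight = quantized bits + per-group (scale + bias) overhead."""
    return q_bits + (scale_bits + bias_bits) / group_size

params = 0.4e9                     # nominal "0.4b" parameter count
bits = effective_bits()            # 6 + 32/64 = 6.5 bits/weight
size_gb = params * bits / 8 / 1e9  # ~0.33 GB, vs ~1.6 GB at float32
print(f"{bits} bits/weight -> ~{size_gb:.2f} GB")
```

So the 6-bit repack is roughly a 4.9x reduction over the float32 staging directory from step 1.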
## Architecture
Sapiens2 backbone: a ViT with 2-D RoPE (rotary embeddings computed in bf16), partial GQA (full MHA in the first and last 8 blocks, half the KV heads in the middle blocks), SwiGLU FFN, and a CLS token plus 8 storage tokens. Default input: 1024 × 768 (H × W), patch size 16, ImageNet normalization applied on the [0, 255] scale.
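The partial-GQA layout above can be sketched as a per-block KV-head schedule. The depth and head counts below are illustrative placeholders, not the actual 0.4b config:

```python
def kv_heads_per_block(depth: int, n_heads: int, full_edge: int = 8) -> list[int]:
    """Full MHA (KV heads == query heads) in the first and last `full_edge`
    blocks; half the KV heads in the middle blocks."""
    return [n_heads if i < full_edge or i >= depth - full_edge else n_heads // 2
            for i in range(depth)]
```

For a hypothetical 24-block, 16-head backbone, `kv_heads_per_block(24, 16)` gives 16 KV heads for blocks 0–7 and 16–23 and 8 for blocks 8–15, halving KV-cache and projection cost in the middle of the network.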
See the mlx-vlm sapiens2 port for implementation details.
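A sketch of the "ImageNet normalization on the [0, 255] scale" preprocessing, assuming the standard ImageNet statistics; the resize filter and the absence of padding are assumptions — check the processor in the port for the exact pipeline:

```python
import numpy as np
from PIL import Image

# Standard ImageNet mean/std expressed on the [0, 255] scale
IMAGENET_MEAN = np.array([123.675, 116.28, 103.53], dtype=np.float32)
IMAGENET_STD = np.array([58.395, 57.12, 57.375], dtype=np.float32)

def preprocess(img: Image.Image, size_hw: tuple[int, int] = (1024, 768)) -> np.ndarray:
    """Resize to (H, W), normalize per channel, return a CHW float32 array."""
    h, w = size_hw
    arr = np.asarray(img.convert("RGB").resize((w, h), Image.BILINEAR), dtype=np.float32)
    arr = (arr - IMAGENET_MEAN) / IMAGENET_STD
    return arr.transpose(2, 0, 1)  # HWC -> CHW
```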
## License
Weights released under the Sapiens2 License; this MLX repackaging inherits that license.
## Citation

```bibtex
@article{khirodkarsapiens2,
  title   = {Sapiens2},
  author  = {Khirodkar, Rawal and Wen, He and Martinez, Julieta and Dong, Yuan
             and Su, Zhaoen and Saito, Shunsuke},
  journal = {arXiv preprint arXiv:2604.21681},
  year    = {2026}
}
```
Base model: facebook/sapiens2-pretrain-0.4b