How to use from the Transformers library
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-segmentation", model="rookiexiong/SetCon-8B", trust_remote_code=True)

# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("rookiexiong/SetCon-8B", trust_remote_code=True, dtype="auto")

SetCon-8B

SetCon-8B is the model checkpoint for SetCon: Towards Open-Ended Referring Segmentation via Set-Level Concept Prediction.

[📂 GitHub] [📄 Paper]

Usage

Please use this checkpoint together with the official codebase:

git clone https://github.com/rookiexiong7/SetCon.git
cd SetCon
uv sync --extra latest
source .venv/bin/activate

Single-image inference:

python demo.py \
  --image-path assets/room.jpg \
  --query-text "the target objects" \
  --model-path path/to/SetCon-8B
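
The `--model-path` argument expects a local copy of the checkpoint. The following is a minimal sketch, not part of the official codebase, of one way to fetch the files with `huggingface_hub.snapshot_download` and then launch `demo.py` with the same three flags shown above. The helper names (`download_checkpoint`, `build_demo_command`, `run_demo`) and the `checkpoints/SetCon-8B` directory are illustrative assumptions, and the sketch assumes `huggingface_hub` is installed and that it is run from the root of the cloned `SetCon` repository.

```python
import subprocess
import sys


def download_checkpoint(local_dir: str = "checkpoints/SetCon-8B") -> str:
    """Fetch the checkpoint files from the Hub and return the local directory.

    The default local_dir is an arbitrary choice made for this sketch.
    """
    # Imported lazily so build_demo_command works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id="rookiexiong/SetCon-8B", local_dir=local_dir)


def build_demo_command(image_path: str, query_text: str, model_path: str) -> list[str]:
    """Build the demo.py invocation using only the flags shown in the command above."""
    return [
        sys.executable, "demo.py",
        "--image-path", image_path,
        "--query-text", query_text,
        "--model-path", model_path,
    ]


def run_demo(image_path: str, query_text: str) -> None:
    """Download the checkpoint, then run single-image inference."""
    model_path = download_checkpoint()
    # check=True raises CalledProcessError if demo.py exits with a non-zero status.
    subprocess.run(build_demo_command(image_path, query_text, model_path), check=True)
```

Calling `run_demo("assets/room.jpg", "the target objects")` would then reproduce the command above once the download has finished.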

Intended Use

This model is intended for research on open-ended referring image/video segmentation.

Limitations

The model may produce incomplete or inaccurate masks for ambiguous expressions, small objects, crowded scenes, or out-of-domain visual concepts.

Citation

If you find our work helpful for your research, please consider giving a star ⭐ and citation 📝

@article{zhang2026setcon,
  title={SetCon: towards open-ended referring segmentation via set-level concept prediction},
  author={Zhixiong Zhang and Yizhuo Li and Shuangrui Ding and Yuhang Zang and Shengyuan Ding and Long Xing and Yibin Wang and Qiaosheng Zhang and Jiaqi Wang},
  journal={arXiv preprint arXiv:2605.20110},
  year={2026}
}