---
license: apache-2.0
language:
- en
library_name: transformers
pipeline_tag: image-segmentation
tags:
- referring-segmentation
- image-segmentation
- video-segmentation
- vision-language
---
# SetCon-8B
SetCon-8B is the model checkpoint for **SetCon: Towards Open-Ended Referring Segmentation via Set-Level Concept Prediction**.
[\[📂 GitHub\]](https://github.com/rookiexiong7/SetCon)
[\[📄 Paper\]](https://arxiv.org/abs/2605.20110)
## Usage
Please use this checkpoint together with the official codebase:
```bash
git clone https://github.com/rookiexiong7/SetCon.git
cd SetCon
uv sync --extra latest
source .venv/bin/activate
```
Single-image inference:
```bash
python demo.py \
--image-path assets/room.jpg \
--query-text "the target objects" \
--model-path path/to/SetCon-8B
```
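To segment several images with the same query, the single-image command above can be wrapped in a small shell function. This is only a sketch: the function name `run_setcon_batch`, the `DRY_RUN` switch, and the `MODEL_PATH`/`QUERY_TEXT` variables are hypothetical helpers, not part of the codebase, and it assumes `demo.py` processes one image per call with exactly the flags shown above. How `demo.py` names its outputs is not covered here, so check that results from successive runs do not overwrite each other.

```bash
# Defaults mirror the single-image example; override them via the environment.
MODEL_PATH="${MODEL_PATH:-path/to/SetCon-8B}"
QUERY_TEXT="${QUERY_TEXT:-the target objects}"

# Hypothetical helper: call demo.py once per image path given as an argument.
# With DRY_RUN=1 the commands are printed instead of executed.
run_setcon_batch() {
  local image
  for image in "$@"; do
    local cmd=(python demo.py
      --image-path "$image"
      --query-text "$QUERY_TEXT"
      --model-path "$MODEL_PATH")
    if [[ "${DRY_RUN:-0}" == "1" ]]; then
      printf '%s\n' "${cmd[*]}"
    else
      "${cmd[@]}"
    fi
  done
}
```

For example, `DRY_RUN=1 run_setcon_batch assets/*.jpg` lists the commands that would run for every JPEG in `assets/`; dropping `DRY_RUN=1` executes them one after another.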
## Intended Use
This model is intended for research on open-ended referring image/video segmentation.
## Limitations
The model may produce incomplete or inaccurate masks for ambiguous expressions, small objects, crowded scenes, or out-of-domain visual concepts.
## Citation
If you find our work helpful for your research, please consider starring the repository ⭐ and citing the paper 📝:
```bibtex
@article{zhang2026setcon,
title={SetCon: Towards Open-Ended Referring Segmentation via Set-Level Concept Prediction},
author={Zhixiong Zhang and Yizhuo Li and Shuangrui Ding and Yuhang Zang and Shengyuan Ding and Long Xing and Yibin Wang and Qiaosheng Zhang and Jiaqi Wang},
journal={arXiv preprint arXiv:2605.20110},
year={2026}
}
```