rookiexiong/SetCon-8B

Image Segmentation

feature-extraction

referring-segmentation

video-segmentation

vision-language


rookiexiong committed 1 day ago

Commit 2ebb24a · verified · 1 Parent(s): 7986555

Create README.md

Files changed (1)

README.md +48 -0

README.md ADDED

@@ -0,0 +1,48 @@

+---
+license: apache-2.0
+language:
+  - en
+library_name: transformers
+pipeline_tag: image-segmentation
+tags:
+  - referring-segmentation
+  - image-segmentation
+  - video-segmentation
+  - vision-language
+---
+# SetCon-8B
+SetCon-8B is the model checkpoint for **SetCon: Towards Open-Ended Referring Segmentation via Set-Level Concept Prediction**.
+## Usage
+Please use this checkpoint together with the official codebase:
+```bash
+git clone https://github.com/rookiexiong7/SetCon.git
+cd SetCon
+uv sync --extra latest
+source .venv/bin/activate
+```
+Single-image inference:
+```bash
+python demo.py \
+  --image-path assets/room.jpg \
+  --query-text "the target objects" \
+  --model-path path/to/SetCon-8B
+```
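Reviewer note (not part of the committed README): the demo command expects a local `--model-path`, but the README does not say how to obtain the checkpoint. A minimal sketch of one way to do it follows. It assumes `huggingface_hub` is available in the synced environment (it normally is wherever `transformers` is installed); the `checkpoints/SetCon-8B` directory and the two helper function names are hypothetical choices for illustration, not something the SetCon codebase prescribes.

```bash
#!/usr/bin/env bash
# Sketch only. REPO_ID is the repository shown on this page;
# MODEL_DIR is an assumed local layout, any writable path works.
REPO_ID="rookiexiong/SetCon-8B"
MODEL_DIR="checkpoints/SetCon-8B"

# Download all checkpoint files into MODEL_DIR using the
# huggingface_hub Python API (snapshot_download).
fetch_checkpoint() {
  python -c "from huggingface_hub import snapshot_download; snapshot_download(repo_id='${REPO_ID}', local_dir='${MODEL_DIR}')"
}

# Run the README's single-image demo against the downloaded checkpoint.
# Optional arguments: $1 = image path, $2 = query text.
# Must be called from the root of the SetCon clone.
run_demo() {
  python demo.py \
    --image-path "${1:-assets/room.jpg}" \
    --query-text "${2:-the target objects}" \
    --model-path "${MODEL_DIR}"
}
```

After sourcing the snippet inside the activated environment, `fetch_checkpoint && run_demo` would reproduce the single-image example, and `run_demo path/to/image.jpg "some expression"` would run it on another input.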
+## Intended Use
+This model is intended for research on open-ended referring image/video segmentation.
+## Limitations
+The model may produce incomplete or inaccurate masks for ambiguous expressions, small objects, crowded scenes, or out-of-domain visual
+concepts.
+## Citation