# CLIPSeg - Fine-tuned for Drywall QA

Fine-tuned version of `CIDAS/clipseg-rd64-refined` for text-conditioned binary segmentation of drywall defects.
## Supported Prompts

| Prompt | Target Region | Val mIoU | Val Dice |
|---|---|---|---|
| `segment crack` | Wall cracks | 0.7352 | 0.8336 |
| `segment taping area` | Joint / tape seam | 0.4985 | 0.6256 |
## Training Details

| Setting | Value |
|---|---|
| Base model | `CIDAS/clipseg-rd64-refined` |
| Epochs | 20 |
| Batch size | 4 |
| Learning rate | 1e-4 (AdamW) |
| Scheduler | CosineAnnealingLR |
| Loss | BCE 0.5 + Dice 0.5 |
| Image size | 352 × 352 |
| Threshold | 0.5 |
| Seed | 42 |
| Hardware | Tesla T4 (Google Colab) |
| Train time | ~65.3 min |
| Avg inference | 13.0 ms / image |
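The `BCE 0.5 + Dice 0.5` loss row denotes an equal-weight sum of binary cross-entropy and Dice loss on the predicted masks. A minimal sketch of such a combined loss (the helper below is my own illustration, not code from this repo):

```python
import torch
import torch.nn.functional as F

def bce_dice_loss(logits, targets, eps=1e-6):
    """Equal-weight BCE + Dice loss; `targets` are 0/1 float masks."""
    # BCE on raw logits (numerically stabler than sigmoid + BCE).
    bce = F.binary_cross_entropy_with_logits(logits, targets)
    # Soft Dice on sigmoid probabilities, computed per mask then averaged.
    probs = torch.sigmoid(logits)
    inter = (probs * targets).sum(dim=(-2, -1))
    denom = probs.sum(dim=(-2, -1)) + targets.sum(dim=(-2, -1))
    dice = 1 - (2 * inter + eps) / (denom + eps)
    return 0.5 * bce + 0.5 * dice.mean()
```

The exact reduction and epsilon used during training may differ; this only illustrates the 0.5/0.5 weighting named in the table.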
## Datasets
## Quick Usage

```python
import torch
from PIL import Image
from transformers import CLIPSegProcessor, CLIPSegForImageSegmentation

processor = CLIPSegProcessor.from_pretrained("S-4-G-4-R/clipseg-drywall-qa")
model = CLIPSegForImageSegmentation.from_pretrained("S-4-G-4-R/clipseg-drywall-qa")
model.eval()

image = Image.open("your_image.jpg").convert("RGB")
prompt = "segment crack"

inputs = processor(
    text=prompt, images=image,
    return_tensors="pt", padding=True,
)

with torch.no_grad():
    logits = model(**inputs).logits  # (1, 352, 352)

# Threshold sigmoid probabilities at 0.5, the same cutoff used in training.
mask = (torch.sigmoid(logits[0]) > 0.5).numpy()
```
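The model predicts at 352 × 352, so for overlays the mask usually needs resizing back to the source resolution. A minimal sketch, assuming bilinear upsampling of the logits before thresholding (the helper name is my own, not part of this repo):

```python
import torch
import torch.nn.functional as F

def upsample_mask(logits_352, size_hw, threshold=0.5):
    """Resize 352x352 logits to the original (H, W), then threshold."""
    x = logits_352.reshape(1, 1, *logits_352.shape[-2:])  # (1, 1, 352, 352)
    x = F.interpolate(x, size=size_hw, mode="bilinear", align_corners=False)
    return torch.sigmoid(x[0, 0]) > threshold

# e.g.: full_mask = upsample_mask(logits[0], (image.height, image.width))
```

Interpolating the logits (rather than the thresholded mask) keeps the resized boundary smooth.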
## Test Results (best checkpoint, epoch 15)

| Metric | `segment crack` | `segment taping area` |
|---|---|---|
| mIoU | 0.6900 (test) / 0.7352 (val) | 0.4985 (val) |
| Dice | 0.7957 (test) / 0.8336 (val) | 0.6256 (val) |
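For context, per-image IoU and Dice can be computed from boolean predicted and ground-truth masks as below; this is a generic sketch (the helper is my own), and the exact averaging behind the reported mIoU/Dice is not specified here.

```python
import torch

def iou_and_dice(pred, target, eps=1e-6):
    """Binary IoU and Dice for two boolean masks of the same shape."""
    pred, target = pred.bool(), target.bool()
    inter = (pred & target).sum().float()
    union = (pred | target).sum().float()
    psum = pred.sum().float() + target.sum().float()
    iou = (inter + eps) / (union + eps)
    dice = (2 * inter + eps) / (psum + eps)
    return iou.item(), dice.item()
```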