U-Net for Gastrointestinal Polyp Segmentation

Architecture

Standard U-Net with:

Encoder: 4 levels (64, 128, 256, 512 channels), each with two Conv3x3+BN+ReLU blocks followed by MaxPool2x2
Bottleneck: 1024 channels at 16x16 spatial resolution
Decoder: 4 levels mirroring the encoder, using ConvTranspose2d for learned upsampling + skip connections via concatenation
Output: 1x1 Conv producing a single-channel binary mask

Input: 3x256x256 RGB image -> Output: 1x256x256 segmentation mask
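The layout above can be sketched in PyTorch roughly as follows. This is a minimal illustration, not the released implementation: the class names `DoubleConv` and `UNet`, the `padding=1` choice (which keeps spatial size unchanged inside each level), and the default convolution biases are assumptions. The final 1x1 convolution here returns raw logits; applying a sigmoid and thresholding would give the binary mask.

```python
import torch
import torch.nn as nn


class DoubleConv(nn.Sequential):
    """Two Conv3x3 + BatchNorm + ReLU blocks (hypothetical helper name)."""

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )


class UNet(nn.Module):
    """Sketch of a 4-level U-Net; widths follow the model card."""

    def __init__(self, in_ch=3, out_ch=1, widths=(64, 128, 256, 512)):
        super().__init__()
        self.pool = nn.MaxPool2d(2)

        # Encoder: one DoubleConv per level, each followed by 2x2 max pooling.
        self.encoders = nn.ModuleList()
        ch = in_ch
        for w in widths:
            self.encoders.append(DoubleConv(ch, w))
            ch = w

        # Bottleneck doubles the channel count (512 -> 1024).
        self.bottleneck = DoubleConv(ch, ch * 2)
        ch = ch * 2

        # Decoder: transposed conv halves channels and doubles resolution,
        # then a DoubleConv fuses the concatenated skip connection.
        self.upsamplers = nn.ModuleList()
        self.decoders = nn.ModuleList()
        for w in reversed(widths):
            self.upsamplers.append(nn.ConvTranspose2d(ch, w, kernel_size=2, stride=2))
            self.decoders.append(DoubleConv(2 * w, w))
            ch = w

        self.head = nn.Conv2d(ch, out_ch, kernel_size=1)

    def forward(self, x):
        skips = []
        for enc in self.encoders:
            x = enc(x)
            skips.append(x)  # saved before pooling, reused by the decoder
            x = self.pool(x)
        x = self.bottleneck(x)
        for up, dec, skip in zip(self.upsamplers, self.decoders, reversed(skips)):
            x = up(x)
            x = dec(torch.cat([skip, x], dim=1))
        return self.head(x)  # raw logits, same height and width as the input
```

With a 256x256 input, the four pooling steps reduce the resolution to 128, 64, 32, and then 16, which is where the 16x16 bottleneck comes from. Calling `UNet()(torch.randn(1, 3, 256, 256))` should return a tensor of shape `(1, 1, 256, 256)`.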

Loss Function

BCE + Dice Loss: Binary Cross-Entropy provides smooth per-pixel gradients, while Dice Loss directly optimizes mask overlap and is less sensitive to class imbalance (polyps typically occupy a small fraction of the image compared with background).
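One common way to combine the two terms is sketched below. The model card does not state the weighting, the smoothing constant, or whether the network outputs logits, so the equal weighting, `smooth=1.0`, the per-sample soft Dice, and the class name `BCEDiceLoss` are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class BCEDiceLoss(nn.Module):
    """BCE plus soft Dice loss on raw logits (hypothetical helper)."""

    def __init__(self, smooth: float = 1.0):
        super().__init__()
        self.smooth = smooth  # assumed value; avoids division by zero on empty masks

    def forward(self, logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        # Per-pixel term, computed on logits for numerical stability.
        bce = F.binary_cross_entropy_with_logits(logits, targets)

        # Overlap term: soft Dice per sample over channel and spatial dims.
        probs = torch.sigmoid(logits)
        dims = (1, 2, 3)
        intersection = (probs * targets).sum(dim=dims)
        denom = probs.sum(dim=dims) + targets.sum(dim=dims)
        dice = (2 * intersection + self.smooth) / (denom + self.smooth)

        # Equal weighting of the two terms is an assumption.
        return bce + (1 - dice).mean()
```

Here `targets` is expected to be a float tensor of zeros and ones with the same shape as `logits`, for example `(N, 1, 256, 256)`.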

Training

20 epochs, batch size 8, learning rate 1e-4 (Adam)
Trained on the Kvasir-SEG dataset (gastrointestinal polyp segmentation)
Best checkpoint selected by lowest validation loss
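The training setup above could look roughly like the loop below. It is a sketch under stated assumptions: the function name `fit`, its arguments, and the way the best weights are kept in memory are hypothetical, and the data loaders are assumed to yield `(image, mask)` batches (batch size 8 in the setup described here).

```python
import copy

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def fit(model, train_loader, val_loader, loss_fn, epochs=20, lr=1e-4, device="cpu"):
    """Train with Adam and keep the weights with the lowest validation loss."""
    model.to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    best_val, best_state = float("inf"), None

    for _ in range(epochs):
        # Training pass.
        model.train()
        for images, masks in train_loader:
            images, masks = images.to(device), masks.to(device)
            optimizer.zero_grad()
            loss = loss_fn(model(images), masks)
            loss.backward()
            optimizer.step()

        # Validation pass: sample-weighted mean loss over the whole loader.
        model.eval()
        total, count = 0.0, 0
        with torch.no_grad():
            for images, masks in val_loader:
                images, masks = images.to(device), masks.to(device)
                total += loss_fn(model(images), masks).item() * images.size(0)
                count += images.size(0)
        val_loss = total / count

        # Checkpoint selection by validation loss.
        if val_loss < best_val:
            best_val = val_loss
            best_state = copy.deepcopy(model.state_dict())

    return best_val, best_state
```

The returned state dict can then be written to disk with `torch.save(best_state, path)`, where `path` is whatever checkpoint location is in use.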

Parameters

~31M parameters
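The figure can be checked by hand from the layer shapes. The short calculation below assumes convolutions with bias terms and BatchNorm layers with a learnable scale and shift; neither detail is stated in the model card, and the function names are made up for this example. Under those assumptions the total comes to about 31.0 million.

```python
def double_conv_params(c_in: int, c_out: int) -> int:
    """Parameters in two Conv3x3 + BN blocks (conv bias assumed present)."""
    conv1 = c_in * c_out * 9 + c_out
    conv2 = c_out * c_out * 9 + c_out
    batch_norms = 2 * (2 * c_out)  # scale and shift per channel, two BN layers
    return conv1 + conv2 + batch_norms


def up_conv_params(c_in: int, c_out: int) -> int:
    """Parameters in a 2x2 ConvTranspose2d with bias."""
    return c_in * c_out * 4 + c_out


def unet_params(in_ch: int = 3, out_ch: int = 1, widths=(64, 128, 256, 512)) -> int:
    total = 0
    ch = in_ch
    for w in widths:                      # encoder levels
        total += double_conv_params(ch, w)
        ch = w
    total += double_conv_params(ch, 2 * ch)  # bottleneck, 512 -> 1024
    ch = 2 * ch
    for w in reversed(widths):            # decoder levels
        total += up_conv_params(ch, w)
        total += double_conv_params(2 * w, w)  # input is skip + upsampled
        ch = w
    total += ch * out_ch + out_ch         # final 1x1 convolution
    return total


print(f"{unet_params() / 1e6:.1f}M parameters")
```

Most of the weight sits in the deepest layers: the bottleneck alone accounts for roughly 14 million of the parameters.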
