# 🛣️ Road & Lane Semantic Segmentation (U-Net with ResNet-50 Encoder)

## Model Description
This model performs multi-class semantic segmentation for driving scenes, focusing on:
- Background
- Drivable road area
- Lane markings
It is designed as a perception module similar to those used in ADAS and autonomous driving pipelines, where structured lane visualization is derived directly from segmentation outputs (no classical lane detection or Hough transform is used).
The model follows a U-Net–style encoder–decoder architecture with a ResNet-50 backbone pre-trained for feature extraction.
## Architecture
- Encoder: ResNet-50 (pre-trained)
- Decoder: U-Net–style upsampling path with skip connections
- Output: Softmax over 3 classes:
  - 0: Background
  - 1: Road
  - 2: Lane markings
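The encoder–decoder wiring can be sketched in Keras as follows. This is a minimal illustration, not the released training code: the skip-tap layer names (`conv1_relu`, `conv2_block3_out`, …) are the standard Keras ResNet-50 layer names, the decoder filter widths are assumptions, and `weights=None` keeps the sketch offline, whereas the actual model uses a pre-trained encoder.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_unet_resnet50(input_shape=(256, 256, 3), num_classes=3):
    # ResNet-50 encoder (weights=None here so the sketch runs offline;
    # the released model uses a pre-trained backbone)
    backbone = tf.keras.applications.ResNet50(
        include_top=False, weights=None, input_shape=input_shape
    )
    # Standard Keras layer names used as skip-connection taps
    skip_names = ["conv1_relu", "conv2_block3_out",
                  "conv3_block4_out", "conv4_block6_out"]
    skips = [backbone.get_layer(n).output for n in skip_names]
    x = backbone.get_layer("conv5_block3_out").output  # 8x8 bottleneck

    # U-Net-style decoder: upsample, concatenate the skip, convolve
    for skip, filters in zip(reversed(skips), [256, 128, 64, 32]):
        x = layers.UpSampling2D()(x)
        x = layers.Concatenate()([x, skip])
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)

    x = layers.UpSampling2D()(x)  # back to full input resolution
    outputs = layers.Conv2D(num_classes, 1, activation="softmax")(x)
    return tf.keras.Model(backbone.input, outputs)

model = build_unet_resnet50()
print(model.output_shape)  # (None, 256, 256, 3)
```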
## Intended Use
This model can be used for:
- Academic research in road scene understanding
- ADAS perception experiments
- Lane visualization systems based on segmentation
- Educational projects in computer vision and deep learning
## Limitations

- Not intended for real-world autonomous driving deployment
- Performance may degrade under:
  - Night conditions
  - Heavy rain or fog
  - Unusual camera angles
- Fine-tuning is recommended for different countries, road types, or camera setups
## Repository

Full training, fine-tuning, and inference pipelines are available in the accompanying GitHub repository.
## Training Data

The model was trained on a custom, enhanced dataset, originally based on:

- Semantic Segmentation Makassar (IDN) Road Dataset (~374 labeled images)
### Dataset Enhancements
To improve generalization, the dataset was expanded using:
- Random rotations
- Horizontal flipping
- Brightness and contrast jitter
Final dataset size: ~1,496 images
The dataset is publicly available and not included in this repository due to size constraints.
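The augmentations above can be sketched in numpy. The exact rotation angles and jitter ranges used in training are not documented, so the ranges below are assumptions, and rotation is simplified to 90° steps; the key point is that geometric transforms must be applied to both image and mask, while photometric jitter touches the image only.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(image, mask):
    """One random augmentation pass. `image` is float32 HxWx3 in [0, 1],
    `mask` is an HxW array of class IDs."""
    # Horizontal flip: mirror both image and mask together
    if rng.random() < 0.5:
        image, mask = image[:, ::-1], mask[:, ::-1]
    # Random rotation, simplified here to 90-degree steps
    k = int(rng.integers(0, 4))
    image, mask = np.rot90(image, k), np.rot90(mask, k)
    # Brightness/contrast jitter: image only, labels untouched
    alpha = rng.uniform(0.8, 1.2)   # contrast (assumed range)
    beta = rng.uniform(-0.1, 0.1)   # brightness (assumed range)
    image = np.clip(alpha * image + beta, 0.0, 1.0)
    return image, mask

img = rng.random((256, 256, 3), dtype=np.float32)
msk = rng.integers(0, 3, (256, 256))
aug_img, aug_msk = augment(img, msk)
print(aug_img.shape, aug_msk.shape)  # (256, 256, 3) (256, 256)
```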
## Training Procedure

- Loss Function: Sparse Categorical Crossentropy
- Optimizer: Adam
- Learning Rate: 1e-4
- Epochs: 20
- Metrics:
  - Sparse Categorical Accuracy
  - Mean IoU
  - Lane-class IoU
Multiple experiments were conducted with different augmentation strategies and training schedules.
Final model selection prioritized lane IoU stability and visual consistency, not only numerical metrics.
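Mean IoU and the lane-class IoU reported below both derive from per-class intersection-over-union. A minimal numpy sketch (the training itself would use Keras metrics such as `tf.keras.metrics.MeanIoU`; the sample arrays here are illustrative):

```python
import numpy as np

def iou_per_class(y_true, y_pred, num_classes=3):
    """Per-class IoU from flattened class-ID arrays."""
    ious = []
    for c in range(num_classes):
        inter = np.sum((y_true == c) & (y_pred == c))
        union = np.sum((y_true == c) | (y_pred == c))
        ious.append(inter / union if union else float("nan"))
    return np.array(ious)

# Tiny illustrative example: 6 pixels, classes 0/1/2
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])

ious = iou_per_class(y_true, y_pred)
mean_iou = np.nanmean(ious)   # average over all classes
lane_iou = ious[2]            # class 2 = lane markings
print(round(mean_iou, 4), lane_iou)  # 0.5 0.5
```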
## Class Mapping
| Class ID | Label |
|---|---|
| 0 | Background |
| 1 | Road |
| 2 | Lane Markings |
## Evaluation Results
| Metric | Training | Validation |
|---|---|---|
| Accuracy | 0.9972 | 0.9954 |
| Mean IoU | 0.9456 | 0.9401 |
| Lane IoU | 0.8517 | 0.8542 |
| Loss | 0.0069 | 0.0146 |
## How to Use
This model is intended to be used with custom inference pipelines.
Typical inference steps:
- Resize input image to model input size
- Normalize pixel values
- Run forward pass
- Apply `argmax` over the softmax output to get class IDs
- Visualize lane pixels or overlay the segmentation mask
### Example (TensorFlow / Keras)
```python
import tensorflow as tf
import cv2
import numpy as np

# -------- Load model --------
model = tf.keras.models.load_model(
    "model_path",
    compile=False
)

# -------- Load & preprocess image --------
img_path = "image_path"
orig = cv2.imread(img_path)
orig = cv2.cvtColor(orig, cv2.COLOR_BGR2RGB)
img = cv2.resize(orig, (256, 256))
img_norm = img / 255.0
img_input = np.expand_dims(img_norm, axis=0)

# -------- Predict --------
pred = model.predict(img_input)
mask = np.argmax(pred[0], axis=-1)  # (256, 256) array of class IDs

# -------- Create color mask --------
# Class mapping: 0 = background, 1 = road, 2 = lane
color_mask = np.zeros((256, 256, 3), dtype=np.uint8)
color_mask[mask == 1] = (255, 0, 0)  # Road -> Red
color_mask[mask == 2] = (0, 255, 0)  # Lane -> Green

# -------- Overlay full segmentation --------
overlay_full = cv2.addWeighted(img.astype(np.uint8), 0.6, color_mask, 0.4, 0)

# -------- Lane-only overlay --------
lane_mask = np.zeros_like(color_mask)
lane_mask[mask == 2] = (0, 255, 0)
overlay_lane = cv2.addWeighted(img.astype(np.uint8), 0.7, lane_mask, 0.3, 0)

# -------- Show results --------
cv2.imshow("Original", cv2.cvtColor(img.astype(np.uint8), cv2.COLOR_RGB2BGR))
cv2.imshow("Segmentation Overlay (Road + Lane)", cv2.cvtColor(overlay_full, cv2.COLOR_RGB2BGR))
cv2.imshow("Lane Only Overlay", cv2.cvtColor(overlay_lane, cv2.COLOR_RGB2BGR))
cv2.waitKey(0)
cv2.destroyAllWindows()
```
For full pipelines, including video processing and lane visualization, see the GitHub repository.
## Fine-Tuning
Fine-tuning is recommended if:
- Using different road environments
- Working with different camera perspectives
- Wanting to rebalance lane vs road classes
Training scripts support:
- Freezing the encoder
- Full model retraining
- Custom datasets with RGB masks converted to class IDs
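The RGB-mask-to-class-ID conversion mentioned above can be sketched as follows. The palette colors here are hypothetical placeholders, not the dataset's documented annotation scheme, and must be replaced with your own mask colors.

```python
import numpy as np

# Hypothetical RGB palette -> class-ID mapping; replace these colors
# with the annotation colors your dataset actually uses.
PALETTE = {
    (0, 0, 0): 0,     # background
    (255, 0, 0): 1,   # road
    (0, 255, 0): 2,   # lane markings
}

def rgb_to_class_ids(rgb_mask):
    """Convert an HxWx3 uint8 RGB mask to an HxW array of class IDs.
    Unmatched colors fall back to class 0 (background)."""
    ids = np.zeros(rgb_mask.shape[:2], dtype=np.uint8)
    for color, cid in PALETTE.items():
        ids[np.all(rgb_mask == color, axis=-1)] = cid
    return ids

# Tiny synthetic mask: row 0 road, row 1 lane, rest background
mask = np.zeros((4, 4, 3), dtype=np.uint8)
mask[0] = (255, 0, 0)
mask[1] = (0, 255, 0)
ids = rgb_to_class_ids(mask)
print(ids[0, 0], ids[1, 0], ids[3, 0])  # 1 2 0
```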
## Ethical Considerations
This model is intended for research and educational purposes only.
It should not be used as a sole perception system in safety-critical or real-world autonomous driving applications.
## License
This model is released under the Apache 2.0 License, allowing commercial and research use with attribution.
## Author

Developed by Yara Elshehawi

Check out my other work:

- Portfolio
- LinkedIn