XPathology — Colon Specialist (EfficientNetB1)

Part of the X-Pathology AI-Assisted Oncology Screening Platform
Developed by Muhammad Hassan · AgenticEra Systems

Overview

XPathology Colon Specialist is a fine-tuned EfficientNetB1 CNN trained to classify colorectal histopathology patches into 9 distinct tissue types from H&E-stained whole-slide images. It is the core classification backbone of the X-Pathology platform, which combines this model with Grad-CAM explainability and Gemini-powered dual clinical/patient reporting.

This model was trained on the NCT-CRC-HE-100K dataset and independently evaluated on the official CRC-VAL-HE-7K holdout set, a separate collection of 7,180 patches from patients not represented in the training data. The holdout results reported below indicate how well the model generalises beyond its training distribution.

Disclaimer: This model is intended for research and educational use only. It is not FDA-approved or CE-marked diagnostic software. All outputs must be verified by a licensed pathologist before any clinical decision-making.

Model Details

Property	Value
Model name	xpathology_colon_specialist_b1
Architecture	EfficientNetB1 (ImageNet pretrained)
Framework	TensorFlow / Keras
Input size	240 × 240 × 3 (RGB, pixel values 0–255)
Output	9-class softmax
Training dataset	NCT-CRC-HE-100K (100,000 patches)
Validation dataset	CRC-VAL-HE-7K (7,180 patches, external holdout)
Temperature (T)	0.5576 (post-hoc calibration applied)
Mixed precision	Yes (float16 compute, float32 weights)
Training hardware	Kaggle T4 × 2 (MirroredStrategy)

Tissue Classes

Index	Class	Description
0	ADI	Adipose tissue
1	BACK	Background / empty slide
2	DEB	Debris / necrosis
3	LYM	Lymphocytes
4	MUC	Mucus
5	MUS	Smooth muscle
6	NORM	Normal colon mucosa
7	STR	Cancer-associated stroma
8	TUM	Colorectal adenocarcinoma epithelium

Performance

Internal Validation (20% split of NCT-CRC-HE-100K)

Metric	Value
Accuracy	99.12%
AUC	0.9998
Top-2 Accuracy	99.90%
Val Loss	0.5086

External Holdout Validation (CRC-VAL-HE-7K)

Evaluated on the official holdout set, which was never seen during training or validation and comes from patients not represented in NCT-CRC-HE-100K.

Metric	Value
Accuracy	92.73%
AUC	0.9922
Macro F1	0.909
Weighted F1	0.929

Per-Class Performance (Holdout)

Class	Precision	Recall	F1	Support
ADI	0.9983	0.8961	0.9445	1,338
BACK	0.9988	1.0000	0.9994	847
DEB	0.7967	0.9941	0.8845	339
LYM	0.9983	0.9543	0.9758	634
MUC	0.9752	0.9517	0.9633	1,035
MUS	0.6940	0.8429	0.7613	592
NORM	0.9228	0.9838	0.9523	741
STR	0.7452	0.7363	0.7407	421
TUM	0.9829	0.9303	0.9558	1,233
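The aggregate holdout metrics can be cross-checked against the per-class table. The sketch below copies the table values by hand and recomputes the macro F1, the support-weighted F1, and the accuracy (as support-weighted recall). The variable names are illustrative and not taken from the training code.

```python
# Per-class holdout results copied from the table above:
# class -> (precision, recall, F1, support)
per_class = {
    "ADI":  (0.9983, 0.8961, 0.9445, 1338),
    "BACK": (0.9988, 1.0000, 0.9994, 847),
    "DEB":  (0.7967, 0.9941, 0.8845, 339),
    "LYM":  (0.9983, 0.9543, 0.9758, 634),
    "MUC":  (0.9752, 0.9517, 0.9633, 1035),
    "MUS":  (0.6940, 0.8429, 0.7613, 592),
    "NORM": (0.9228, 0.9838, 0.9523, 741),
    "STR":  (0.7452, 0.7363, 0.7407, 421),
    "TUM":  (0.9829, 0.9303, 0.9558, 1233),
}

total_support = sum(support for _, _, _, support in per_class.values())

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# Weighted F1: per-class F1 weighted by class support.
weighted_f1 = sum(f1 * s for _, _, f1, s in per_class.values()) / total_support

# Accuracy equals recall weighted by class support.
accuracy = sum(r * s for _, r, _, s in per_class.values()) / total_support

print(f"patches={total_support} macro_f1={macro_f1:.3f} "
      f"weighted_f1={weighted_f1:.3f} accuracy={accuracy:.4f}")
```

The recomputed values agree with the reported 0.909, 0.929, and 92.73% to within rounding of the table entries.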

Note on MUS/STR: Smooth muscle and cancer-associated stroma look similar under H&E at patch level, and the pair is widely reported as hard to separate in the colorectal computational pathology literature. Neither class is tumour epithelium, so MUS↔STR confusion does not change the tumour versus non-tumour call for a patch. The critical TUM class achieves F1 = 0.9558 on the external holdout.

Training Configuration

Architecture

Input (240×240×3)
  └── EfficientNetB1 backbone (ImageNet weights)
        └── GlobalAveragePooling2D
              └── BatchNormalization
                    └── Dropout(0.4)
                          └── Dense(256, relu)
                                └── BatchNormalization
                                      └── Dropout(0.3)
                                            └── Dense(9, softmax, dtype=float32)
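The diagram above could be expressed in Keras roughly as follows. This is a minimal sketch, not the author's training code. The layer names `intermediate_dropout` and `colon_specialist_output` are taken from the usage example in this card. The function name, the other layer defaults, and the way the head is attached to the backbone are assumptions.

```python
import tensorflow as tf

def build_colon_specialist(num_classes=9, input_shape=(240, 240, 3), weights="imagenet"):
    """Hypothetical builder mirroring the architecture diagram."""
    backbone = tf.keras.applications.EfficientNetB1(
        include_top=False, weights=weights, input_shape=input_shape
    )
    x = tf.keras.layers.GlobalAveragePooling2D()(backbone.output)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.Dropout(0.4)(x)
    x = tf.keras.layers.Dense(256, activation="relu")(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.Dropout(0.3, name="intermediate_dropout")(x)
    # The output layer is kept in float32 so the softmax stays numerically
    # stable when the rest of the model runs under a float16 policy.
    outputs = tf.keras.layers.Dense(
        num_classes, activation="softmax", dtype="float32",
        name="colon_specialist_output",
    )(x)
    return tf.keras.Model(backbone.input, outputs)

# weights=None skips the ImageNet download for this illustration only.
model = build_colon_specialist(weights=None)
print(model.output_shape)
```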

Two-Phase Training

Phase 1 — Warm-up (15 epochs)

Backbone fully frozen
Adam optimizer, LR = 1e-3
CategoricalCrossentropy with label smoothing = 0.1
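Label smoothing changes what the loss values mean, so a short worked example may help. The sketch below applies the Keras smoothing rule, `y * (1 - eps) + eps / num_classes`, to a 9-class one-hot target and computes the lowest cross-entropy a model could reach against that smoothed target. The helper names are illustrative.

```python
import math

NUM_CLASSES = 9
LABEL_SMOOTHING = 0.1

def smooth_one_hot(true_index, num_classes=NUM_CLASSES, eps=LABEL_SMOOTHING):
    """Smoothed target: y * (1 - eps) + eps / num_classes."""
    return [
        (1.0 - eps) * (1.0 if i == true_index else 0.0) + eps / num_classes
        for i in range(num_classes)
    ]

def cross_entropy(target, probs):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)

target = smooth_one_hot(8)  # index 8 is TUM in the class table

# The loss is minimised when the prediction equals the smoothed target,
# so the floor is the entropy of the target itself.
loss_floor = cross_entropy(target, target)

print(f"true-class target = {target[8]:.4f}, loss floor = {loss_floor:.4f}")
```

With these settings the loss cannot drop below about 0.485. If the reported Val Loss of 0.5086 was computed with the same smoothed loss (an assumption; the card does not say), it sits only about 0.02 above that floor, which is consistent with the high validation accuracy.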

Phase 2 — Fine-tuning (40 epochs, early stopping)

Top 32% of backbone layers unfrozen
All BatchNorm layers remain frozen (critical for stability)
Adam optimizer, LR = 1e-5, gradient clipping (clipnorm=1.0)
EarlyStopping (patience=6), ReduceLROnPlateau (factor=0.3, patience=3)
Best checkpoint saved on val_loss
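The Phase 2 settings above can be sketched as follows. This illustrates the described recipe under assumptions and is not the original training script. The helper name, the rounding of the 32% cutoff, and the `monitor="val_loss"` choice for both callbacks are hypothetical. A tiny stand-in model is used so the example runs without building EfficientNetB1.

```python
import tensorflow as tf

def prepare_phase2(backbone, fraction=0.32):
    """Unfreeze the top `fraction` of layers but keep every BatchNorm frozen."""
    backbone.trainable = True  # parent must be trainable for child flags to matter
    layers = backbone.layers
    cutoff = int(len(layers) * (1.0 - fraction))
    for index, layer in enumerate(layers):
        is_batchnorm = isinstance(layer, tf.keras.layers.BatchNormalization)
        layer.trainable = index >= cutoff and not is_batchnorm
    return cutoff

# Optimiser and callbacks with the hyperparameters listed above.
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-5, clipnorm=1.0)
callbacks = [
    tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=6),
    tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.3, patience=3),
]

# Stand-in for the backbone: five layers, two of them BatchNorm.
demo = tf.keras.Sequential([
    tf.keras.layers.Dense(4),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dense(4),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dense(2),
])
cutoff = prepare_phase2(demo)
flags = [layer.trainable for layer in demo.layers]
print(cutoff, flags)
```

In the stand-in, the last two layers fall in the unfrozen region, but only the final Dense layer becomes trainable because the BatchNorm layer before it stays frozen.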

Data Augmentation (CPU-side via tf.data)

RandomFlip (horizontal + vertical)
RandomRotation (90° increments)
RandomBrightness (±0.2)
RandomContrast (0.8–1.2)
RandomSaturation (0.8–1.2)
RandomHue (±0.05)
RandomCrop (90–100% of image, resized back)

Augmentations were applied on CPU within the tf.data pipeline (not as Keras layers), keeping VRAM usage efficient for large-scale training.

Confidence Calibration

Post-training temperature scaling was applied by minimising the negative log-likelihood (NLL) on the validation set with a bounded scalar optimiser:

Optimal temperature T = 0.5576
NLL improved after calibration

T < 1 sharpens the softmax, which means the model's raw outputs were under-confident (more diffuse than the observed label distribution). This is consistent with label smoothing during training, which deliberately softens the target distributions.
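To make the calibration step concrete, here is a minimal NumPy sketch of fitting a single temperature by bounded one-dimensional minimisation of the NLL. It uses a hand-written golden-section search as a stand-in; a library routine such as `scipy.optimize.minimize_scalar(..., method="bounded")` would serve the same purpose. The search bounds, the function names, and the synthetic data are all assumptions, not the values used for this model.

```python
import numpy as np

def nll(logits, labels, temperature):
    """Mean negative log-likelihood of softmax(logits / temperature)."""
    scaled = logits / temperature
    scaled = scaled - scaled.max(axis=1, keepdims=True)  # numerical stability
    log_probs = scaled - np.log(np.exp(scaled).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())

def fit_temperature(logits, labels, low=0.05, high=10.0, tol=1e-5):
    """Golden-section search for the T in [low, high] that minimises the NLL."""
    inv_phi = (np.sqrt(5.0) - 1.0) / 2.0
    a, b = low, high
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    fc, fd = nll(logits, labels, c), nll(logits, labels, d)
    while b - a > tol:
        if fc < fd:   # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = nll(logits, labels, c)
        else:         # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = nll(logits, labels, d)
    return (a + b) / 2.0

# Synthetic example: labels follow softmax(true_logits), but the "model"
# emits logits at half that scale, so its softmax is too diffuse.
rng = np.random.default_rng(0)
true_logits = rng.normal(scale=2.0, size=(5000, 9))
labels = np.argmax(true_logits + rng.gumbel(size=true_logits.shape), axis=1)
raw_logits = 0.5 * true_logits

T = fit_temperature(raw_logits, labels)
nll_before = nll(raw_logits, labels, 1.0)
nll_after = nll(raw_logits, labels, T)
print(f"T = {T:.3f}, NLL before = {nll_before:.4f}, after = {nll_after:.4f}")
```

Because the synthetic logits are deliberately under-scaled, the fitted temperature comes out below 1, mirroring the situation described for this model.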

Usage

Installation

pip install "tensorflow>=2.12" numpy opencv-python

Load and Infer

import numpy as np
import tensorflow as tf
import json

# Load model and calibration assets
model = tf.keras.models.load_model('xpathology_colon_specialist_b1.keras')
T     = float(np.ravel(np.load('temperature_value.npy'))[0])   # 0.5576
with open('class_names.json') as f:
    classes = json.load(f)                                      # ordered list of 9 labels

# Feature extractor up to the last dropout layer; the pre-softmax logits are
# recomputed in predict() from the final Dense layer's weights and bias.
logit_model = tf.keras.Model(
    inputs  = model.input,
    outputs = model.get_layer('intermediate_dropout').output
)
final_dense   = model.get_layer('colon_specialist_output')
weights, bias = final_dense.get_weights()

def predict(image_array):
    """
    Args:
        image_array: np.ndarray of shape (240, 240, 3), dtype float32, range [0, 255]
    Returns:
        predicted_class (str), confidence (float), all_probs (dict)
    """
    img      = tf.expand_dims(image_array, axis=0)
    features = logit_model(img, training=False).numpy()
    logits   = features @ weights + bias             # (1, 9) — pre-softmax
    scaled   = logits / T                            # temperature scaling
    exp_l    = np.exp(scaled - scaled.max())
    probs    = (exp_l / exp_l.sum()).flatten()

    pred_idx    = int(np.argmax(probs))
    pred_class  = classes[pred_idx]
    confidence  = float(probs[pred_idx])
    all_probs   = dict(zip(classes, probs.tolist()))

    return pred_class, confidence, all_probs


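# Example (requires opencv-python)
import cv2
img = cv2.imread('patch.png')                 # OpenCV loads images as BGR
if img is None:
    raise FileNotFoundError('patch.png could not be read')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)    # the model expects RGB
img = cv2.resize(img, (240, 240)).astype(np.float32)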

label, conf, probs = predict(img)
print(f"Prediction : {label}")
print(f"Confidence : {conf:.4f}")
print(f"All probs  : {probs}")

Input Requirements

Property	Requirement
Format	H&E stained histopathology patch
Resolution	224×224 or larger (resize to 240×240 before inference, as in the example above)
Color space	RGB
Pixel range	0–255 (do NOT normalize to [0,1])
File formats	PNG, JPEG, TIFF (convert TIFF to PNG first)

Important: EfficientNetB1 performs its own internal normalization. Do not apply rescale=1./255 or any manual normalization before passing the image to the model.
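The requirements above can be checked before inference with a small guard. The sketch below is a hypothetical helper, not part of the released repository. It rejects wrong shapes and out-of-range values, and warns when an image looks as if it was already scaled to [0, 1].

```python
import warnings
import numpy as np

def check_patch(image_array):
    """Validate an RGB patch and return it as float32 in the 0-255 range."""
    arr = np.asarray(image_array)
    if arr.shape != (240, 240, 3):
        raise ValueError(f"expected shape (240, 240, 3), got {arr.shape}")
    arr = arr.astype(np.float32)
    if arr.min() < 0.0 or arr.max() > 255.0:
        raise ValueError("pixel values must lie in [0, 255]")
    if arr.max() <= 1.0:
        # A real H&E patch almost never has every pixel at or below 1,
        # so this usually means the image was divided by 255 upstream.
        warnings.warn("image looks normalised to [0, 1]; pass raw 0-255 values")
    return arr

patch = check_patch(np.full((240, 240, 3), 200, dtype=np.uint8))
print(patch.dtype, patch.shape)
```

Calling `check_patch(img)` just before `predict(img)` would catch the most common preprocessing mistakes early.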

Repository Structure

xpathology-colon-specialist/
├── xpathology_colon_specialist_b1.keras   # Trained model (architecture + weights)
├── temperature_value.npy                  # Calibration temperature T
├── class_names.json                       # Ordered list of 9 class labels
├── training_summary.json                  # Full training metadata
├── training_log.csv                       # Epoch-by-epoch metrics (55 epochs)
└── README.md

Dataset Information

Dataset	Purpose	Images	Classes
NCT-CRC-HE-100K	Training + validation split	100,000	9
CRC-VAL-HE-7K	External holdout evaluation	7,180	9

Both datasets were originally published by Kather et al. and are widely used benchmarks in computational pathology. Images are 224×224 pixel patches at 0.5 µm/pixel (20× magnification), normalised with Macenko stain normalisation in the original release.

Reference:

Kather JN, Halama N, Marx A. 100,000 histological images of human colorectal cancer and healthy tissue. Zenodo. 2018. https://doi.org/10.5281/zenodo.1214456

Limitations

Single organ scope: This model is trained exclusively on colorectal tissue. Do not use it for lung, breast, prostate, or any other organ.
Patch-level only: The model classifies individual 240×240 patches. Slide-level diagnosis requires aggregating predictions across many patches.
Scanner variability: Performance may degrade on slides from scanners or staining protocols whose colour profiles differ significantly from the training distribution. The gap of 6.4 percentage points between internal validation accuracy (99.1%) and external holdout accuracy (92.7%) is consistent with this kind of domain shift.
MUS/STR ambiguity: As noted above, smooth muscle and stromal tissue confusion is an inherent limitation of patch-level classification without spatial context.
Not for clinical use: This model has not undergone regulatory review and must not be used as the sole basis for any clinical or diagnostic decision.
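As an illustration of the patch-level limitation, one simple way to summarise many patch predictions for a slide is to count the top class per patch. The sketch below is a hypothetical aggregation for research exploration only. The function name, the optional confidence filter, and the use of a plain tumour-patch fraction are assumptions, and the result is not a slide-level diagnosis.

```python
import numpy as np

CLASSES = ["ADI", "BACK", "DEB", "LYM", "MUC", "MUS", "NORM", "STR", "TUM"]

def summarise_slide(patch_probs, min_confidence=0.0):
    """Aggregate per-patch probabilities of shape (N, 9) into class fractions."""
    probs = np.asarray(patch_probs, dtype=np.float64)
    top_class = probs.argmax(axis=1)
    keep = probs.max(axis=1) >= min_confidence   # optionally drop uncertain patches
    counts = np.bincount(top_class[keep], minlength=len(CLASSES))
    used = int(keep.sum())
    fractions = counts / used if used else np.zeros(len(CLASSES))
    return {
        "patches_used": used,
        "class_fraction": dict(zip(CLASSES, fractions.tolist())),
        "tumour_fraction": float(fractions[CLASSES.index("TUM")]),
    }

# Four toy patches predicted as TUM, TUM, STR, NORM with full confidence.
toy_probs = np.eye(len(CLASSES))[[8, 8, 7, 6]]
summary = summarise_slide(toy_probs)
print(summary["patches_used"], summary["tumour_fraction"])
```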

X-Pathology Platform

This model is deployed as part of X-Pathology, an AI-assisted oncology screening tool built as an educational and research portfolio project.

Pipeline:

User uploads H&E histopathology patch
EfficientNetB1 classifies tissue type (this model)
Grad-CAM generates attention heatmap
Gemini 2.5 Flash writes a dual report — clinical summary + plain-English patient version

Live demo: x-pathology.vercel.app
GitHub: github.com/Muhammad-Hassan12

Training Reproducibility

Full training code is available in the X-Pathology GitHub repository. Key environment details:

Python          3.12
TensorFlow      2.x
Keras           3.x
CUDA            12.x
Hardware        Kaggle T4 × 2 (MirroredStrategy)
Mixed precision float16 (mixed_float16 policy)
Global batch    64 (32 per replica × 2 GPUs)

License

This model is released under the Apache 2.0 License.

The training datasets (NCT-CRC-HE-100K and CRC-VAL-HE-7K) are released under CC BY 4.0 by Kather et al. Please cite the original dataset authors if you use this model in published work.

Citation

If you use this model in your research or build upon it, please cite:

@misc{hassan2026xpathology,
  author       = {Syed Muhammad Hassan},
  title        = {XPathology Colon Specialist: EfficientNetB1 for 9-Class Colorectal Histopathology},
  year         = {2026},
  publisher    = {Hugging Face},
  url          = {https://huggingface.co/rarfileexe/Xpathology-Colon-Specialist}
}

Built with care by Muhammad Hassan · AgenticEra Systems
BSCS Student · Certified GenAI Developer · Computational Pathology Researcher
