---
license: mit
tags:
- pytorch
- computer-vision
- medical-imaging
- multi-task-learning
- classification
- regression
- single-cell
- microscopy
- white-blood-cell
datasets:
- BSCCM
- BCCD
language:
- en
pipeline_tag: image-classification
---
# Single-Cell Phenotyping — Hybrid CNN-ViT Multi-Task Model
- **Paper:** [Towards Label-Free Single-Cell Phenotyping Using Multi-Task Learning](https://arxiv.org/abs/2605.14717)
- **Authors:** Saqib Nazir and Ardhendu Behera (Edge Hill University, UK)
- **Conference:** ICPR 2026
- **Code:** [github.com/saqibnaziir/Single-Cell-Phenotyping](https://github.com/saqibnaziir/Single-Cell-Phenotyping)
---
## Model Description
This model is a unified deep learning framework that jointly performs **White Blood Cell (WBC) classification** and **continuous protein-expression regression** from label-free Differential Phase Contrast (DPC) microscopy images, *without fluorescent staining*.
### Architecture
```
Input: (B, 4, 28, 28)  ← 4-channel DPC (Left, Right, Top, Bottom)
  └─ Shared ECA Channel Attention
       ├─ CNN Branch: Stem → Inception×2 + Residual → (B, 196, 192)
       └─ ViT Branch: Patch(4×4) → Transformer×2 → (B, 50, 128)
            └─ Cross-Modal Fusion (256-dim, learnable weights)
                 ├─ Task-Specific Refinement
                 └─ Task Gating (bidirectional cross-task exchange)
                      ├─ Classification Head → (B, 3)
                      └─ Regression Head → (B, 4)
```
- **Parameters:** ~12M
- **FLOPs:** ~0.8 GFLOPs per 28×28 image (real-time capable)
- **Input:** 4-channel DPC images, 28×28 pixels
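The token counts in the architecture diagram follow from simple shape arithmetic. The sketch below reproduces them under two assumptions that this card does not state: the ViT branch prepends one class token to its patch sequence, and the CNN branch downsamples with an overall stride of 2. The helper names are illustrative and are not part of the repository.

```python
def vit_tokens(img_size: int, patch_size: int, cls_token: bool = True) -> int:
    """Token count for non-overlapping square patches (+1 if a class token is assumed)."""
    if img_size % patch_size:
        raise ValueError("img_size must be divisible by patch_size")
    per_side = img_size // patch_size
    return per_side * per_side + (1 if cls_token else 0)


def cnn_tokens(img_size: int, total_stride: int) -> int:
    """Number of spatial positions left after downsampling by total_stride (assumed)."""
    per_side = img_size // total_stride
    return per_side * per_side


print(vit_tokens(28, 4))  # 7 * 7 patches + 1 class token = 50
print(cnn_tokens(28, 2))  # 14 * 14 positions = 196
```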
### Tasks
| Task | Output | Classes / Markers |
|---|---|---|
| WBC Classification | 3 classes | Lymphocyte, Granulocyte, Monocyte |
| Protein Regression | 4 markers | CD45, CD3, CD19, CD14 |
---
## Files in This Repository
| File | Description |
|---|---|
| `bsccm_best_model.pth` | Best checkpoint on BSCCM dataset (91.3% accuracy) |
| `bccd_best_model.pth` | Best checkpoint on BCCD benchmark (93.77% accuracy) |
---
## Performance
### BSCCM Dataset
| Metric | Value |
|---|---|
| WBC Classification Accuracy | **91.3%** |
| Macro F1-Score | **0.92** |
| Pearson r (CD16 regression) | **0.73** |
| RMSE | 0.68 |
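For reference, the two regression metrics in the table can be computed as follows. This is a minimal sketch of the standard definitions of Pearson's r and RMSE; the card does not say how the values were aggregated across markers or cells, and the toy numbers are made up for illustration.

```python
import math


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Toy values, not taken from the BSCCM evaluation
y_true = [0.0, 1.0, 2.0, 3.0]
y_pred = [0.5, 1.0, 1.5, 3.0]
print(f"r = {pearson_r(y_true, y_pred):.3f}, RMSE = {rmse(y_true, y_pred):.3f}")
```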
### BCCD Benchmark (Classification Only)
| Class | Precision | Recall | F1 |
|---|---|---|---|
| Lymphocyte | 1.000 | 1.000 | 1.000 |
| Granulocyte | 0.889 | 1.000 | 0.941 |
| Monocyte | 1.000 | 0.750 | 0.857 |
| **Macro Avg.** | **0.963** | **0.917** | **0.933** |

Overall BCCD accuracy: **93.77%**
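The F1 and macro-average columns can be re-derived from the per-class precision and recall in the BCCD table. The short check below uses the standard formula F1 = 2PR / (P + R) and an unweighted mean over the three classes; the variable names are illustrative.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# (precision, recall) per class, copied from the BCCD table
per_class = {
    "Lymphocyte": (1.000, 1.000),
    "Granulocyte": (0.889, 1.000),
    "Monocyte": (1.000, 0.750),
}

f1_scores = {name: f1(p, r) for name, (p, r) in per_class.items()}
n = len(per_class)
macro_p = sum(p for p, _ in per_class.values()) / n
macro_r = sum(r for _, r in per_class.values()) / n
macro_f1 = sum(f1_scores.values()) / n

for name, score in f1_scores.items():
    print(f"{name}: F1 = {score:.3f}")  # 1.000, 0.941, 0.857
print(f"Macro: P = {macro_p:.3f}, R = {macro_r:.3f}, F1 = {macro_f1:.3f}")  # 0.963, 0.917, 0.933
```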
---
## Usage
### Installation
```bash
git clone https://github.com/saqibnaziir/Single-Cell-Phenotyping.git
cd Single-Cell-Phenotyping
pip install -r requirements.txt
```
### Load Model
```python
import torch
from huggingface_hub import hf_hub_download
# model.py ships with the cloned repository; run this from the repository root
from model import create_model
# Download checkpoint
ckpt_path = hf_hub_download(
    repo_id="saqialii/single-cell-phenotyping",
    filename="bsccm_best_model.pth"
)
# Create model and load weights
model = create_model(num_classes=3, num_proteins=4, img_size=28, in_channels=4)
# weights_only=False unpickles arbitrary Python objects, so only load checkpoints you trust
checkpoint = torch.load(ckpt_path, map_location='cpu', weights_only=False)
model.load_state_dict(checkpoint['model_state_dict'])
model.eval()
print(f"Loaded model — best val accuracy: {checkpoint['best_val_acc']:.2f}%")
```
### Inference
```python
import torch
import torch.nn.functional as F
# Input: (B, 4, 28, 28), a 4-channel DPC image normalised to [-1, 1]
image = torch.randn(1, 4, 28, 28)  # placeholder; replace with a real image
with torch.no_grad():
    cls_logits, prot_preds = model(image)
# Cell type classification (first sample in the batch)
probs = F.softmax(cls_logits, dim=1)
class_names = ['Lymphocyte', 'Granulocyte', 'Monocyte']
predicted_class = class_names[probs[0].argmax().item()]
confidence = probs[0].max().item()
print(f"Predicted: {predicted_class} ({confidence:.1%})")
# Protein expression (Z-scored)
protein_names = ['CD45', 'CD3', 'CD19', 'CD14']
for name, val in zip(protein_names, prot_preds[0].tolist()):
    print(f"  {name}: {val:.3f}")
```
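The regression head returns Z-scored values, so mapping a prediction back to the original fluorescence scale needs the mean and standard deviation that were used for normalisation. This card does not say whether those statistics are stored in the checkpoint, so the sketch below fits them on a hypothetical list of training values for a single marker; all names and numbers are illustrative.

```python
from statistics import mean, pstdev


def fit_zscore(values):
    """Return (mean, population std) of the training values for one marker."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        raise ValueError("standard deviation is zero; cannot Z-score")
    return mu, sigma


def to_z(x, mu, sigma):
    """Raw value -> Z-score."""
    return (x - mu) / sigma


def from_z(z, mu, sigma):
    """Z-score -> raw value (inverse of to_z)."""
    return z * sigma + mu


# Hypothetical training-set fluorescence values for one marker
train_values = [120.0, 150.0, 180.0, 210.0, 240.0]
mu, sigma = fit_zscore(train_values)

z_pred = 0.5  # stands in for one entry of prot_preds[0]
print(f"Predicted raw expression: {from_z(z_pred, mu, sigma):.2f}")
```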
### Data Preparation, Training, and Evaluation
```bash
# Download BSCCM dataset
pip install bsccm
python -c "from bsccm import download_dataset; download_dataset('./data', mnist=True)"
# Train from scratch
python train.py --data_path ./data/BSCCMNIST --save_dir checkpoints/run1
# Evaluate
python evaluate.py \
--model_path checkpoints/run1/best_model.pth \
--data_path ./data/BSCCMNIST \
--output_dir evaluation_results
```
---
## Dataset
**BSCCM** (Berkeley Single Cell Computational Microscopy):
- 7,889 single-cell DPC images at 28×28 pixels
- 3 WBC classes, with test-split counts: Lymphocyte (456), Granulocyte (736), Monocyte (226)
- 4 protein markers measured by fluorescence: CD45, CD3, CD19, CD14
- Source: [Waller-Lab/BSCCM](https://github.com/Waller-Lab/BSCCM)

**BCCD** (Blood Cell Images):
- ~12,000 RGB images at 128×128 pixels (original 4 classes mapped to the 3 classes above)
- Source: [Kaggle: Blood Cell Images](https://www.kaggle.com/datasets/paultimothymooney/blood-cells)
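The card does not spell out how the four BCCD classes are mapped to three. One plausible mapping, shown below as an assumption rather than the authors' confirmed procedure, merges the two granulocyte subtypes (eosinophils and neutrophils) into a single Granulocyte class. The folder-style label names are assumed from the Kaggle dataset layout, and the index order follows the `class_names` list in the inference example.

```python
# Assumed mapping: eosinophils and neutrophils are both granulocytes
BCCD_TO_WBC = {
    "EOSINOPHIL": "Granulocyte",
    "NEUTROPHIL": "Granulocyte",
    "LYMPHOCYTE": "Lymphocyte",
    "MONOCYTE": "Monocyte",
}

# Same order as class_names in the inference example
CLASS_NAMES = ["Lymphocyte", "Granulocyte", "Monocyte"]


def bccd_label_to_index(label: str) -> int:
    """Map an original BCCD label to the 3-class index (hypothetical helper)."""
    return CLASS_NAMES.index(BCCD_TO_WBC[label.upper()])


for label in BCCD_TO_WBC:
    print(f"{label} -> {bccd_label_to_index(label)}")
```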
---
## Citation
```bibtex
@inproceedings{nazir2026label,
title = {Towards Label-Free Single-Cell Phenotyping Using Multi-Task Learning},
author = {Nazir, Saqib and Behera, Ardhendu},
booktitle = {Proceedings of the International Conference on Pattern Recognition (ICPR)},
year = {2026},
note = {arXiv:2605.14717}
}
```
```bibtex
@article{pinkard2024berkeley,
title = {Berkeley Single Cell Computational Microscopy Dataset},
author = {Pinkard, Henry and others},
journal = {arXiv preprint arXiv:2402.06191},
year = {2024}
}
```
---
## License
[MIT License](https://github.com/saqibnaziir/Single-Cell-Phenotyping/blob/main/LICENSE) — Saqib Nazir, Ardhendu Behera, Edge Hill University, 2026