---
license: mit
tags:
  - pytorch
  - computer-vision
  - medical-imaging
  - multi-task-learning
  - classification
  - regression
  - single-cell
  - microscopy
  - white-blood-cell
datasets:
  - BSCCM
  - BCCD
language:
  - en
pipeline_tag: image-classification
---

# Single-Cell Phenotyping — Hybrid CNN-ViT Multi-Task Model

**Paper:** [Towards Label-Free Single-Cell Phenotyping Using Multi-Task Learning](https://arxiv.org/abs/2605.14717)  
**Authors:** Saqib Nazir and Ardhendu Behera (Edge Hill University, UK)  
**Conference:** ICPR 2026  
**Code:** [github.com/saqibnaziir/Single-Cell-Phenotyping](https://github.com/saqibnaziir/Single-Cell-Phenotyping)

---

## Model Description

A unified deep learning framework that jointly performs **White Blood Cell (WBC) classification** and **continuous protein-expression regression** from label-free Differential Phase Contrast (DPC) microscopy images — *without fluorescent staining*.

### Architecture

```
Input: (B, 4, 28, 28)  ← 4-channel DPC (Left, Right, Top, Bottom)
    │
    └─ Shared ECA Channel Attention
           ├─ CNN Branch: Stem → Inception×2 + Residual → (B, 196, 192)
           └─ ViT Branch: Patch(4×4) → Transformer×2 → (B, 50, 128)
                  │
                  └─ Cross-Modal Fusion (256-dim, learnable weights)
                         ├─ Task-Specific Refinement
                         └─ Task Gating (bidirectional cross-task exchange)
                                ├─ Classification Head → (B, 3)
                                └─ Regression Head    → (B, 4)
```

- **Parameters:** ~12M
- **FLOPs:** ~0.8 GFLOPs per 28×28 image (real-time capable)
- **Input:** 4-channel DPC images, 28×28 pixels
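
The token counts in the architecture diagram follow from the 28×28 input size. As a rough check, the sketch below assumes the ViT branch uses non-overlapping 4×4 patches plus one prepended class token, and that the CNN branch halves the spatial resolution once before flattening. Both assumptions are inferred from the shapes in the diagram and are not confirmed by the repository code.

```python
# Shape arithmetic for the two branches (assumptions noted inline).
img_size = 28
patch_size = 4

# ViT branch: non-overlapping patches, plus one class token (assumed).
patches_per_side = img_size // patch_size
num_patches = patches_per_side ** 2
vit_tokens = num_patches + 1

# CNN branch: a single 2x spatial downsampling (assumed), then flatten H*W.
cnn_side = img_size // 2
cnn_tokens = cnn_side ** 2

print(f"ViT tokens: {vit_tokens}, CNN tokens: {cnn_tokens}")
```

Under these assumptions the counts match the `(B, 50, 128)` and `(B, 196, 192)` shapes shown in the diagram.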

### Tasks

| Task | Output | Classes / Markers |
|---|---|---|
| WBC Classification | 3 classes | Lymphocyte, Granulocyte, Monocyte |
| Protein Regression | 4 markers | CD45, CD3, CD19, CD14 |

---

## Files in This Repository

| File | Description |
|---|---|
| `bsccm_best_model.pth` | Best checkpoint on BSCCM dataset (91.3% accuracy) |
| `bccd_best_model.pth` | Best checkpoint on BCCD benchmark (93.77% accuracy) |

---

## Performance

### BSCCM Dataset

| Metric | Value |
|---|---|
| WBC Classification Accuracy | **91.3%** |
| Macro F1-Score | **0.92** |
| Pearson r (CD16 regression) | **0.73** |
| RMSE | 0.68 |
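
For reference, Pearson r and RMSE can be computed from paired predictions and fluorescence measurements as sketched below. The function names and the toy values are illustrative only; they are not taken from the repository's evaluation script.

```python
import math

def pearson_r(y_true, y_pred):
    """Pearson correlation between two equal-length sequences."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)

def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

# Toy Z-scored values, for illustration only (not model outputs).
measured = [-1.0, 0.0, 1.0, 2.0]
predicted = [-0.5, 0.0, 0.5, 1.0]

print(f"r = {pearson_r(measured, predicted):.3f}")
print(f"RMSE = {rmse(measured, predicted):.3f}")
```

The two metrics are complementary: Pearson r is insensitive to scale and offset, so predictions that track the measurements perfectly but are compressed in range (as in the toy data) still have a non-zero RMSE.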

### BCCD Benchmark (Classification Only)

| Class | Precision | Recall | F1 |
|---|---|---|---|
| Lymphocyte | 1.000 | 1.000 | 1.000 |
| Granulocyte | 0.889 | 1.000 | 0.941 |
| Monocyte | 1.000 | 0.750 | 0.857 |
| **Macro Avg.** | **0.963** | **0.917** | **0.933** |

Overall BCCD accuracy: **93.77%**
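
The macro-averaged row in the table is the unweighted mean of the per-class values, and each F1 score is the harmonic mean of precision and recall. A short check using the precision and recall figures from the table:

```python
# Per-class (precision, recall) copied from the BCCD table.
per_class = {
    "Lymphocyte":  (1.000, 1.000),
    "Granulocyte": (0.889, 1.000),
    "Monocyte":    (1.000, 0.750),
}

def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

f1_scores = {name: f1(p, r) for name, (p, r) in per_class.items()}

# Macro average: every class counts equally, regardless of its size.
n = len(per_class)
macro_precision = sum(p for p, _ in per_class.values()) / n
macro_recall = sum(r for _, r in per_class.values()) / n
macro_f1 = sum(f1_scores.values()) / n

print(f"{macro_precision:.3f} {macro_recall:.3f} {macro_f1:.3f}")
```

Rounded to three decimals, these values agree with the **Macro Avg.** row.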

---

## Usage

### Installation

```bash
git clone https://github.com/saqibnaziir/Single-Cell-Phenotyping.git
cd Single-Cell-Phenotyping
pip install -r requirements.txt
```

### Load Model

```python
import torch
from huggingface_hub import hf_hub_download
from model import create_model

# Download checkpoint
ckpt_path = hf_hub_download(
    repo_id="saqialii/single-cell-phenotyping",
    filename="bsccm_best_model.pth"
)

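# Create model and load weights.
# Note: weights_only=False unpickles arbitrary Python objects,
# so only load checkpoints from a source you trust.
model = create_model(num_classes=3, num_proteins=4, img_size=28, in_channels=4)
checkpoint = torch.load(ckpt_path, map_location='cpu', weights_only=False)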
model.load_state_dict(checkpoint['model_state_dict'])
model.eval()

print(f"Loaded model — best val accuracy: {checkpoint['best_val_acc']:.2f}%")
```

### Inference

```python
import torch
import torch.nn.functional as F

# Input: (B, 4, 28, 28), a 4-channel DPC image normalised to [-1, 1]
image = torch.rand(1, 4, 28, 28) * 2 - 1  # random placeholder in [-1, 1]; replace with a real image

with torch.no_grad():
    cls_logits, prot_preds = model(image)

# Cell type classification (index 0 selects the first image in the batch)
probs = F.softmax(cls_logits, dim=1)
class_names = ['Lymphocyte', 'Granulocyte', 'Monocyte']
confidence, pred_idx = probs[0].max(dim=0)
predicted_class = class_names[pred_idx.item()]

print(f"Predicted: {predicted_class} ({confidence.item():.1%})")

# Protein expression (Z-scored)
protein_names = ['CD45', 'CD3', 'CD19', 'CD14']
for name, val in zip(protein_names, prot_preds[0].tolist()):
    print(f"  {name}: {val:.3f}")
```
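
Because the regression outputs are Z-scored, each prediction is expressed in standard deviations from the training mean. Mapping them back to raw fluorescence units requires the per-marker mean and standard deviation used during training. The sketch below assumes those statistics are available; `train_stats` and `unstandardise` are hypothetical names, and the numbers are placeholders rather than real BSCCM statistics.

```python
# Hypothetical per-marker (mean, std) from the training set; placeholder values.
train_stats = {
    "CD45": (100.0, 20.0),
    "CD3":  (50.0, 10.0),
    "CD19": (30.0, 5.0),
    "CD14": (40.0, 8.0),
}

def unstandardise(z_scores, stats):
    """Invert Z-scoring per marker: raw = z * std + mean."""
    return {
        name: z * stats[name][1] + stats[name][0]
        for name, z in z_scores.items()
    }

# Example Z-scored predictions, e.g. dict(zip(protein_names, prot_preds[0].tolist())).
z_pred = {"CD45": 0.5, "CD3": -1.0, "CD19": 0.0, "CD14": 2.0}
raw_pred = unstandardise(z_pred, train_stats)

for name, value in raw_pred.items():
    print(f"  {name}: {value:.1f}")
```

If the training statistics are not stored alongside the checkpoint, they would need to be recomputed from the training split before applying this step.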

### Data Preparation, Training, and Evaluation

```bash
# Download BSCCM dataset
pip install bsccm
python -c "from bsccm import download_dataset; download_dataset('./data', mnist=True)"

# Train from scratch
python train.py --data_path ./data/BSCCMNIST --save_dir checkpoints/run1

# Evaluate
python evaluate.py \
    --model_path checkpoints/run1/best_model.pth \
    --data_path ./data/BSCCMNIST \
    --output_dir evaluation_results
```

---

## Dataset

**BSCCM** (Berkeley Single Cell Computational Microscopy):
- 7,889 single-cell DPC images at 28×28 pixels
- 3 WBC classes, with test-split counts: Lymphocyte (456), Granulocyte (736), Monocyte (226)
- 4 protein markers measured by fluorescence: CD45, CD3, CD19, CD14
- Source: [Waller-Lab/BSCCM](https://github.com/Waller-Lab/BSCCM)
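
The test-split counts listed for BSCCM imply a moderately imbalanced class distribution, which is a common reason to report macro F1 alongside accuracy. A quick calculation of the class proportions from those counts:

```python
# Test-split counts from the BSCCM description.
counts = {"Lymphocyte": 456, "Granulocyte": 736, "Monocyte": 226}

total = sum(counts.values())
proportions = {name: n / total for name, n in counts.items()}

for name, frac in proportions.items():
    print(f"{name}: {counts[name]} / {total} = {frac:.1%}")
```

Granulocytes make up roughly half of the test split, while monocytes are the smallest class.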

**BCCD** (Blood Cell Images):
- ~12,000 RGB images at 128×128 pixels (4-class labels, mapped to 3 classes)
- Source: [Kaggle: Blood Cell Images](https://www.kaggle.com/datasets/paultimothymooney/blood-cells)

---

## Citation

```bibtex
@inproceedings{nazir2026label,
  title     = {Towards Label-Free Single-Cell Phenotyping Using Multi-Task Learning},
  author    = {Nazir, Saqib and Behera, Ardhendu},
  booktitle = {Proceedings of the International Conference on Pattern Recognition (ICPR)},
  year      = {2026},
  note      = {arXiv:2605.14717}
}
```

```bibtex
@article{pinkard2024berkeley,
  title   = {Berkeley Single Cell Computational Microscopy Dataset},
  author  = {Pinkard, Henry and others},
  journal = {arXiv preprint arXiv:2402.06191},
  year    = {2024}
}
```

---

## License

[MIT License](https://github.com/saqibnaziir/Single-Cell-Phenotyping/blob/main/LICENSE) — Saqib Nazir, Ardhendu Behera, Edge Hill University, 2026