	---
	license: mit
	tags:
	- pytorch
	- computer-vision
	- medical-imaging
	- multi-task-learning
	- classification
	- regression
	- single-cell
	- microscopy
	- white-blood-cell
	datasets:
	- BSCCM
	- BCCD
	language:
	- en
	pipeline_tag: image-classification
	---

	# Single-Cell Phenotyping — Hybrid CNN-ViT Multi-Task Model

	- Paper: [Towards Label-Free Single-Cell Phenotyping Using Multi-Task Learning](https://arxiv.org/abs/2605.14717)
	- Authors: Saqib Nazir and Ardhendu Behera (Edge Hill University, UK)
	- Conference: ICPR 2026
	- Code: [github.com/saqibnaziir/Single-Cell-Phenotyping](https://github.com/saqibnaziir/Single-Cell-Phenotyping)

	---

	## Model Description

	A unified deep learning framework that jointly performs White Blood Cell (WBC) classification and continuous protein-expression regression from label-free Differential Phase Contrast (DPC) microscopy images — without fluorescent staining.

	### Architecture

	```
	Input: (B, 4, 28, 28) ← 4-channel DPC (Left, Right, Top, Bottom)
	│
	└─ Shared ECA Channel Attention
	   ├─ CNN Branch: Stem → Inception×2 + Residual → (B, 196, 192)
	   └─ ViT Branch: Patch(4×4) → Transformer×2 → (B, 50, 128)
	│
	└─ Cross-Modal Fusion (256-dim, learnable weights)
	   ├─ Task-Specific Refinement
	   └─ Task Gating (bidirectional cross-task exchange)
	      ├─ Classification Head → (B, 3)
	      └─ Regression Head → (B, 4)
	```

	- Parameters: ~12M
	- FLOPs: ~0.8 GFLOPs per 28×28 image (real-time capable)
	- Input: 4-channel DPC images, 28×28 pixels
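
The token counts in the architecture diagram can be sanity-checked with a little arithmetic. The sketch below assumes non-overlapping 4×4 patches plus one class token in the ViT branch, and a total stride of 2 in the CNN branch. Neither detail is stated in this card, so treat both as assumptions that happen to reproduce the listed shapes; the function names are illustrative only.

```python
def vit_token_count(img_size: int, patch_size: int, cls_token: bool = True) -> int:
    """Token count for non-overlapping square patches (assumed layout)."""
    if img_size % patch_size != 0:
        raise ValueError("img_size must be divisible by patch_size")
    per_side = img_size // patch_size
    return per_side * per_side + (1 if cls_token else 0)


def cnn_token_count(img_size: int, total_stride: int) -> int:
    """Spatial positions left after downsampling by total_stride (assumed)."""
    side = img_size // total_stride
    return side * side


# 28 / 4 = 7 patches per side, so 7 * 7 patches plus one class token.
print(vit_token_count(28, 4))
# 28 / 2 = 14 positions per side, so 14 * 14 feature-map positions.
print(cnn_token_count(28, 2))
```

Under these assumptions the two calls give 50 and 196, matching the `(B, 50, 128)` and `(B, 196, 192)` shapes in the diagram.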

	### Tasks

	| Task | Output | Classes / Markers |
	|---|---|---|
	| WBC Classification | 3 classes | Lymphocyte, Granulocyte, Monocyte |
	| Protein Regression | 4 markers | CD45, CD3, CD19, CD14 |

	---

	## Files in This Repository

	| File | Description |
	|---|---|
	| `bsccm_best_model.pth` | Best checkpoint on BSCCM dataset (91.3% accuracy) |
	| `bccd_best_model.pth` | Best checkpoint on BCCD benchmark (93.77% accuracy) |

	---

	## Performance

	### BSCCM Dataset

	| Metric | Value |
	|---|---|
	| WBC Classification Accuracy | 91.3% |
	| Macro F1-Score | 0.92 |
	| Pearson r (CD16 regression) | 0.73 |
	| RMSE | 0.68 |

	### BCCD Benchmark (Classification Only)

	| Class | Precision | Recall | F1 |
	|---|---|---|---|
	| Lymphocyte | 1.000 | 1.000 | 1.000 |
	| Granulocyte | 0.889 | 1.000 | 0.941 |
	| Monocyte | 1.000 | 0.750 | 0.857 |
	| Macro Avg. | 0.963 | 0.917 | 0.933 |

	Overall BCCD accuracy: 93.77%
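
As a consistency check, the F1 column and the macro-average row can be recomputed from the per-class precision and recall values in the BCCD table. The sketch below uses the standard definitions (F1 as the harmonic mean of precision and recall, macro average as the unweighted mean over classes); it is not taken from the authors' evaluation script.

```python
# Per-class (precision, recall) copied from the BCCD table.
per_class = {
    "Lymphocyte": (1.000, 1.000),
    "Granulocyte": (0.889, 1.000),
    "Monocyte": (1.000, 0.750),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro(values) -> float:
    """Unweighted mean over classes."""
    values = list(values)
    return sum(values) / len(values)


f1_scores = {name: f1(p, r) for name, (p, r) in per_class.items()}
macro_precision = macro(p for p, _ in per_class.values())
macro_recall = macro(r for _, r in per_class.values())
macro_f1 = macro(f1_scores.values())

for name, score in f1_scores.items():
    print(f"{name}: F1 = {score:.3f}")
print(f"Macro: P = {macro_precision:.3f}, R = {macro_recall:.3f}, F1 = {macro_f1:.3f}")
```

Rounded to three decimals, the recomputed values agree with the table.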

	---

	## Usage

	### Installation

	```bash
	git clone https://github.com/saqibnaziir/Single-Cell-Phenotyping.git
	cd Single-Cell-Phenotyping
	pip install -r requirements.txt
	```

	### Load Model

	```python
	import torch
	from huggingface_hub import hf_hub_download
	from model import create_model

	# Download checkpoint
	ckpt_path = hf_hub_download(
	    repo_id="saqialii/single-cell-phenotyping",
	    filename="bsccm_best_model.pth",
	)

	# Create model and load weights
	model = create_model(num_classes=3, num_proteins=4, img_size=28, in_channels=4)
	checkpoint = torch.load(ckpt_path, map_location='cpu', weights_only=False)
	model.load_state_dict(checkpoint['model_state_dict'])
	model.eval()

	print(f"Loaded model — best val accuracy: {checkpoint['best_val_acc']:.2f}%")
	```

	### Inference

	```python
	import torch
	import torch.nn.functional as F

	# Input: (B, 4, 28, 28) — 4-channel DPC image, normalised to [-1, 1]
	image = torch.randn(1, 4, 28, 28) # replace with real image

	with torch.no_grad():
	    cls_logits, prot_preds = model(image)

	# Cell type classification
	probs = F.softmax(cls_logits, dim=1)
	class_names = ['Lymphocyte', 'Granulocyte', 'Monocyte']
	predicted_class = class_names[probs[0].argmax().item()]  # first sample in the batch
	confidence = probs[0].max().item()

	print(f"Predicted: {predicted_class} ({confidence:.1%})")

	# Protein expression (Z-scored)
	protein_names = ['CD45', 'CD3', 'CD19', 'CD14']
	for name, val in zip(protein_names, prot_preds[0].tolist()):
	    print(f" {name}: {val:.3f}")
	```
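
The regression head returns Z-scored values, so mapping them back to the original fluorescence scale requires the per-marker mean and standard deviation that were used to standardise the targets. Those statistics are not listed in this card. In the minimal sketch below, `TRAIN_STATS` and `denormalise` are hypothetical names and the numbers are placeholders to be replaced with the real training statistics.

```python
# Hypothetical per-marker (mean, std) of the training targets.
# Placeholder values only: substitute the statistics used during training.
TRAIN_STATS = {
    "CD45": (10.0, 2.0),
    "CD3": (5.0, 1.0),
    "CD19": (4.0, 0.5),
    "CD14": (6.0, 1.5),
}


def denormalise(z_scores: dict, stats: dict = TRAIN_STATS) -> dict:
    """Invert z = (x - mean) / std for each marker, i.e. x = z * std + mean."""
    raw = {}
    for name, z in z_scores.items():
        mean, std = stats[name]
        raw[name] = z * std + mean
    return raw


# Example: a prediction of z = 0 maps back to the marker's mean.
example = {"CD45": 0.0, "CD3": 1.0, "CD19": -2.0, "CD14": 0.5}
print(denormalise(example))
```

With the model outputs from the inference snippet, the input dictionary could be built as `dict(zip(protein_names, prot_preds[0].tolist()))`.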

	### Data Preparation, Training, and Evaluation

	```bash
	# Download BSCCM dataset
	pip install bsccm
	python -c "from bsccm import download_dataset; download_dataset('./data', mnist=True)"

	# Train from scratch
	python train.py --data_path ./data/BSCCMNIST --save_dir checkpoints/run1

	# Evaluate
	python evaluate.py \
	    --model_path checkpoints/run1/best_model.pth \
	    --data_path ./data/BSCCMNIST \
	    --output_dir evaluation_results
	```

	---

	## Dataset

	BSCCM (Berkeley Single Cell Computational Microscopy):
	- 7,889 single-cell DPC images at 28×28 pixels
	- 3 WBC classes, with test-split counts: Lymphocyte (456), Granulocyte (736), Monocyte (226)
	- 4 protein markers measured by fluorescence: CD45, CD3, CD19, CD14
	- Source: [Waller-Lab/BSCCM](https://github.com/Waller-Lab/BSCCM)
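
The BSCCM test-split counts are imbalanced, which is worth keeping in mind when reading the accuracy figures. The sketch below derives class proportions, the accuracy of an always-predict-the-majority baseline, and balanced inverse-frequency weights from those counts. Whether the authors applied any class weighting is not stated here, so the weights are purely illustrative.

```python
# Test-split counts from the BSCCM dataset description.
test_counts = {"Lymphocyte": 456, "Granulocyte": 736, "Monocyte": 226}

total = sum(test_counts.values())
proportions = {name: n / total for name, n in test_counts.items()}

# Accuracy obtained by always predicting the most frequent class.
majority_class = max(test_counts, key=test_counts.get)
majority_baseline = proportions[majority_class]

# Balanced inverse-frequency weights: total / (num_classes * class_count).
num_classes = len(test_counts)
weights = {name: total / (num_classes * n) for name, n in test_counts.items()}

print(f"Total test cells: {total}")
print(f"Majority class: {majority_class} ({majority_baseline:.1%})")
for name in test_counts:
    print(f"{name}: share = {proportions[name]:.3f}, weight = {weights[name]:.3f}")
```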

	BCCD (Blood Cell Images):
	- ~12,000 RGB images at 128×128 pixels (4 source classes, mapped to the 3 classes used here)
	- Source: [Kaggle: Blood Cell Images](https://www.kaggle.com/datasets/paultimothymooney/blood-cells)
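
This card does not spell out the 4-to-3 class mapping. A plausible sketch follows, assuming the four source labels are the Kaggle folder names (`EOSINOPHIL`, `LYMPHOCYTE`, `MONOCYTE`, `NEUTROPHIL`) and that the two granulocyte subtypes collapse into the Granulocyte class. `BCCD_TO_WBC` and `map_bccd_label` are hypothetical names, and the authors' actual mapping may differ.

```python
# Class order as used in the inference example.
CLASS_NAMES = ["Lymphocyte", "Granulocyte", "Monocyte"]

# Assumed mapping: eosinophils and neutrophils are both granulocytes.
BCCD_TO_WBC = {
    "EOSINOPHIL": "Granulocyte",
    "NEUTROPHIL": "Granulocyte",
    "LYMPHOCYTE": "Lymphocyte",
    "MONOCYTE": "Monocyte",
}


def map_bccd_label(label: str) -> int:
    """Return the 3-class index for a 4-class BCCD label (case-insensitive)."""
    key = label.strip().upper()
    if key not in BCCD_TO_WBC:
        raise KeyError(f"Unknown BCCD label: {label!r}")
    return CLASS_NAMES.index(BCCD_TO_WBC[key])


for source in ["EOSINOPHIL", "LYMPHOCYTE", "MONOCYTE", "NEUTROPHIL"]:
    print(source, "->", CLASS_NAMES[map_bccd_label(source)])
```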

	---

	## Citation

	```bibtex
	@inproceedings{nazir2026label,
	  title = {Towards Label-Free Single-Cell Phenotyping Using Multi-Task Learning},
	  author = {Nazir, Saqib and Behera, Ardhendu},
	  booktitle = {Proceedings of the International Conference on Pattern Recognition (ICPR)},
	  year = {2026},
	  note = {arXiv:2605.14717}
	}
	```

	```bibtex
	@article{pinkard2024berkeley,
	  title = {Berkeley Single Cell Computational Microscopy Dataset},
	  author = {Pinkard, Henry and others},
	  journal = {arXiv preprint arXiv:2402.06191},
	  year = {2024}
	}
	```

	---

	## License

	[MIT License](https://github.com/saqibnaziir/Single-Cell-Phenotyping/blob/main/LICENSE) — Saqib Nazir, Ardhendu Behera, Edge Hill University, 2026