| --- |
| license: mit |
| tags: |
| - pytorch |
| - computer-vision |
| - medical-imaging |
| - multi-task-learning |
| - classification |
| - regression |
| - single-cell |
| - microscopy |
| - white-blood-cell |
| datasets: |
| - BSCCM |
| - BCCD |
| language: |
| - en |
| pipeline_tag: image-classification |
| --- |
| |
| # Single-Cell Phenotyping — Hybrid CNN-ViT Multi-Task Model |
|
|
| **Paper:** [Towards Label-Free Single-Cell Phenotyping Using Multi-Task Learning](https://arxiv.org/abs/2605.14717) |
| **Authors:** Saqib Nazir and Ardhendu Behera (Edge Hill University, UK) |
| **Conference:** ICPR 2026 |
| **Code:** [github.com/saqibnaziir/Single-Cell-Phenotyping](https://github.com/saqibnaziir/Single-Cell-Phenotyping) |
|
|
| --- |
|
|
| ## Model Description |
|
|
| A unified deep learning framework that jointly performs **White Blood Cell (WBC) classification** and **continuous protein-expression regression** from label-free Differential Phase Contrast (DPC) microscopy images. Fluorescence measurements serve as regression targets during training; inference requires *no fluorescent staining*.
|
|
| ### Architecture |
|
|
| ``` |
| Input: (B, 4, 28, 28) ← 4-channel DPC (Left, Right, Top, Bottom) |
| │ |
| └─ Shared ECA Channel Attention |
| ├─ CNN Branch: Stem → Inception×2 + Residual → (B, 196, 192) |
| └─ ViT Branch: Patch(4×4) → Transformer×2 → (B, 50, 128) |
| │ |
| └─ Cross-Modal Fusion (256-dim, learnable weights) |
| ├─ Task-Specific Refinement |
| └─ Task Gating (bidirectional cross-task exchange) |
| ├─ Classification Head → (B, 3) |
| └─ Regression Head → (B, 4) |
| ``` |
|
|
| - **Parameters:** ~12M |
| - **FLOPs:** ~0.8 GFLOPs per 28×28 image (real-time capable) |
| - **Input:** 4-channel DPC images, 28×28 pixels |
|
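The token counts in the diagram follow from the input size. As a rough check, the sketch below assumes the ViT branch uses non-overlapping 4×4 patches plus one extra class token, and that the CNN branch halves the spatial resolution once before flattening. Both assumptions are read off the shapes above and are not confirmed details of the implementation; `vit_tokens` and `cnn_tokens` are hypothetical helper names.

```python
def vit_tokens(img_size: int, patch_size: int, extra_tokens: int = 1) -> int:
    """Number of ViT tokens for a square image split into non-overlapping patches."""
    if img_size % patch_size != 0:
        raise ValueError("img_size must be divisible by patch_size")
    per_side = img_size // patch_size
    # extra_tokens=1 is an ASSUMPTION (a single class token)
    return per_side * per_side + extra_tokens


def cnn_tokens(img_size: int, downsample: int = 2) -> int:
    """Number of spatial positions after downsampling, flattened into a sequence."""
    # downsample=2 is an ASSUMPTION inferred from the 196-token output shape
    side = img_size // downsample
    return side * side


print(vit_tokens(28, 4))  # 7 * 7 patches + 1 assumed class token = 50
print(cnn_tokens(28))     # 14 * 14 positions = 196
```

Under these assumptions the two branches produce sequences of different lengths (50 and 196) and widths (128 and 192), which is presumably why a separate fusion stage projects them to a common 256-dim space.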
|
| ### Tasks |
|
|
| | Task | Output | Classes / Markers | |
| |---|---|---| |
| | WBC Classification | 3 classes | Lymphocyte, Granulocyte, Monocyte | |
| | Protein Regression | 4 markers | CD45, CD3, CD19, CD14 | |
|
|
| --- |
|
|
| ## Files in This Repository |
|
|
| | File | Description | |
| |---|---| |
| | `bsccm_best_model.pth` | Best checkpoint on BSCCM dataset (91.3% accuracy) | |
| | `bccd_best_model.pth` | Best checkpoint on BCCD benchmark (93.77% accuracy) | |
|
|
| --- |
|
|
| ## Performance |
|
|
| ### BSCCM Dataset |
|
|
| | Metric | Value | |
| |---|---| |
| | WBC Classification Accuracy | **91.3%** | |
| | Macro F1-Score | **0.92** | |
| | Pearson r (CD16 regression) | **0.73** | |
| | RMSE | 0.68 | |
|
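For readers who want to recompute these metrics on their own predictions, the following is a minimal, dependency-free sketch of the four quantities in the table. It is not the repository's evaluation code, and the toy labels and values at the bottom are made up purely to exercise the functions.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, n_classes):
    """Unweighted mean of per-class F1, with F1 = 2TP / (2TP + FP + FN)."""
    scores = []
    for c in range(n_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_classes


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Toy data (NOT real model outputs)
labels, preds = [0, 1, 2, 1], [0, 1, 1, 1]
print(accuracy(labels, preds), macro_f1(labels, preds, n_classes=3))
print(pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]), rmse([0.0, 0.0], [3.0, 4.0]))
```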
|
| ### BCCD Benchmark (Classification Only) |
|
|
| | Class | Precision | Recall | F1 | |
| |---|---|---|---| |
| | Lymphocyte | 1.000 | 1.000 | 1.000 | |
| | Granulocyte | 0.889 | 1.000 | 0.941 | |
| | Monocyte | 1.000 | 0.750 | 0.857 | |
| | **Macro Avg.** | **0.963** | **0.917** | **0.933** | |
|
|
| Overall BCCD accuracy: **93.77%** |
|
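The macro-averaged row is the unweighted mean of the per-class rows, and each F1 is the harmonic mean of precision and recall. A short sketch that reproduces the table's bottom row from the per-class precision and recall values listed above:

```python
# Per-class (precision, recall) copied from the BCCD table
per_class = {
    "Lymphocyte": (1.000, 1.000),
    "Granulocyte": (0.889, 1.000),
    "Monocyte": (1.000, 0.750),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0 if both are 0)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


n = len(per_class)
macro_precision = sum(p for p, _ in per_class.values()) / n
macro_recall = sum(r for _, r in per_class.values()) / n
macro_f1 = sum(f1(p, r) for p, r in per_class.values()) / n

print(round(macro_precision, 3), round(macro_recall, 3), round(macro_f1, 3))
# prints: 0.963 0.917 0.933
```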
|
| --- |
|
|
| ## Usage |
|
|
| ### Installation |
|
|
| ```bash |
| git clone https://github.com/saqibnaziir/Single-Cell-Phenotyping.git |
| cd Single-Cell-Phenotyping |
| pip install -r requirements.txt |
| ``` |
|
|
| ### Load Model |
|
|
| ```python |
| import torch |
| from huggingface_hub import hf_hub_download |
| from model import create_model |
| |
| # Download checkpoint |
| ckpt_path = hf_hub_download(
|     repo_id="saqialii/single-cell-phenotyping",
|     filename="bsccm_best_model.pth",
| )
| |
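| # Create model and load weights.
| # Note: weights_only=False unpickles arbitrary Python objects, so only load checkpoints from a source you trust.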
| model = create_model(num_classes=3, num_proteins=4, img_size=28, in_channels=4) |
| checkpoint = torch.load(ckpt_path, map_location='cpu', weights_only=False) |
| model.load_state_dict(checkpoint['model_state_dict']) |
| model.eval() |
| |
| print(f"Loaded model — best val accuracy: {checkpoint['best_val_acc']:.2f}%") |
| ``` |
|
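The loading snippet assumes the checkpoint is a dictionary containing `model_state_dict` and `best_val_acc`. If a checkpoint was saved differently, a small defensive helper can avoid a `KeyError`. The sketch below is illustrative only: `split_checkpoint` is a hypothetical helper that is not part of the repository, and falling back to treating the whole object as a bare state dict is an assumption about how such a file might be laid out.

```python
from typing import Any, Mapping, Optional, Tuple


def split_checkpoint(
    checkpoint: Mapping[str, Any],
) -> Tuple[Mapping[str, Any], Optional[float]]:
    """Return (state_dict, best_val_acc) from a loaded checkpoint mapping.

    Hypothetical helper: when no 'model_state_dict' key is present, the
    mapping itself is ASSUMED to be the state dict.
    """
    state_dict = checkpoint.get("model_state_dict", checkpoint)
    best_val_acc = checkpoint.get("best_val_acc")  # None if not recorded
    return state_dict, best_val_acc


# Toy stand-in for the object returned by torch.load(...)
toy_checkpoint = {"model_state_dict": {"head.weight": [0.0]}, "best_val_acc": 91.3}
state_dict, best_acc = split_checkpoint(toy_checkpoint)
print(sorted(state_dict), best_acc)
```

The returned `state_dict` would then be passed to `model.load_state_dict(...)` as in the snippet above.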
|
| ### Inference |
|
|
| ```python |
| import torch |
| import torch.nn.functional as F |
| |
| # Input: (B, 4, 28, 28) — 4-channel DPC image, normalised to [-1, 1] |
| image = torch.randn(1, 4, 28, 28) # replace with real image |
| |
| with torch.no_grad():
|     cls_logits, prot_preds = model(image)
| |
| # Cell type classification (index 0 selects the first image in the batch)
| probs = F.softmax(cls_logits, dim=1)
| class_names = ['Lymphocyte', 'Granulocyte', 'Monocyte']
| pred_idx = probs[0].argmax().item()
| predicted_class = class_names[pred_idx]
| confidence = probs[0, pred_idx].item()
| |
| print(f"Predicted: {predicted_class} ({confidence:.1%})") |
| |
| # Protein expression (Z-scored) |
| protein_names = ['CD45', 'CD3', 'CD19', 'CD14'] |
| for name, val in zip(protein_names, prot_preds[0].tolist()):
|     print(f"  {name}: {val:.3f}")
| ``` |
|
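The regression outputs are described as Z-scored, so each value is in units of standard deviations from a mean. Mapping them back to the original fluorescence scale needs the per-marker mean and standard deviation used when the targets were normalised. Those statistics are not given here, so the numbers below are placeholders, and `denormalise` is a hypothetical helper rather than a function from the repository.

```python
def denormalise(z_scores, means, stds):
    """Invert z = (x - mean) / std for each marker, returning x = z * std + mean."""
    return {name: z * stds[name] + means[name] for name, z in z_scores.items()}


# PLACEHOLDER statistics: replace with the values used to Z-score the training targets
means = {"CD45": 10.0, "CD3": 10.0, "CD19": 10.0, "CD14": 10.0}
stds = {"CD45": 2.0, "CD3": 2.0, "CD19": 2.0, "CD14": 2.0}

# Example Z-scored predictions (made-up values, not real model output)
z_preds = {"CD45": 1.5, "CD3": -0.5, "CD19": 0.0, "CD14": 0.25}

for name, value in denormalise(z_preds, means, stds).items():
    print(f"{name}: {value:.2f}")
```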
|
| ### Data Preparation, Training, and Evaluation
|
|
| ```bash |
| # Download BSCCM dataset |
| pip install bsccm |
| python -c "from bsccm import download_dataset; download_dataset('./data', mnist=True)" |
| |
| # Train from scratch |
| python train.py --data_path ./data/BSCCMNIST --save_dir checkpoints/run1 |
| |
| # Evaluate |
| python evaluate.py \
|     --model_path checkpoints/run1/best_model.pth \
|     --data_path ./data/BSCCMNIST \
|     --output_dir evaluation_results
| ``` |
|
|
| --- |
|
|
| ## Dataset |
|
|
| **BSCCM** (Berkeley Single Cell Computational Microscopy): |
| - 7,889 single-cell DPC images at 28×28 pixels |
| - 3 WBC classes, with test-split counts: Lymphocyte (456), Granulocyte (736), Monocyte (226)
| - 4 protein markers measured by fluorescence: CD45, CD3, CD19, CD14 |
| - Source: [Waller-Lab/BSCCM](https://github.com/Waller-Lab/BSCCM) |
|
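The test-split counts above are imbalanced, which matters when reading accuracy figures. The sketch below computes each class's share of the test split and a set of inverse-frequency class weights. The weights are shown only as a common way to handle imbalance; nothing here indicates that the authors' training procedure uses them.

```python
# Test-split counts copied from the BSCCM description above
test_counts = {"Lymphocyte": 456, "Granulocyte": 736, "Monocyte": 226}

total = sum(test_counts.values())
n_classes = len(test_counts)

# Fraction of the test split belonging to each class
proportions = {name: n / total for name, n in test_counts.items()}

# Inverse-frequency weights (ILLUSTRATIVE; not stated to be used by the authors)
class_weights = {name: total / (n_classes * n) for name, n in test_counts.items()}

for name in test_counts:
    print(f"{name}: {proportions[name]:.1%} of test split, weight {class_weights[name]:.2f}")
```

Granulocytes make up roughly 52% of this split, so a classifier that always predicts Granulocyte would already reach about that accuracy; macro-averaged metrics are a useful complement for that reason.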
|
| **BCCD** (Blood Cell Images): |
| - ~12,000 RGB images at 128×128 pixels (4 original classes mapped to the 3 WBC classes)
| - Source: [Kaggle: Blood Cell Images](https://www.kaggle.com/datasets/paultimothymooney/blood-cells) |
|
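The exact 4-to-3 class mapping is not spelled out here. One plausible version, sketched below, groups eosinophils and neutrophils under Granulocyte, since both are granulocytes biologically. The source label strings and the mapping itself are assumptions, not confirmed details of the authors' preprocessing, and `remap_label` is a hypothetical helper.

```python
# Target class order used elsewhere in this card
CLASS_NAMES = ["Lymphocyte", "Granulocyte", "Monocyte"]

# ASSUMED mapping from the four BCCD labels to the three WBC classes
BCCD_TO_WBC = {
    "EOSINOPHIL": "Granulocyte",
    "NEUTROPHIL": "Granulocyte",
    "LYMPHOCYTE": "Lymphocyte",
    "MONOCYTE": "Monocyte",
}


def remap_label(bccd_label: str) -> int:
    """Map a BCCD label string to an index into CLASS_NAMES."""
    return CLASS_NAMES.index(BCCD_TO_WBC[bccd_label.upper()])


for label in BCCD_TO_WBC:
    print(label, "->", CLASS_NAMES[remap_label(label)])
```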
|
| --- |
|
|
| ## Citation |
|
|
| ```bibtex |
| @inproceedings{nazir2026label, |
| title = {Towards Label-Free Single-Cell Phenotyping Using Multi-Task Learning}, |
| author = {Nazir, Saqib and Behera, Ardhendu}, |
| booktitle = {Proceedings of the International Conference on Pattern Recognition (ICPR)}, |
| year = {2026}, |
| note = {arXiv:2605.14717} |
| } |
| ``` |
|
|
| ```bibtex |
| @article{pinkard2024berkeley, |
| title = {Berkeley Single Cell Computational Microscopy Dataset}, |
| author = {Pinkard, Henry and others}, |
| journal = {arXiv preprint arXiv:2402.06191}, |
| year = {2024} |
| } |
| ``` |
|
|
| --- |
|
|
| ## License |
|
|
| [MIT License](https://github.com/saqibnaziir/Single-Cell-Phenotyping/blob/main/LICENSE) — Saqib Nazir, Ardhendu Behera, Edge Hill University, 2026 |
|
|