---
license: mit
tags:
- pytorch
- computer-vision
- medical-imaging
- multi-task-learning
- classification
- regression
- single-cell
- microscopy
- white-blood-cell
datasets:
- BSCCM
- BCCD
language:
- en
pipeline_tag: image-classification
---

# Single-Cell Phenotyping: Hybrid CNN-ViT Multi-Task Model

**Paper:** [Towards Label-Free Single-Cell Phenotyping Using Multi-Task Learning](https://arxiv.org/abs/2605.14717)
**Authors:** Saqib Nazir and Ardhendu Behera (Edge Hill University, UK)
**Conference:** ICPR 2026
**Code:** [github.com/saqibnaziir/Single-Cell-Phenotyping](https://github.com/saqibnaziir/Single-Cell-Phenotyping)

---

## Model Description

A unified deep learning framework that jointly performs **White Blood Cell (WBC) classification** and **continuous protein-expression regression** from label-free Differential Phase Contrast (DPC) microscopy images, *without fluorescent staining*.

### Architecture

```
Input: (B, 4, 28, 28)  ← 4-channel DPC (Left, Right, Top, Bottom)
    │
    └─ Shared ECA Channel Attention
         ├─ CNN Branch: Stem → Inception×2 + Residual → (B, 196, 192)
         └─ ViT Branch: Patch(4×4) → Transformer×2 → (B, 50, 128)
              │
              └─ Cross-Modal Fusion (256-dim, learnable weights)
                   ├─ Task-Specific Refinement
                   └─ Task Gating (bidirectional cross-task exchange)
                        ├─ Classification Head → (B, 3)
                        └─ Regression Head → (B, 4)
```

- **Parameters:** ~12M
- **FLOPs:** ~0.8 GFLOPs per 28×28 image (real-time capable)
- **Input:** 4-channel DPC images, 28×28 pixels

### Tasks

| Task | Output | Classes / Markers |
|---|---|---|
| WBC Classification | 3 classes | Lymphocyte, Granulocyte, Monocyte |
| Protein Regression | 4 markers | CD45, CD3, CD19, CD14 |

---

## Files in This Repository

| File | Description |
|---|---|
| `bsccm_best_model.pth` | Best checkpoint on BSCCM dataset (91.3% accuracy) |
| `bccd_best_model.pth` | Best checkpoint on BCCD benchmark (93.77% accuracy) |

---

## Performance

### BSCCM Dataset

| Metric | Value |
|---|---|
| WBC Classification Accuracy | **91.3%** |
| Macro F1-Score | **0.92** |
| Pearson r (CD16 regression) | **0.73** |
| RMSE | 0.68 |

### BCCD Benchmark (Classification Only)

| Class | Precision | Recall | F1 |
|---|---|---|---|
| Lymphocyte | 1.000 | 1.000 | 1.000 |
| Granulocyte | 0.889 | 1.000 | 0.941 |
| Monocyte | 1.000 | 0.750 | 0.857 |
| **Macro Avg.** | **0.963** | **0.917** | **0.933** |

Overall BCCD accuracy: **93.77%**

---

## Usage

### Installation

```bash
git clone https://github.com/saqibnaziir/Single-Cell-Phenotyping.git
cd Single-Cell-Phenotyping
pip install -r requirements.txt
```

### Load Model

```python
import torch
from huggingface_hub import hf_hub_download
from model import create_model

# Download checkpoint
ckpt_path = hf_hub_download(
    repo_id="saqialii/single-cell-phenotyping",
    filename="bsccm_best_model.pth"
)

# Create model and load weights
model = create_model(num_classes=3, num_proteins=4, img_size=28, in_channels=4)
checkpoint = torch.load(ckpt_path, map_location='cpu', weights_only=False)
model.load_state_dict(checkpoint['model_state_dict'])
model.eval()

print(f"Loaded model (best val accuracy: {checkpoint['best_val_acc']:.2f}%)")
```

### Inference

```python
import torch
import torch.nn.functional as F

# Input: (B, 4, 28, 28), a 4-channel DPC image normalised to [-1, 1]
image = torch.randn(1, 4, 28, 28)  # replace with real image

with torch.no_grad():
    cls_logits, prot_preds = model(image)

# Cell type classification
probs = F.softmax(cls_logits, dim=1)
class_names = ['Lymphocyte', 'Granulocyte', 'Monocyte']
predicted_class = class_names[probs.argmax().item()]
confidence = probs.max().item()
print(f"Predicted: {predicted_class} ({confidence:.1%})")

# Protein expression (Z-scored)
protein_names = ['CD45', 'CD3', 'CD19', 'CD14']
for name, val in zip(protein_names, prot_preds[0].tolist()):
    print(f"  {name}: {val:.3f}")
```

### Data Preparation

```bash
# Download BSCCM dataset
pip install bsccm
python -c "from bsccm import download_dataset; download_dataset('./data', mnist=True)"

# Train from scratch
python train.py --data_path ./data/BSCCMNIST --save_dir checkpoints/run1

# Evaluate
python evaluate.py \
    --model_path checkpoints/run1/best_model.pth \
    --data_path ./data/BSCCMNIST \
    --output_dir evaluation_results
```

---

## Dataset

**BSCCM** (Berkeley Single Cell Computational Microscopy):
- 7,889 single-cell DPC images at 28×28 pixels
- 3 WBC classes: Lymphocyte (456), Granulocyte (736), Monocyte (226) in the test split
- 4 protein markers measured by fluorescence: CD45, CD3, CD19, CD14
- Source: [Waller-Lab/BSCCM](https://github.com/Waller-Lab/BSCCM)

**BCCD** (Blood Cell Images):
- ~12,000 RGB images at 128×128 pixels (4-class → mapped to 3-class)
- Source: [Kaggle: Blood Cell Images](https://www.kaggle.com/datasets/paultimothymooney/blood-cells)

---

## Citation

```bibtex
@inproceedings{nazir2026label,
  title     = {Towards Label-Free Single-Cell Phenotyping Using Multi-Task Learning},
  author    = {Nazir, Saqib and Behera, Ardhendu},
  booktitle = {Proceedings of the International Conference on Pattern Recognition (ICPR)},
  year      = {2026},
  note      = {arXiv:2605.14717}
}
```

```bibtex
@article{pinkard2024berkeley,
  title   = {Berkeley Single Cell Computational Microscopy Dataset},
  author  = {Pinkard, Henry and others},
  journal = {arXiv preprint arXiv:2402.06191},
  year    = {2024}
}
```

---

## License

[MIT License](https://github.com/saqibnaziir/Single-Cell-Phenotyping/blob/main/LICENSE), Saqib Nazir, Ardhendu Behera, Edge Hill University, 2026
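
---

## Note on the BCCD Label Mapping

The Dataset section says the BCCD images are mapped from four classes to three, but does not spell out the mapping. The sketch below shows one plausible way to do it, assuming the four BCCD labels are `EOSINOPHIL`, `LYMPHOCYTE`, `MONOCYTE` and `NEUTROPHIL`, and that eosinophils and neutrophils are both merged into Granulocyte (they are both granulocytes biologically). The dictionary `BCCD_TO_WBC` and the helper `bccd_label_to_index` are hypothetical names, not part of the repository; the mapping used for the released checkpoint may differ.

```python
# Hypothetical sketch: collapse four BCCD labels into the three WBC classes.
# The label names and the mapping are assumptions, not taken from the repo.

# Class order copied from the Inference example in this README.
CLASS_NAMES = ['Lymphocyte', 'Granulocyte', 'Monocyte']

# Assumed mapping: eosinophils and neutrophils are both granulocytes.
BCCD_TO_WBC = {
    'LYMPHOCYTE': 'Lymphocyte',
    'MONOCYTE': 'Monocyte',
    'EOSINOPHIL': 'Granulocyte',
    'NEUTROPHIL': 'Granulocyte',
}


def bccd_label_to_index(bccd_label: str) -> int:
    """Return the 3-class index for a BCCD label (case-insensitive)."""
    wbc_class = BCCD_TO_WBC[bccd_label.strip().upper()]
    return CLASS_NAMES.index(wbc_class)


if __name__ == "__main__":
    for label in ['EOSINOPHIL', 'LYMPHOCYTE', 'MONOCYTE', 'NEUTROPHIL']:
        idx = bccd_label_to_index(label)
        print(f"{label} -> {idx} ({CLASS_NAMES[idx]})")
```

Keeping the index order identical to `class_names` in the Inference example means the same `argmax` lookup would work for both checkpoints, provided the checkpoints were trained with that order.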