Geneformer-OA-Myeloid
Geneformer-OA-Myeloid is a released collection of two Geneformer-based downstream classifiers for osteoarthritis synovial myeloid single-cell analysis.
Released Models
| Model | Task | Training data | Labels |
|---|---|---|---|
| disease_classifier | OA vs Normal classification | GSE216651 | Normal, OA |
| state_classifier | Synovial myeloid state classification | GSE216651 + GSE152805 | Monocyte_like, Inflammatory_macrophage, Resident_macrophage |
Performance
Disease classifier
- Base model: Geneformer-V2-104M
- Internal held-out accuracy: 0.9961
- Internal held-out macro-F1: 0.9956
- Internal held-out ROC-AUC: 0.99997
State classifier
- Base model: Geneformer-V2-104M
- Internal held-out accuracy: 0.9507
- Internal held-out macro-F1: 0.8985
External validation
- Cohort: GSE253198
- OA/Normal transfer accuracy: 0.8146
- OA/Normal transfer macro-F1: 0.4704
- OA ROC-AUC: 0.5184
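The disease classifier performs strongly on the internal held-out split of the source synovial cohort but transfers poorly to GSE253198: the external ROC-AUC of 0.5184 is close to chance-level ranking, and the macro-F1 of 0.4704 falls far below the 0.8146 accuracy. This release should therefore be used as a research model rather than as a cohort-invariant clinical classifier.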
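A high accuracy paired with a low macro-F1 can arise when predictions concentrate on one class. As a minimal sketch of that effect, the pure-Python helpers below compute both metrics on a made-up label vector. The toy labels and the helper names are illustrative assumptions; they are not drawn from GSE253198 and say nothing about its actual class balance.

```python
def accuracy(y_true, y_pred):
    """Fraction of cells whose predicted label matches the reference label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1; a class with no true positives scores 0."""
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: 8 OA cells, 2 Normal cells, every cell predicted as OA.
y_true = ["OA"] * 8 + ["Normal"] * 2
y_pred = ["OA"] * 10

toy_accuracy = accuracy(y_true, y_pred)
toy_macro_f1 = macro_f1(y_true, y_pred, labels=["Normal", "OA"])
print(f"accuracy={toy_accuracy:.4f} macro_f1={toy_macro_f1:.4f}")
```

In this toy case the minority class contributes an F1 of zero, which halves the macro average while leaving accuracy at the majority-class rate. Reporting per-class metrics alongside accuracy makes this kind of failure visible.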
Repository Contents
models/
  disease_classifier/
  state_classifier/
results/
  external_validation/
scripts/
Usage
This release distributes model metadata, evaluation summaries, and helper scripts through GitHub, while large binary weights are intended for Hugging Face or Git LFS distribution.
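Because the weights may live outside the GitHub checkout, it can help to confirm that a model directory is complete before loading it. The sketch below assumes the directory follows the usual `save_pretrained` layout (a `config.json` plus `model.safetensors` or `pytorch_model.bin`); the helper name `has_local_weights` is hypothetical and not part of this release.

```python
from pathlib import Path

# Standard file names written by transformers' save_pretrained.
WEIGHT_FILES = ("model.safetensors", "pytorch_model.bin")


def has_local_weights(model_dir):
    """Return True if model_dir holds a config.json and at least one weight file."""
    model_dir = Path(model_dir)
    if not (model_dir / "config.json").is_file():
        return False
    return any((model_dir / name).is_file() for name in WEIGHT_FILES)


if __name__ == "__main__":
    for name in ("disease_classifier", "state_classifier"):
        print(name, has_local_weights(Path("models") / name))
```

This is a coarse check: an unfetched Git LFS pointer file has the same name as the real weight file and would also pass, so a failed load after a positive result usually means the LFS objects still need to be pulled.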
Minimal loading pattern:
from pathlib import Path

from transformers import BertForSequenceClassification

# The directory must contain config.json and the fine-tuned weight file.
model_dir = Path("models/disease_classifier")
model = BertForSequenceClassification.from_pretrained(model_dir)
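Once the model is loaded, a forward pass on tokenized cells returns one row of logits per cell in `outputs.logits`. Producing the input tokens requires Geneformer's own tokenization step, which is not shown here. As a rough sketch, the helper below turns one row of logits into a label and a probability. The label order in `ASSUMED_ID2LABEL` is an assumption for illustration; the real mapping should be read from `model.config.id2label`.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one row of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(logits, id2label):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


# Assumed label order; read the real mapping from model.config.id2label.
ASSUMED_ID2LABEL = {0: "Normal", 1: "OA"}

# Made-up logits for a single cell, standing in for outputs.logits[0].tolist().
label, prob = predict_label([0.2, 1.7], ASSUMED_ID2LABEL)
print(label, round(prob, 4))
```

Given the limited cross-cohort transfer reported above, these probabilities are best read as research scores for cells from related OA synovial datasets, not as calibrated diagnostic outputs.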
Scope
- Species: human
- Tissue focus: synovium-derived OA myeloid cells
- Intended use: representation learning, disease-state scoring, and method development in related OA synovial datasets
- Not intended for: diagnosis, prognosis, treatment selection, or unsupported cross-tissue deployment
Data Provenance
This release does not redistribute raw single-cell count matrices. The released models and summaries were generated from public datasets including:
- GSE216651
- GSE152805
- GSE253198
License and Attribution
This release is distributed under Apache License 2.0 as a derivative downstream work of ctheodoris/Geneformer.
- Upstream foundation model: ctheodoris/Geneformer
- Proper description: A Geneformer-based OA myeloid fine-tuned model collection