Geneformer-OA-Myeloid

Geneformer-OA-Myeloid is a released collection of two Geneformer-based downstream classifiers for osteoarthritis synovial myeloid single-cell analysis.

Released Models

| Model | Task | Training data | Labels |
|---|---|---|---|
| disease_classifier | OA vs Normal classification | GSE216651 | Normal, OA |
| state_classifier | Synovial myeloid state classification | GSE216651 + GSE152805 | Monocyte_like, Inflammatory_macrophage, Resident_macrophage |

Performance

Disease classifier

  • Base model: Geneformer-V2-104M
  • Internal held-out accuracy: 0.9961
  • Internal held-out macro-F1: 0.9956
  • Internal held-out ROC-AUC: 0.99997

State classifier

  • Base model: Geneformer-V2-104M
  • Internal held-out accuracy: 0.9507
  • Internal held-out macro-F1: 0.8985

External validation

  • Cohort: GSE253198
  • OA/Normal transfer accuracy: 0.8146
  • OA/Normal transfer macro-F1: 0.4704
  • OA ROC-AUC: 0.5184

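The disease classifier performs strongly inside the source synovial cohort but transfers poorly to GSE253198. The external OA ROC-AUC of 0.5184 is close to chance, and the gap between transfer accuracy (0.8146) and macro-F1 (0.4704) is consistent with predictions that mostly follow the majority class. This release should therefore be used as a research model rather than a cohort-invariant clinical classifier.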
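
How accuracy and macro-F1 can diverge under class imbalance can be sketched in a few lines of plain Python. The confusion-matrix counts below are hypothetical and chosen only for illustration; they are not taken from GSE253198, and the class proportions of that cohort are an assumption here.

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of all cells that were labeled correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1(tp, fp, fn):
    """F1 score for one class, given its own true/false positive counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1_binary(tp, fp, fn, tn):
    """Unweighted mean of the per-class F1 scores in a two-class problem."""
    f1_majority = f1(tp, fp, fn)
    # For the other class the roles swap: its true positives are tn,
    # its false positives are fn, and its false negatives are fp.
    f1_minority = f1(tn, fn, fp)
    return (f1_majority + f1_minority) / 2


# Hypothetical counts: 820 majority-class cells, 180 minority-class cells,
# and a classifier that assigns almost everything to the majority class.
tp, fn = 810, 10   # majority-class cells: predicted correctly / incorrectly
fp, tn = 175, 5    # minority-class cells: predicted incorrectly / correctly

acc = accuracy(tp, fp, fn, tn)
macro = macro_f1_binary(tp, fp, fn, tn)
print(f"accuracy={acc:.3f} macro_f1={macro:.3f}")
# prints: accuracy=0.815 macro_f1=0.474
```

In this toy case the minority class is almost never recovered, so its F1 is near zero and pulls the macro average below 0.5 even though overall accuracy stays above 0.8.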

Repository Contents

models/
  disease_classifier/
  state_classifier/
results/
  external_validation/
scripts/

Usage

The GitHub repository carries model metadata, evaluation summaries, and helper scripts. The large binary weights are intended to be distributed separately through Hugging Face or Git LFS.

Minimal loading pattern:

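from pathlib import Path
from transformers import BertForSequenceClassification

# Directory holding config.json and the fine-tuned weights
# (the weights are distributed separately and must be fetched first)
model_dir = Path("models/disease_classifier")
model = BertForSequenceClassification.from_pretrained(model_dir)
model.eval()  # switch off dropout for inference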
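
Once a model is loaded, its output logits (for example from `model(input_ids=...).logits`) still have to be mapped back to class names. The following is a minimal sketch of that step using only the standard library. It assumes the model directory contains a Hugging Face style `config.json` with an `id2label` entry; whether this release populates that entry with readable names, and in which order, is an assumption, and the helper names `load_id2label`, `softmax`, and `predict_label` are hypothetical.

```python
import json
import math
from pathlib import Path


def load_id2label(model_dir):
    """Read the class-index-to-name mapping from config.json, if present."""
    config = json.loads((Path(model_dir) / "config.json").read_text())
    # JSON object keys are strings, so convert them back to integer indices.
    return {int(i): name for i, name in config.get("id2label", {}).items()}


def softmax(logits):
    """Convert raw logits for one cell into probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(logits, id2label):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    # Fall back to a generic name when the mapping has no entry.
    return id2label.get(best, f"LABEL_{best}"), probs[best]


# Hypothetical mapping and logits, for illustration only.
example_mapping = {0: "Normal", 1: "OA"}
label, prob = predict_label([0.0, 2.0], example_mapping)
print(label, round(prob, 3))
```

Reading the mapping from `config.json` rather than hard-coding it avoids silently swapping classes if the label order in the released checkpoint differs from the order shown in the table above.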

Scope

  • Species: human
  • Tissue focus: synovium-derived OA myeloid cells
  • Intended use: representation learning, disease-state scoring, and method development in related OA synovial datasets
  • Not intended for: diagnosis, prognosis, treatment selection, or unsupported cross-tissue deployment

Data Provenance

This release does not redistribute raw single-cell count matrices. The released models and summaries were generated from public datasets including:

  • GSE216651
  • GSE152805
  • GSE253198

License and Attribution

This release is distributed under Apache License 2.0 as a derivative downstream work of ctheodoris/Geneformer.

  • Upstream foundation model: ctheodoris/Geneformer
  • Proper description: A Geneformer-based OA myeloid fine-tuned model collection