PD Cognition BioClinicalBERT Fine-Tuned spaCy Model

This model is a spaCy pipeline for span categorization, trained on clinical text related to cognition in Parkinson's disease. It uses BioClinicalBERT as the transformer backbone and stores its predicted labeled spans in doc.spans["sc"].

Model Details

Base model: emilyalsentzer/Bio_ClinicalBERT
Framework: spaCy with spacy-transformers
Task: span categorization
Language: English

Labels

The label set is defined in labels.json included in the model directory.
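
As a minimal sketch, the label file could be read and used to sanity-check predictions. This assumes labels.json holds a flat JSON array of label strings, which this card does not confirm; the helper names load_labels and unexpected_labels are hypothetical.

```python
import json
from pathlib import Path


def load_labels(model_dir):
    """Read labels.json from model_dir.

    Assumption: the file is a flat JSON array of strings. Adjust the
    parsing if the actual file uses a different layout.
    """
    path = Path(model_dir) / "labels.json"
    with path.open(encoding="utf-8") as f:
        labels = json.load(f)
    if not isinstance(labels, list) or not all(isinstance(x, str) for x in labels):
        raise ValueError(f"{path} is not a flat list of label strings")
    return labels


def unexpected_labels(predicted, known):
    """Return, sorted, any predicted labels that are missing from the known set."""
    return sorted(set(predicted) - set(known))
```

With a processed doc, unexpected_labels([s.label_ for s in doc.spans["sc"]], load_labels("path_to_model_dir")) should come back empty if the pipeline and the label file agree; the directory path here is a placeholder.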

Input:

Raw clinical text string

Output:

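Predicted spans in doc.spans["sc"], each with start and end offsets and a label. Confidence scores are stored on the span group in doc.spans["sc"].attrs["scores"], in the same order as the spans.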

Usage

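import spacy

# Requires spaCy and spacy-transformers to be installed.
nlp = spacy.load("path_to_model_best")
doc = nlp("Patient shows cognitive decline and memory impairment")

# Span objects have no .score attribute; the span categorizer stores
# scores on the span group, aligned with the spans.
spans = doc.spans["sc"]
for span, score in zip(spans, spans.attrs["scores"]):
    print(span.text, span.label_, float(score))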
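
For downstream use, the output fields described under Output can be collected into plain dictionaries. The sketch below is one possible shape, not the authors' format: the function name spans_to_records and the record keys are hypothetical, and it assumes character offsets are the offsets wanted. It reads scores from the span group's attrs["scores"] entry, which is where spaCy's span categorizer stores them.

```python
def spans_to_records(doc, spans_key="sc"):
    """Convert a span group into a list of JSON-friendly dictionaries.

    Scores are read from the group's attrs["scores"]; if that entry is
    absent, each record gets a score of None.
    """
    group = doc.spans[spans_key]
    scores = group.attrs.get("scores")
    if scores is None:
        scores = [None] * len(group)
    records = []
    for span, score in zip(group, scores):
        records.append(
            {
                "start": span.start_char,  # character offset, inclusive
                "end": span.end_char,      # character offset, exclusive
                "label": span.label_,
                "text": span.text,
                "score": None if score is None else float(score),
            }
        )
    return records
```

Calling spans_to_records(nlp(text)) on a loaded pipeline would give a list that can be passed to json.dumps directly.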

Files

model-best: spaCy pipeline
config.cfg: training configuration
meta.json: pipeline metadata
labels.json: list of span labels

Notes

This is a spaCy model and should be loaded with spacy.load, not with transformers.
Performance depends on span alignment and threshold tuning.
Intended for research use on clinical text.
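
To get a feel for threshold tuning, one rough approach is to count how many predicted spans would survive at several candidate cutoffs. The sketch below is only an illustration: count_kept is a hypothetical helper, and the scores are assumed to come from doc.spans["sc"].attrs["scores"] collected over a development set. Because the span categorizer already drops spans below its own configured threshold at prediction time, filtering afterwards can only make the output stricter, never looser.

```python
def count_kept(scores, thresholds):
    """Map each candidate threshold to the number of scores at or above it."""
    scores = [float(s) for s in scores]
    return {t: sum(1 for s in scores if s >= t) for t in thresholds}
```

Counts alone do not say which threshold is best; they would need to be paired with precision and recall against gold annotations.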

Citation

@article{khanna2024cognitive,
  title={Toward Automated Cognitive Assessment in Parkinson’s Disease Using Pretrained Language Models},
  author={Khanna, Varada and Bhatt, Nilay and Shin, Ikgyu and Rosso, Mattia and Tinaz, Sule and Ren, Yang and Xu, Hua and Keloth, Vipina K},
  journal={arXiv preprint arXiv:2511.08806},
  year={2025}
}