ClinVar Evidence/Conclusion/Description Classifier
A fine-tuned bert-base-uncased model for classifying sentences in ClinVar submission comments into three categories: description, evidence, or conclusion.
Model Description
This model preprocesses ClinVar submission comments. It labels each sentence so that conclusion and description sentences can be filtered out, leaving only the sentences that convey evidence for downstream analysis.
Base model: bert-base-uncased
Task: Single-label sequence classification (3 classes)
Labels
| ID | Label | Meaning |
|---|---|---|
| 0 | LABEL_0 | Evidence |
| 1 | LABEL_1 | Description |
| 2 | LABEL_2 | Conclusion |
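If predictions are reported as raw label strings such as `LABEL_0`, the table above can be applied with a small lookup. This is a minimal sketch: `LABEL_NAMES` and `readable_label` are illustrative names and are not shipped with the model.

```python
# Hypothetical helper: these names are illustrative, not part of the model repository.
# The mapping itself is taken from the Labels table.
LABEL_NAMES = {
    "LABEL_0": "evidence",
    "LABEL_1": "description",
    "LABEL_2": "conclusion",
}


def readable_label(raw_label: str) -> str:
    """Translate a raw label string such as "LABEL_0" into its category name."""
    try:
        return LABEL_NAMES[raw_label]
    except KeyError:
        # Fail loudly on anything outside the three documented classes.
        raise ValueError(f"unexpected label: {raw_label!r}") from None


print(readable_label("LABEL_0"))
```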
Usage
from transformers import BertForSequenceClassification, BertTokenizer
import torch

# Load the fine-tuned classifier and its tokenizer from the Hugging Face Hub
model = BertForSequenceClassification.from_pretrained("weijiang99/clinvar-evidence-conclusion-classifier")
tokenizer = BertTokenizer.from_pretrained("weijiang99/clinvar-evidence-conclusion-classifier")
model.eval()  # inference mode: disables dropout

sentences = [
    "The variant was observed in 3 affected family members.",
    "This variant is likely pathogenic.",
]
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():  # gradients are not needed for inference
    outputs = model(**inputs)
    predictions = torch.argmax(outputs.logits, dim=1)

# Map class IDs to category names (see the Labels table)
label_map = {0: "evidence", 1: "description", 2: "conclusion"}
for sentence, pred in zip(sentences, predictions):
    print(f"{label_map[pred.item()]}: {sentence}")
Intended Use
Preprocessing ClinVar variant interpretation comments to extract evidence sentences for downstream natural language processing tasks.
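The filtering step described above could be sketched as follows. Everything here is an assumption made for illustration: the regex sentence splitter is deliberately naive (it will mis-split abbreviations such as "et al."), and `split_sentences`, `keep_evidence`, and `stub_predict` are hypothetical names. In real use, `predict_ids` would wrap the tokenizer and model from the Usage section and return `predictions.tolist()`; the stub stands in for the model so the sketch runs offline.

```python
import re
from typing import Callable, List, Sequence

EVIDENCE_ID = 0  # per the Labels table: 0 = evidence


def split_sentences(comment: str) -> List[str]:
    """Naive splitter (an assumption): break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", comment.strip())
    return [p for p in parts if p]


def keep_evidence(
    comment: str,
    predict_ids: Callable[[Sequence[str]], Sequence[int]],
) -> List[str]:
    """Return only the sentences whose predicted class ID is the evidence class."""
    sentences = split_sentences(comment)
    if not sentences:
        return []
    ids = predict_ids(sentences)
    return [s for s, i in zip(sentences, ids) if i == EVIDENCE_ID]


def stub_predict(sentences: Sequence[str]) -> List[int]:
    # Stand-in for the real classifier, NOT the model's behavior:
    # marks sentences mentioning "pathogenic" as conclusions (2), the rest as evidence (0).
    return [2 if "pathogenic" in s.lower() else 0 for s in sentences]


comment = (
    "The variant was observed in 3 affected family members. "
    "This variant is likely pathogenic."
)
print(keep_evidence(comment, stub_predict))
```

Passing the classifier in as a callable keeps the filtering logic independent of how the model is loaded or batched.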