ClinVar Evidence/Conclusion/Description Classifier

A fine-tuned bert-base-uncased model that classifies each sentence in a ClinVar submission comment into one of three categories: description, evidence, or conclusion.

Model Description

This model is used to preprocess ClinVar submission comments by identifying and filtering out sentences that contain conclusions and descriptions, retaining only sentences that convey evidence for downstream analysis.

Base model: bert-base-uncased
Task: Single-label sequence classification (3 classes)

Labels

ID  Label    Meaning
0   LABEL_0  Evidence
1   LABEL_1  Description
2   LABEL_2  Conclusion
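
If predictions arrive as the generic label strings listed above (for example from a text-classification pipeline, which typically reports a "label" and a "score" per input), they can be translated into the category names with a small lookup. The following is a minimal sketch: the helper name readable_label and the sample result are illustrative assumptions, not part of the model repository.

```python
# Mapping taken from the Labels table above.
LABEL_NAMES = {
    "LABEL_0": "evidence",
    "LABEL_1": "description",
    "LABEL_2": "conclusion",
}


def readable_label(result):
    """Return the category name for one classification result.

    `result` is assumed to be a dict with a "label" key holding a generic
    label string such as "LABEL_0".
    """
    return LABEL_NAMES[result["label"]]


# Hand-written result for illustration only (not real model output).
example = {"label": "LABEL_2", "score": 0.5}
print(readable_label(example))
```

An unknown label string raises a KeyError, which is deliberate here: it surfaces a mismatch between the checkpoint's labels and the table rather than silently mislabeling a sentence.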

Usage

from transformers import BertForSequenceClassification, BertTokenizer
import torch

# Load the fine-tuned classifier and its tokenizer from the Hugging Face Hub.
model_id = "weijiang99/clinvar-evidence-conclusion-classifier"
model = BertForSequenceClassification.from_pretrained(model_id)
tokenizer = BertTokenizer.from_pretrained(model_id)

model.eval()  # disable dropout for inference
sentences = [
    "The variant was observed in 3 affected family members.",
    "This variant is likely pathogenic.",
]
# Pad to the longest sentence in the batch and truncate overly long inputs.
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)
    # Pick the highest-scoring class for each sentence.
    predictions = torch.argmax(outputs.logits, dim=1)

# Map class IDs to the categories listed under Labels.
label_map = {0: "evidence", 1: "description", 2: "conclusion"}
for sentence, pred in zip(sentences, predictions):
    print(f"{label_map[pred.item()]}: {sentence}")

Intended Use

Preprocessing ClinVar variant interpretation comments to extract evidence sentences for downstream natural language processing tasks.
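
The filtering step itself can be sketched as follows, assuming the label IDs from the Labels section (0 = evidence) and a list of predicted IDs such as predictions.tolist() from the Usage snippet. The function name keep_evidence is a hypothetical helper, and the predicted IDs below are written by hand for illustration rather than produced by the model.

```python
EVIDENCE_ID = 0  # label ID for "evidence" according to the Labels section


def keep_evidence(sentences, predicted_ids):
    """Return only the sentences whose predicted label ID is evidence."""
    if len(sentences) != len(predicted_ids):
        raise ValueError("sentences and predicted_ids must have the same length")
    return [s for s, p in zip(sentences, predicted_ids) if int(p) == EVIDENCE_ID]


sentences = [
    "The variant was observed in 3 affected family members.",
    "This variant is likely pathogenic.",
]
# Hand-written IDs for illustration; in practice pass the model's predictions.
predicted_ids = [0, 2]
print(keep_evidence(sentences, predicted_ids))
```

Keeping the filter separate from model inference makes it straightforward to test without downloading the checkpoint and to swap in a different retention rule if description sentences are also needed downstream.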
