Model Card for smollm3-discharge-sentences-sft
This model is a fine-tuned version of HuggingFaceTB/SmolLM3-3B-Base for clinical sentence classification. It has been trained using TRL.
Model Description
This model classifies individual sentences from hospital discharge summaries into categories of follow-up actions. Given a sentence, it outputs a JSON object indicating which action categories apply.
Categories
- `instructions`: Case-specific instructions for the patient
- `appointment`: Appointment-related follow-up
- `medication`: Medication-related follow-up
- `lab`: Lab-related follow-up
- `procedure`: Procedure-related follow-up
- `imaging`: Imaging-related follow-up
- `other`: Other helpful contextual information
- `none`: Not an action item
Quick start
```python
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="chrisvoncsefalvay/smollm3-discharge-sentences-sft",
    device="cuda",
)

sentence = "The patient was to follow up with Dr. Greene in three to four weeks."

output = generator(
    [
        {"role": "system", "content": "You are a clinical action item classifier..."},
        {"role": "user", "content": f"Classify this sentence:\n\n{sentence}"},
    ],
    max_new_tokens=64,
    return_full_text=False,
)[0]
print(output["generated_text"])
# Output: {"categories": ["appointment"]}
```
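Since JSON validity is high but not perfect (97.5% on the test set), downstream code should parse the generated text defensively. A minimal sketch — the helper name and fallback behaviour are my own, not part of the model card:

```python
import json

# Allowed labels, taken from the category list above.
VALID_CATEGORIES = {
    "instructions", "appointment", "medication", "lab",
    "procedure", "imaging", "other", "none",
}

def parse_categories(generated_text: str) -> list[str]:
    """Parse the model's JSON output, keeping only known labels.

    Returns an empty list if the output is not valid JSON or has an
    unexpected shape (roughly 2.5% of outputs in evaluation).
    """
    try:
        obj = json.loads(generated_text.strip())
    except json.JSONDecodeError:
        return []
    cats = obj.get("categories", []) if isinstance(obj, dict) else []
    return [c for c in cats if c in VALID_CATEGORIES]

print(parse_categories('{"categories": ["appointment"]}'))  # ['appointment']
print(parse_categories("not json"))                         # []
```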
Evaluation Results
Evaluated on 5,313 test samples from chrisvoncsefalvay/smol-discharge-sentences-sft.
Overall Metrics
| Metric | Score |
|---|---|
| JSON Validity | 97.5% |
| Exact Match Accuracy | 76.3% |
| Micro F1 | 0.631 |
| Macro F1 | 0.568 |
| Micro Precision | 0.734 |
| Micro Recall | 0.553 |
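Micro F1 is the harmonic mean of micro precision and micro recall, which can be checked directly against the table above:

```python
# Micro F1 = harmonic mean of micro precision and micro recall.
micro_p, micro_r = 0.734, 0.553
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
print(round(micro_f1, 3))  # 0.631
```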
Per-Category Performance
| Category | Precision | Recall | F1 | Support |
|---|---|---|---|---|
| instructions | 0.931 | 0.460 | 0.616 | 1153 |
| appointment | 0.905 | 0.682 | 0.778 | 660 |
| medication | 0.354 | 0.707 | 0.471 | 239 |
| lab | 0.740 | 0.689 | 0.714 | 132 |
| procedure | 0.857 | 0.343 | 0.490 | 35 |
| imaging | 0.667 | 0.811 | 0.732 | 37 |
| other | 0.294 | 0.127 | 0.177 | 79 |
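Macro F1 is the unweighted mean of the per-category F1 scores, taken over the seven categories broken out in the table (`none` is not listed separately):

```python
# Per-category F1 scores from the table above.
per_category_f1 = {
    "instructions": 0.616, "appointment": 0.778, "medication": 0.471,
    "lab": 0.714, "procedure": 0.490, "imaging": 0.732, "other": 0.177,
}
macro_f1 = sum(per_category_f1.values()) / len(per_category_f1)
print(round(macro_f1, 3))  # 0.568
```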
Key Findings
Strengths:
- High JSON validity (97.5%): reliable structured output
- Strong precision on `instructions` (93.1%) and `appointment` (90.5%)
- Best F1 on `appointment` (0.778) and `imaging` (0.732)

Limitations:
- Lower recall on `instructions` (46.0%) and `procedure` (34.3%)
- Weak performance on the `other` category (F1 = 0.177)
Training procedure
This model was trained with SFT on the chrisvoncsefalvay/smol-discharge-sentences-sft dataset.
- Training samples: 25,782
- Validation samples: 5,142
- Epochs: 3
- Effective batch size: 16
- Learning rate: 5e-5
- LoRA rank: 64
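The hyperparameters above can be expressed as a TRL `SFTConfig` plus a peft `LoraConfig`. This is a reconstruction, not the exact training script: the per-device batch size / gradient accumulation split (4 × 4 = 16 effective), the LoRA alpha, and the target modules are assumptions not documented in the card.

```python
from peft import LoraConfig
from trl import SFTConfig

# LoRA adapter configuration. The rank comes from the model card;
# alpha and target modules are assumptions.
peft_config = LoraConfig(
    r=64,
    lora_alpha=128,          # assumption: commonly set to 2 * r
    target_modules="all-linear",
    task_type="CAUSAL_LM",
)

# SFT hyperparameters from the model card. The 16-sample effective
# batch is assumed to be 4 per device x 4 accumulation steps.
training_args = SFTConfig(
    output_dir="smollm3-discharge-sentences-sft",
    num_train_epochs=3,
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
)
```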
Framework versions
- TRL: 0.25.1
- Transformers: 4.57.3
- PyTorch: 2.9.1
- Datasets: 4.4.1
- Tokenizers: 0.22.1
Intended Use
This model is intended for research purposes in clinical NLP, specifically for:
- Identifying follow-up action items in discharge summaries
- Structured extraction of patient instructions
- Clinical document analysis pipelines
Limitations
- Trained on MIMIC-III data (US hospital system, English only)
- May not generalize to other clinical contexts or languages
- Should not be used for clinical decision-making without human review
Citations
Cite TRL as:
```bibtex
@misc{vonwerra2022trl,
    title        = {{TRL: Transformer Reinforcement Learning}},
    author       = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\'e}dec},
    year         = 2020,
    journal      = {GitHub repository},
    publisher    = {GitHub},
    howpublished = {\url{https://github.com/huggingface/trl}}
}
```