# RitanshuPatel/triageai-mistral

LoRA fine-tuned Mistral-7B for clinical note extraction: it converts messy doctor notes into structured JSON with a consistent schema.
## The Problem We Are Solving
Every day, doctors and clinicians write hundreds of clinical notes that look like this:
```
pt 67M c/o CP x2hr rad to L arm, diaphoretic, SOB+, PMH DM2 HTN,
smoker 20pk/yr, meds metformin 500 BID lisinopril 10 QD atorvastatin
40 QHS, EKG ST elev V2-V4, trop pnd, A: r/o STEMI, allerg PCN hives
```
This is the reality of clinical documentation: abbreviated, unstructured, and nearly unreadable to anyone outside the medical field.
The core problems this creates:
- **Ambiguous abbreviations** – `CP` can mean chest pain, cerebral palsy, or care plan depending on context. Base LLMs frequently get these wrong.
- **Inconsistent JSON output** – When you ask a base LLM to extract structured data from clinical notes, it returns a different JSON structure every time. Sometimes `conditions` is a list of strings, sometimes a list of objects, sometimes nested. This breaks downstream applications.
- **Missing critical fields** – Base models skip medications, miss vitals, or hallucinate dosages that were never in the original note.
- **Medical abbreviation blindness** – Standard LLMs are not trained heavily on clinical shorthand. `BID`, `QD`, `TID`, `PRN`, `c/o`, `PMH`, `r/o` all require specialized understanding.
- **Time cost** – Manually transcribing and structuring one note takes a clinician 5-10 minutes. Multiply that by 50 patients a day and you have a significant administrative burden that takes time away from patient care.
The goal of this model is simple: paste any messy clinical note, get back perfectly structured JSON every single time.
## Why This Model Is Outstanding

### 1. Trained on Real De-identified Clinical Notes
This model was fine-tuned on MTSamples, one of the most widely used datasets of real, de-identified medical transcriptions, covering 40+ medical specialties including cardiology, orthopedics, neurology, psychiatry, and general surgery. The training data reflects real-world clinical language, not synthetic examples.
### 2. Consistent JSON Structure Every Time
The biggest failure mode of base LLMs for clinical extraction is inconsistent output structure. After fine-tuning, this model returns the exact same JSON schema on every call:
```json
{
  "patient": {"age": 67, "gender": "male"},
  "chief_complaint": "chest pain",
  "conditions": ["Hypertension", "Type 2 Diabetes"],
  "medications": [{"name": "metformin", "dose": "500mg", "frequency": "twice daily"}],
  "vitals": {"bp": "158/94", "hr": "102", "rr": null, "o2_sat": "94%"},
  "allergies": ["PCN"],
  "plan": []
}
```
No objects where strings are expected. No nested arrays. No missing fields. This consistency is what makes it production-ready for downstream applications.
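Because the schema is fixed, downstream code can enforce it with a lightweight check before anything touches a database. The `validate_note` helper below is an illustrative sketch built from the field names in the example above, not a utility shipped with the model:

```python
import json

# Expected top-level fields, matching the schema shown above
EXPECTED_KEYS = {
    "patient", "chief_complaint", "conditions",
    "medications", "vitals", "allergies", "plan",
}

def validate_note(raw: str) -> dict:
    """Parse model output and check it matches the fixed schema."""
    data = json.loads(raw)
    missing = EXPECTED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    # Conditions and allergies must be plain strings, never objects
    if not all(isinstance(c, str) for c in data["conditions"]):
        raise ValueError("conditions must be plain strings")
    if not all(isinstance(a, str) for a in data["allergies"]):
        raise ValueError("allergies must be plain strings")
    return data
```

A check like this makes schema drift fail loudly at the API boundary instead of silently corrupting downstream records.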
### 3. Medical Abbreviation Mastery
Fine-tuning on clinical notes taught the model to correctly expand and interpret abbreviations that trip up base models:
| Abbreviation | Correct Expansion |
|---|---|
| `BID` | twice daily |
| `c/o` | complains of |
| `PMH` | past medical history |
| `r/o` | rule out |
| `HTN` | Hypertension |
| `DM2` | Type 2 Diabetes |
| `SOB` | shortness of breath |
| `STAT` | immediately |
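Downstream code can mirror a subset of these expansions as a plain lookup table, e.g. for spot-checking model output against a rule-based baseline. The `expand` helper below is purely illustrative (it only handles exact, standalone tokens) and is not part of the model:

```python
import re

# Expansions from the table above (illustrative subset)
ABBREVIATIONS = {
    "BID": "twice daily",
    "QD": "once daily",
    "c/o": "complains of",
    "PMH": "past medical history",
    "r/o": "rule out",
    "HTN": "hypertension",
    "DM2": "type 2 diabetes",
    "SOB": "shortness of breath",
}

def expand(text: str) -> str:
    """Replace known standalone abbreviations with their expansions."""
    # Longer keys first so multi-character tokens like "c/o" win ties
    for abbr in sorted(ABBREVIATIONS, key=len, reverse=True):
        pattern = rf"(?<![\w/]){re.escape(abbr)}(?![\w/])"
        text = re.sub(pattern, ABBREVIATIONS[abbr], text)
    return text
```

A table like this cannot resolve context-dependent shorthand such as `CP`, which is exactly why the fine-tuned model is needed for the ambiguous cases.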
### 4. Multi-Specialty Coverage
Training examples were carefully selected across 10 medical specialties to ensure the model generalizes well beyond a single domain:
- Cardiovascular / Cardiology
- Orthopedics / Surgery
- Neurology
- Endocrinology
- Pulmonology
- Psychiatry
- Gastroenterology
- Urology
- General Medicine
- Emergency Medicine
### 5. Efficient 4-bit Quantization
Trained using QLoRA (4-bit quantization) via bitsandbytes, the model runs efficiently without requiring expensive GPU hardware. The LoRA adapter is only 13.6 MB while the base model remains unchanged, making it easy to deploy anywhere Mistral-7B runs.
### 6. Genuine Learning, Not Just Memorization

Training loss dropped steadily from 1.74 to 1.26 over 3 epochs, suggesting the model learned the extraction task rather than memorizing individual training examples.
## Model Details
| Property | Value |
|---|---|
| Base Model | mistralai/Mistral-7B-Instruct-v0.3 |
| Fine-tuning Method | LoRA (Low-Rank Adaptation) |
| Quantization | QLoRA 4-bit (BitsAndBytes) |
| Adapter Size | 13.6 MB |
| Training Platform | Google Colab T4 GPU (free tier) |
| Trainable Parameters | ~0.5% of total parameters |
| Task | Clinical notes β Structured JSON |
| Language | English |
| License | MIT |
## Training Data
| Property | Value |
|---|---|
| Source | MTSamples (real de-identified clinical transcriptions) |
| Training Examples | 49 carefully curated examples |
| Specialties | 10 medical specialties |
| Format | JSONL instruction format (system / user / assistant) |
| Label Generation | llama-3.3-70b-versatile (used to produce high-quality labels) |
| Max Sequence Length | 1024 tokens |
## Training Configuration
```python
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=16,                          # LoRA rank
    lora_alpha=32,                 # LoRA scaling factor
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    num_train_epochs=3,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,  # effective batch size of 8
    optim="paged_adamw_8bit",
    learning_rate=2e-4,
    fp16=False,
    bf16=False,
)
```
Training Loss Curve:
- Epoch 1: ~1.74
- Epoch 2: ~1.45
- Epoch 3: ~1.26
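Expressed as a relative improvement, that curve amounts to roughly a 28% reduction in training loss:

```python
losses = [1.74, 1.45, 1.26]  # approximate training loss at the end of each epoch

# Relative reduction from the first to the last epoch
drop = (losses[0] - losses[-1]) / losses[0]
print(f"{drop:.1%}")
```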
## How to Use

### Load the adapter
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

# Load base model with 4-bit quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16
)

base_model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.3",
    quantization_config=bnb_config,
    device_map="auto"
)

tokenizer = AutoTokenizer.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.3"
)

# Load the LoRA adapter on top of the quantized base model
model = PeftModel.from_pretrained(
    base_model,
    "RitanshuPatel/triageai-mistral"
)
```
### Run inference
```python
SYSTEM_PROMPT = """You are a clinical notes parser.
Extract structured JSON from medical notes.
Return ONLY valid JSON with these fields:
{
  "patient": {"age": null, "gender": null},
  "chief_complaint": "",
  "conditions": [],
  "medications": [{"name": "", "dose": "", "frequency": ""}],
  "vitals": {"bp": null, "hr": null, "rr": null, "o2_sat": null},
  "allergies": [],
  "plan": []
}
Conditions and allergies must be plain strings. Never hallucinate."""

note = "pt 67M c/o CP x2hr, PMH DM2 HTN, metformin 500 BID lisinopril 10 QD, allerg PCN"

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": note}
]

inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=500,
    temperature=0.1,
    do_sample=True
)

# Decode only the newly generated tokens, not the echoed prompt
result = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(result)
```
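Generated text can still carry stray tokens before or after the JSON object, so a defensive parse that isolates the first balanced `{...}` span is a sensible post-processing step. The `extract_json` helper below is an illustrative sketch, not part of the model (it assumes no braces appear inside string values, which holds for this schema):

```python
import json

def extract_json(text: str) -> dict:
    """Find and parse the first balanced JSON object in generated text."""
    start = text.find("{")
    if start == -1:
        raise ValueError("no JSON object found")
    depth = 0
    for i, ch in enumerate(text[start:], start):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                # Parse only the balanced {...} span
                return json.loads(text[start : i + 1])
    raise ValueError("unbalanced JSON object")
```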
## Performance Comparison

### Base Mistral-7B output (inconsistent)
```json
{
  "patient_info": {"age": "67 years old", "sex": "male"},
  "presenting_complaint": "chest pain for 2 hours",
  "past_history": [{"condition": "DM2"}, {"condition": "HTN"}],
  "current_medications": "metformin 500mg BID, lisinopril 10mg QD",
  "drug_allergies": [{"allergy": "PCN", "reaction": "unknown"}]
}
```
Wrong field names, nested objects where strings are expected, and age returned as a string rather than a number.
### Fine-tuned output (consistent)
```json
{
  "patient": {"age": 67, "gender": "male"},
  "chief_complaint": "chest pain",
  "conditions": ["Hypertension", "Type 2 Diabetes"],
  "medications": [
    {"name": "metformin", "dose": "500mg", "frequency": "twice daily"},
    {"name": "lisinopril", "dose": "10mg", "frequency": "once daily"}
  ],
  "vitals": {"bp": null, "hr": null, "rr": null, "o2_sat": null},
  "allergies": ["PCN"],
  "plan": []
}
```
Correct schema, strings where expected, age as a number, and abbreviations expanded.
## Intended Use & Medical Disclaimer
Intended for:
- Clinical NLP research and experimentation
- Building medical documentation tools
- Educational purposes
- Portfolio and demonstration projects
Not intended for:
- Direct clinical decision making
- Replacing qualified medical professionals
- Real patient care without human review
⚠️ **Medical Disclaimer:** This model is for informational and research purposes only. It is not a substitute for professional medical advice, diagnosis, or treatment. All model outputs must be reviewed by qualified healthcare professionals before any clinical use.
## Part of the TriageAI Project

This model is the fine-tuned component of TriageAI, a full-stack clinical notes summarization agent.
| Resource | Link |
|---|---|
| Live Demo | triageai-ritanshupatel.vercel.app |
| GitHub | github.com/RitanshuPatelMMR/Triage-ai |
| HuggingFace Space | RitanshuPatel/triageai-backend |
Full TriageAI Stack:
- LangGraph 5-node autonomous agent
- RAG with FAISS over 82,000+ medical knowledge vectors
- Real-time SSE streaming
- Groq Vision for handwritten note OCR
- OpenFDA drug interaction checking
- React + FastAPI frontend
Built with ❤️ using a free Google Colab T4 GPU, HuggingFace PEFT, and the MTSamples dataset.