RitanshuPatel/triageai-mistral

LoRA fine-tuned Mistral-7B for clinical note extraction: converts messy doctor notes into structured JSON with a consistent schema.


The Problem We Are Solving

Every day, doctors and clinicians write hundreds of clinical notes that look like this:

pt 67M c/o CP x2hr rad to L arm, diaphoretic, SOB+, PMH DM2 HTN, 
smoker 20pk/yr, meds metformin 500 BID lisinopril 10 QD atorvastatin 
40 QHS, EKG ST elev V2-V4, trop pnd, A: r/o STEMI, allerg PCN hives

This is the reality of clinical documentation: abbreviated, unstructured, and nearly unreadable to anyone outside the medical field.

The core problems this creates:

  • Ambiguous abbreviations: CP can mean chest pain, cerebral palsy, or care plan depending on context. Base LLMs frequently get these wrong.

  • Inconsistent JSON output: ask a base LLM to extract structured data from clinical notes and it returns a different JSON structure every time. Sometimes conditions is a list of strings, sometimes a list of objects, sometimes nested. This breaks downstream applications.

  • Missing critical fields: base models skip medications, miss vitals, or hallucinate dosages that were never in the original note.

  • Medical abbreviation blindness: standard LLMs are not trained heavily on clinical shorthand. BID, QD, TID, PRN, c/o, PMH, r/o all require specialized understanding.

  • Time cost: manually transcribing and structuring one note takes a clinician 5-10 minutes. Multiply that by 50 patients a day and you have a significant administrative burden that takes time away from patient care.

The goal of this model is simple: paste in a messy clinical note, get back the same structured JSON schema every time.


What Makes This Model Stand Out

1. Trained on Real De-identified Clinical Notes

This model was fine-tuned on MTSamples, one of the most widely used datasets of real de-identified medical transcriptions, covering 40+ medical specialties including cardiology, orthopedics, neurology, psychiatry, and general surgery. The training data reflects real-world clinical language, not synthetic examples.

2. Consistent JSON Structure Every Time

The biggest failure mode of base LLMs for clinical extraction is inconsistent output structure. After fine-tuning, this model returns the exact same JSON schema on every call:

{
  "patient": {"age": 67, "gender": "male"},
  "chief_complaint": "chest pain",
  "conditions": ["Hypertension", "Type 2 Diabetes"],
  "medications": [{"name": "metformin", "dose": "500mg", "frequency": "twice daily"}],
  "vitals": {"bp": "158/94", "hr": "102", "rr": null, "o2_sat": "94%"},
  "allergies": ["PCN"],
  "plan": []
}

No objects where strings are expected. No unexpected nesting. No missing fields. This consistency is what makes it production-ready for downstream applications.
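Because the schema is fixed, downstream code can verify it mechanically before ingesting a record. A minimal sketch of such a check (the `validate_record` helper is illustrative and not shipped with the model):

```python
import json

# Expected top-level schema: field name -> required type
# (illustrative helper, not part of the model)
EXPECTED_FIELDS = {
    "patient": dict,
    "chief_complaint": str,
    "conditions": list,
    "medications": list,
    "vitals": dict,
    "allergies": list,
    "plan": list,
}

def validate_record(record: dict) -> list:
    """Return a list of schema violations (empty list means valid)."""
    errors = []
    for field, expected_type in EXPECTED_FIELDS.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            errors.append(f"wrong type for {field}: {type(record[field]).__name__}")
    # conditions and allergies must be plain strings, never objects
    for field in ("conditions", "allergies"):
        for item in record.get(field, []):
            if not isinstance(item, str):
                errors.append(f"{field} entries must be strings")
    return errors

output = json.loads('''{
  "patient": {"age": 67, "gender": "male"},
  "chief_complaint": "chest pain",
  "conditions": ["Hypertension", "Type 2 Diabetes"],
  "medications": [{"name": "metformin", "dose": "500mg", "frequency": "twice daily"}],
  "vitals": {"bp": "158/94", "hr": "102", "rr": null, "o2_sat": "94%"},
  "allergies": ["PCN"],
  "plan": []
}''')
print(validate_record(output))  # []
```

A non-empty result flags records that need human review instead of silently breaking downstream code.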

3. Medical Abbreviation Mastery

Fine-tuning on clinical notes taught the model to correctly expand and interpret abbreviations that trip up base models:

Abbreviation | Correct Expansion
------------ | -----------------
BID          | twice daily
c/o          | complains of
PMH          | past medical history
r/o          | rule out
HTN          | Hypertension
DM2          | Type 2 Diabetes
SOB          | shortness of breath
STAT         | immediately
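For comparison, the table above can be expressed as a plain lookup. This is an illustrative pre-processing sketch; the model itself learns these mappings from fine-tuning rather than from a table:

```python
import re

# Expansion table from the list above (illustrative helper only)
ABBREVIATIONS = {
    "BID": "twice daily",
    "c/o": "complains of",
    "PMH": "past medical history",
    "r/o": "rule out",
    "HTN": "Hypertension",
    "DM2": "Type 2 Diabetes",
    "SOB": "shortness of breath",
    "STAT": "immediately",
}

def expand_abbreviations(note: str) -> str:
    """Naively expand known clinical abbreviations in a note."""
    for abbr, full in ABBREVIATIONS.items():
        note = re.sub(rf"\b{re.escape(abbr)}\b", full, note)
    return note

print(expand_abbreviations("pt c/o SOB, PMH HTN DM2"))
# pt complains of shortness of breath, past medical history Hypertension Type 2 Diabetes
```

A static table like this cannot disambiguate context-dependent shorthand (the CP example above), which is exactly why a fine-tuned model is used instead.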

4. Multi-Specialty Coverage

Training examples were carefully selected across 10 medical specialties to ensure the model generalizes well beyond a single domain:

  • Cardiovascular / Cardiology
  • Orthopedics / Surgery
  • Neurology
  • Endocrinology
  • Pulmonology
  • Psychiatry
  • Gastroenterology
  • Urology
  • General Medicine
  • Emergency Medicine

5. Efficient 4-bit Quantization

Trained using QLoRA (4-bit quantization) via bitsandbytes, the model runs efficiently without requiring expensive GPU hardware. The LoRA adapter is only 13.6 MB while the base model remains unchanged, making it easy to deploy anywhere Mistral-7B runs.

6. Genuine Learning β€” Not Just Memorization

Training loss dropped steadily from 1.74 to 1.26 over 3 epochs. The gradual decline, rather than a collapse toward zero, suggests the model learned the extraction pattern rather than memorizing its 49 training examples.


Model Details

Property             | Value
-------------------- | -----
Base Model           | mistralai/Mistral-7B-Instruct-v0.3
Fine-tuning Method   | LoRA (Low-Rank Adaptation)
Quantization         | QLoRA 4-bit (bitsandbytes)
Adapter Size         | 13.6 MB
Training Platform    | Google Colab T4 GPU (free tier)
Trainable Parameters | ~0.5% of total parameters
Task                 | Clinical notes → structured JSON
Language             | English
License              | MIT

Training Data

Property            | Value
------------------- | -----
Source              | MTSamples (real de-identified clinical transcriptions)
Training Examples   | 49 curated examples
Specialties         | 10 medical specialties
Format              | JSONL instruction format (system / user / assistant)
Label Generation    | llama-3.3-70b-versatile (reference labels)
Max Sequence Length | 1024 tokens
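One training record in that JSONL format might look like the following sketch. The exact field layout of the dataset is an assumption here; the field names follow the common chat-messages convention:

```python
import json

# Hypothetical training record in the chat-messages JSONL convention;
# the actual dataset's field layout may differ.
record = {
    "messages": [
        {"role": "system",
         "content": "You are a clinical notes parser. Extract structured JSON from medical notes."},
        {"role": "user",
         "content": "pt 67M c/o CP x2hr, PMH DM2 HTN, allerg PCN"},
        {"role": "assistant",
         "content": json.dumps({
             "patient": {"age": 67, "gender": "male"},
             "chief_complaint": "chest pain",
             "conditions": ["Hypertension", "Type 2 Diabetes"],
             "medications": [],
             "vitals": {"bp": None, "hr": None, "rr": None, "o2_sat": None},
             "allergies": ["PCN"],
             "plan": [],
         })},
    ]
}

# Each training example is one JSON object per line in the .jsonl file
line = json.dumps(record)
print(line[:80] + "...")
```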

Training Configuration

from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=16,                          # LoRA rank
    lora_alpha=32,                 # LoRA scaling factor
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM"
)

training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,  # effective batch size of 8
    optim="paged_adamw_8bit",
    learning_rate=2e-4,
    fp16=False,                     # precision handled by the 4-bit quantized base
    bf16=False,                     # T4 GPUs do not support bf16
)
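As a sanity check, the adapter size follows directly from this config. With r=16 applied to q_proj and v_proj across Mistral-7B's 32 layers, the LoRA parameter count works out to roughly the published 13.6 MB at fp16. A back-of-the-envelope sketch (the layer dimensions are Mistral-7B architecture facts, not values read from the config above):

```python
# Mistral-7B architecture constants (grouped-query attention)
hidden_size = 4096  # q_proj: 4096 -> 4096
kv_dim = 1024       # v_proj: 4096 -> 1024 (8 KV heads x 128 head_dim)
num_layers = 32
r = 16              # LoRA rank from the config above

# Each LoRA-adapted projection adds two low-rank matrices:
# A (r x in_features) and B (out_features x r)
q_proj_params = r * (hidden_size + hidden_size)
v_proj_params = r * (hidden_size + kv_dim)
total_params = num_layers * (q_proj_params + v_proj_params)

adapter_mb = total_params * 2 / 1e6  # fp16 = 2 bytes per parameter
print(total_params, round(adapter_mb, 1))  # 6815744 13.6
```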

Training Loss Curve:

  • Epoch 1: ~1.74
  • Epoch 2: ~1.45
  • Epoch 3: ~1.26

How to Use

Load the adapter

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

# Load base model with 4-bit quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16
)

base_model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.3",
    quantization_config=bnb_config,
    device_map="auto"
)

tokenizer = AutoTokenizer.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.3"
)

# Load LoRA adapter
model = PeftModel.from_pretrained(
    base_model,
    "RitanshuPatel/triageai-mistral"
)

Run inference

SYSTEM_PROMPT = """You are a clinical notes parser.
Extract structured JSON from medical notes.
Return ONLY valid JSON with these fields:
{
  "patient": {"age": null, "gender": null},
  "chief_complaint": "",
  "conditions": [],
  "medications": [{"name": "", "dose": "", "frequency": ""}],
  "vitals": {"bp": null, "hr": null, "rr": null, "o2_sat": null},
  "allergies": [],
  "plan": []
}
Conditions and allergies must be plain strings. Never hallucinate."""

note = "pt 67M c/o CP x2hr, PMH DM2 HTN, metformin 500 BID lisinopril 10 QD, allerg PCN"

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": note}
]

inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=500,
    temperature=0.1,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id
)

# Decode only the newly generated tokens, not the prompt
result = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(result)
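Even with the "Return ONLY valid JSON" instruction, the decoded string can occasionally contain text around the JSON, so it is worth parsing defensively. A minimal sketch; the brace-matching heuristic is illustrative and does not handle braces inside quoted strings:

```python
import json

def extract_json(text: str):
    """Parse the first top-level JSON object found in model output.

    Returns None if no valid JSON object is present
    (illustrative post-processing helper, not part of the model)."""
    start = text.find("{")
    if start == -1:
        return None
    depth = 0
    for i, ch in enumerate(text[start:], start):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:  # matching close brace for the first open brace
                try:
                    return json.loads(text[start:i + 1])
                except json.JSONDecodeError:
                    return None
    return None

parsed = extract_json('Sure, here is the JSON: {"allergies": ["PCN"], "plan": []}')
print(parsed)  # {'allergies': ['PCN'], 'plan': []}
```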

Performance Comparison

Base Mistral-7B output (inconsistent)

{
  "patient_info": {"age": "67 years old", "sex": "male"},
  "presenting_complaint": "chest pain for 2 hours",
  "past_history": [{"condition": "DM2"}, {"condition": "HTN"}],
  "current_medications": "metformin 500mg BID, lisinopril 10mg QD",
  "drug_allergies": [{"allergy": "PCN", "reaction": "unknown"}]
}

❌ Wrong field names, nested objects where strings are expected, age as a string instead of a number

Fine-tuned output (consistent)

{
  "patient": {"age": 67, "gender": "male"},
  "chief_complaint": "chest pain",
  "conditions": ["Hypertension", "Type 2 Diabetes"],
  "medications": [
    {"name": "metformin", "dose": "500mg", "frequency": "twice daily"},
    {"name": "lisinopril", "dose": "10mg", "frequency": "once daily"}
  ],
  "vitals": {"bp": null, "hr": null, "rr": null, "o2_sat": null},
  "allergies": ["PCN"],
  "plan": []
}

✅ Correct schema, strings where expected, age as a number, abbreviations expanded
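The schema drift in the base-model output can be quantified mechanically, for example by diffing its top-level keys against the expected schema. A quick illustrative check using the two outputs shown above:

```python
# Top-level keys of the target schema
EXPECTED_KEYS = {"patient", "chief_complaint", "conditions",
                 "medications", "vitals", "allergies", "plan"}

# Top-level keys from the base-model output shown above
base_keys = {"patient_info", "presenting_complaint", "past_history",
             "current_medications", "drug_allergies"}

print(sorted(EXPECTED_KEYS - base_keys))  # fields the base model renamed or dropped
print(sorted(base_keys - EXPECTED_KEYS))  # field names the base model invented
```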


Intended Use & Medical Disclaimer

Intended for:

  • Clinical NLP research and experimentation
  • Building medical documentation tools
  • Educational purposes
  • Portfolio and demonstration projects

Not intended for:

  • Direct clinical decision making
  • Replacing qualified medical professionals
  • Real patient care without human review

⚠️ Medical Disclaimer: This model is for informational and research purposes only. It is not a substitute for professional medical advice, diagnosis, or treatment. All model outputs must be reviewed by qualified healthcare professionals before any clinical use.


Part of the TriageAI Project

This model is the fine-tuned component of TriageAI, a full-stack clinical notes summarization agent.

Resource             | Link
-------------------- | ----
🌐 Live Demo         | triageai-ritanshupatel.vercel.app
💻 GitHub            | github.com/RitanshuPatelMMR/Triage-ai
🤗 HuggingFace Space | RitanshuPatel/triageai-backend

Full TriageAI Stack:

  • LangGraph 5-node autonomous agent
  • RAG with FAISS over 82,000+ medical knowledge vectors
  • Real-time SSE streaming
  • Groq Vision for handwritten note OCR
  • OpenFDA drug interaction checking
  • React + FastAPI frontend

Built with ❤️ using a free Google Colab T4 GPU, HuggingFace PEFT, and the MTSamples dataset.
