# Medical CoT Hematology – Llama-3.1-8B-DoRA
A DoRA (Weight-Decomposed Low-Rank Adaptation) adapter that teaches Llama-3.1-8B-Instruct to perform step-by-step clinical reasoning in hematology and blood transfusion medicine, distilled from a reasoning-focused teacher model.
## Highlights

- Chain-of-Thought distillation from a DeepSeek-R1-Distill-Llama-8B teacher
- Structured reasoning: the model generates `<think>` tags with step-by-step clinical logic
- Modern medical knowledge: recommends MMA/homocysteine testing over the obsolete Schilling test
- Clinically accurate thresholds: uses realistic lab ranges (e.g., ferritin cutoffs for IDA vs. ACD)
- Trained in ~37 minutes on a single A100 80 GB GPU
- Only 4.22% of parameters are trainable (353M / 8.4B)
## Architecture

```text
Teacher: DeepSeek-R1-Distill-Llama-8B (4-bit)
    │  generates CoT reasoning for 1,101 medical questions
    ▼
Training data: question → <think>reasoning</think> → answer
    │
    ▼
Student: Llama-3.1-8B-Instruct + DoRA adapter (4-bit QLoRA)
    │  learns to reproduce the teacher's reasoning patterns
    ▼
Output: clinically reasoned hematology answers
```
## Training Details
| Parameter | Value |
|---|---|
| Base model | meta-llama/Llama-3.1-8B-Instruct |
| Teacher model | deepseek-ai/DeepSeek-R1-Distill-Llama-8B |
| PEFT method | DoRA (Weight-Decomposed LoRA) |
| LoRA rank (r) | 128 |
| LoRA alpha | 256 |
| Target modules | q, k, v, o, gate, up, down proj + lm_head |
| Quantization | QLoRA 4-bit NF4, double quantization |
| Epochs | 3 |
| Batch size | 8 × 4 gradient accumulation = 32 effective |
| Learning rate | 2e-4 (cosine schedule, 5% warmup) |
| Optimizer | paged AdamW 32-bit |
| Max sequence length | 4,096 tokens |
| Trainable params | 353M / 8.4B total (4.22%) |
| Training time | ~37 minutes, 99 steps |
| Hardware | Google Colab A100 80 GB (~18.5 GB VRAM used) |
| Precision | bf16 + tf32 |
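The adapter hyperparameters in the table can be approximated with a PEFT configuration like the following. This is a minimal sketch reconstructed from the table, not the exact training script, which is not published in this card:

```python
from peft import LoraConfig

# Hypothetical reconstruction of the adapter config described above.
lora_config = LoraConfig(
    r=128,               # LoRA rank
    lora_alpha=256,      # alpha = 2 * r
    use_dora=True,       # enable Weight-Decomposed LoRA (DoRA)
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj", "lm_head",
    ],
    task_type="CAUSAL_LM",
)
```

Note that `use_dora=True` requires a recent version of `peft` (0.9.0 or later).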
## Training Dataset
1,101 teacher-generated CoT reasoning samples from three sources:
| Source | Samples | Weight | Description |
|---|---|---|---|
| MedQA (USMLE) | ~1,000 | 0.50 | USMLE-style clinical vignettes |
| Hematology Corpus | ~51 | 0.25 | QA from hematology textbooks/PDFs |
| PubMedQA | ~50 | 0.10 | Research-based biomedical questions |
Train/Val split: 1,046 / 55 samples
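Each distillation sample follows the question → `<think>`reasoning`</think>` → answer layout described above. A minimal, hypothetical formatting helper (the actual preprocessing code is not published in this card) might look like:

```python
# Hypothetical formatting of one distillation sample into the
# question -> <think>reasoning</think> -> answer target layout.
def format_sample(question: str, reasoning: str, answer: str) -> str:
    """Build the target completion the student learns to reproduce."""
    return f"{question}\n\n<think>\n{reasoning}\n</think>\n\n{answer}"

sample = format_sample(
    "A 65-year-old woman has Hb 7.2 and MCV 110. What is the next test?",
    "Macrocytosis with hypersegmented neutrophils suggests megaloblastic "
    "anemia; confirm B12/folate deficiency.",
    "Check serum B12 and folate; order MMA and homocysteine if equivocal.",
)
```

The student is then trained with standard causal-LM loss on these completions, so it learns to emit the reasoning block before the final answer.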
## Qualitative Evaluation: Base vs. Fine-Tuned
Three hematology questions were tested on both the base Llama-3.1-8B and this fine-tuned model.
### Q1: Megaloblastic Anemia Workup

> 65-year-old woman with fatigue, pallor, Hb 7.2 g/dL, MCV 110 fL, hypersegmented neutrophils

| | Base Model | Fine-Tuned (CoT) |
|---|---|---|
| Key recommendation | Schilling test (obsolete) | MMA + homocysteine (modern gold standard) |
| Reasoning structure | Flat list | Organized: symptoms → causes → workup |
| Clinical nuance | Generic causes | Age-contextualized causes |
### Q2: Acute Hemolytic Transfusion Reaction

> Fever, flank pain, dark urine 30 minutes after starting a transfusion

| | Base Model | Fine-Tuned (CoT) |
|---|---|---|
| Diagnosis | HTR (redundant phrasing) | AHTR (correct terminology) |
| Management | Suggests diuretics (outdated) | Prioritizes IV hydration + DAT (modern) |
| Added value | Protocol list | Brief pathophysiology + specialist consult |
### Q3: IDA vs. Anemia of Chronic Disease

> Compare ferritin, TIBC, and serum iron between the two conditions

| | Base Model | Fine-Tuned (CoT) |
|---|---|---|
| Ferritin thresholds | ACD >300 (too rigid) | ACD >100 (clinically accurate) |
| Serum iron in ACD | Always low (incorrect) | Variable: can be low, normal, or elevated (correct) |
| Mechanism | Mentions hepcidin | Explains hepcidin interplay more thoroughly |
### Overall Comparison

| Feature | Base Model | Fine-Tuned (CoT) |
|---|---|---|
| Reasoning structure | ★★★ Adequate | ★★★★★ Systematic step-by-step |
| Clinical accuracy | ★★★ Some outdated info | ★★★★ Modern guidelines |
| Diagnostic thresholds | ★★★ Generic textbook | ★★★★ Clinically realistic |
| Safety / relevance | ★★★ Good but textbook-heavy | ★★★★ Focused on clinical priorities |
| Teaching value | ★★★ Answers the question | ★★★★★ Explains why before what |
## Usage

### Load with PEFT (adapter only, recommended)

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# 4-bit NF4 quantization, matching the training setup
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    quantization_config=bnb,
    device_map="auto",
)

# Attach the DoRA adapter
model = PeftModel.from_pretrained(base, "taksa1990/Medical-CoT-Hematology-Llama3.1-8B-DoRA")
# Optional: merge the adapter into the base weights. Merging into a
# 4-bit base is lossy; you can skip this line and generate directly.
model = model.merge_and_unload()

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")

messages = [
    {"role": "system", "content": "You are a specialist hematology assistant. Provide step-by-step clinical reasoning inside <think> tags, then give the final answer."},
    {"role": "user", "content": "What are the lab findings in iron deficiency anemia vs thalassemia trait?"},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=1024, temperature=0.6, top_p=0.9, do_sample=True)

# Decode only the newly generated tokens
print(tokenizer.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```
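Since the model emits its reasoning inside `<think>` tags, you may want to separate the trace from the final answer before display. A minimal helper for this (hypothetical, not part of the released code) could be:

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Split a generation into (reasoning, answer) around <think> tags.

    Returns an empty reasoning string if no <think> block is present.
    """
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer

reasoning, answer = split_reasoning(
    "<think>Low ferritin and high TIBC point to IDA.</think> IDA is most likely."
)
```

This lets an application log the reasoning for review while showing only the final answer to the user.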
## Intended Use

- Medical education: a study tool for hematology reasoning
- Research: exploring CoT distillation for clinical NLP
- Prototyping: proof-of-concept clinical decision support

> ⚠️ **Not for clinical use.** This model is a research prototype and has not been validated for real-world medical decision-making. Always consult qualified healthcare professionals for medical advice.
## Citation

```bibtex
@misc{medical-cot-hematology-2026,
  title={Medical Chain-of-Thought Distillation for Hematology},
  author={Taher Akbari Saeed},
  year={2026},
  url={https://huggingface.co/taksa1990/Medical-CoT-Hematology-Llama3.1-8B-DoRA},
  note={DoRA adapter distilled from DeepSeek-R1-Distill-Llama-8B into Llama-3.1-8B-Instruct}
}
```
## Acknowledgments

- Meta AI: Llama-3.1-8B-Instruct base model
- DeepSeek: R1-Distill-Llama-8B teacher model
- Hugging Face: transformers, peft, trl, and datasets libraries
- Google Colab: A100 GPU compute
## Author & Contact

**Taher Akbari Saeed**
Postgraduate Student in Hematology and Blood Transfusion
Department of Oncology, Hematology, and Radiotherapy
Institute of Postgraduate Education, Pirogov Russian National Research Medical University (RNRMU), Russia

| Channel | Handle |
|---|---|
| Email | taherakbarisaeed@gmail.com |
| GitHub | tayden1990 |
| Telegram | @tayden2023 |
| ORCID | 0000-0002-9517-9773 |
| Hugging Face | taksa1990 |