Gemma 3 4B — MedMCQA LoRA Adapter

A LoRA fine-tuned adapter for google/gemma-3-4b-it on the MedMCQA dataset — 182K medical multiple-choice questions covering 21 subjects from Indian medical entrance exams (AIIMS/PG style).

Fine-tuned as part of an SUTD Master's deep learning course project comparing zero-shot, LoRA, and full SFT approaches on medical MCQ benchmarks.

Model Details

  • Developed by: James Oon (@jamezoon), SUTD MSTR-DAIE Deep Learning Project
  • Model type: Causal LM with LoRA adapter (PEFT)
  • Base model: google/gemma-3-4b-it (dense, 4B parameters, standard transformer)
  • Language: English
  • License: Follows base model license (Gemma)
  • Adapter size: ~49 MB (adapter_model.safetensors)

Intended Use

Medical multiple-choice question answering. Given a clinical question and four options (A–D), the model selects an answer and gives a step-by-step explanation. Subjects covered include Physiology, Anatomy, Biochemistry, Pathology, Pharmacology, Surgery, Medicine, Dental, Gynaecology, Paediatrics, and more.

Not intended for real clinical decision-making. This is a research/educational model.

How to Get Started

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

base_model_id = "google/gemma-3-4b-it"
adapter_id = "jamezoon/gemma-3-4b-it-medmcqa-lora"

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()

messages = [
    {"role": "system", "content": (
        "You are a helpful tutor for pre-med students preparing for medical entrance exams. "
        "Answer the following multiple choice question by thinking step by step, then give the answer."
    )},
    {"role": "user", "content": (
        "Question: Which of the following is the most common cause of mitral stenosis?\n"
        "Options: A. Rheumatic fever  B. Congenital  C. Infective endocarditis  D. SLE\n"
        "Think step by step. Then respond in the format:\n"
        "Explanation: ...\nAnswer: <one of A, B, C, D>"
    )},
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
with torch.inference_mode():
    output = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))

Prompt Format

The model expects chat-template format with the following structure:

System: You are a helpful tutor for pre-med students preparing for medical entrance exams.
        You answer multiple-choice questions with step-by-step reasoning.

User:   Question: {question}
        Options:
        A. {option_a}
        B. {option_b}
        C. {option_c}
        D. {option_d}

        Think step by step. Then respond in the format:
        Explanation: ...
        Answer: <one of A, B, C, D>
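A minimal sketch of assembling this structure programmatically. The helper name `build_messages` is hypothetical and not part of the released code; the system prompt text is copied from the template above, and the exact whitespace used during training is an assumption.

```python
# System prompt copied from the template in this section.
SYSTEM_PROMPT = (
    "You are a helpful tutor for pre-med students preparing for medical entrance exams. "
    "You answer multiple-choice questions with step-by-step reasoning."
)


def build_messages(question, options):
    """Return chat messages in the layout described above (hypothetical helper)."""
    if len(options) != 4:
        raise ValueError("expected exactly 4 options")
    # One option per line, labelled A to D.
    option_lines = "\n".join(
        f"{letter}. {text}" for letter, text in zip("ABCD", options)
    )
    user = (
        f"Question: {question}\n"
        f"Options:\n{option_lines}\n\n"
        "Think step by step. Then respond in the format:\n"
        "Explanation: ...\n"
        "Answer: <one of A, B, C, D>"
    )
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user},
    ]
```

The returned list can be passed to `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` as in the getting-started snippet.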

Training Details

Dataset

  • MedMCQA — 182,822 training samples, 4,183 validation samples
  • 21 medical subjects (Dental, Surgery, Medicine, Pathology, Pharmacology, etc.)
  • Each sample: question + 4 options + correct answer (1-indexed) + explanation
  • Formatted as chat messages with system/user/assistant roles
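As an illustration of the last two bullets, the sketch below turns one record into system/user/assistant messages. It assumes the field names used by the public MedMCQA release (`question`, `opa` to `opd`, `cop`, `exp`) and the 1-indexed answer stated above; some hosted copies of the dataset store `cop` 0-indexed, so check before reusing it. The function name and the exact assistant target format are assumptions, not the project's actual preprocessing code.

```python
def record_to_messages(record, system_prompt):
    """Convert one MedMCQA-style record into chat messages (illustrative only)."""
    letters = "ABCD"
    options = [record["opa"], record["opb"], record["opc"], record["opd"]]

    cop = int(record["cop"])  # assumed 1-indexed, as stated in this card
    if not 1 <= cop <= 4:
        raise ValueError(f"unexpected answer index: {cop}")
    answer = letters[cop - 1]

    option_lines = "\n".join(f"{l}. {o}" for l, o in zip(letters, options))
    user = (
        f"Question: {record['question']}\n"
        f"Options:\n{option_lines}\n\n"
        "Think step by step. Then respond in the format:\n"
        "Explanation: ...\n"
        "Answer: <one of A, B, C, D>"
    )
    # Some records have no explanation; fall back to an empty string.
    explanation = (record.get("exp") or "").strip()
    assistant = f"Explanation: {explanation}\nAnswer: {answer}"

    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user},
        {"role": "assistant", "content": assistant},
    ]
```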

Training Procedure

| Hyperparameter | Value |
|---|---|
| Training steps | 1,200 (max_steps, about 10.5% of one full epoch) |
| Epochs | 1 (partial) |
| Per-device batch size | 2 |
| Gradient accumulation | 8 (effective batch = 16) |
| Learning rate | 2e-4 |
| LR scheduler | Cosine |
| Warmup steps | 200 |
| Max sequence length | 512 tokens |
| Precision | BF16 |
| Optimizer | AdamW |
| Gradient checkpointing | Enabled |
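The effective batch size and the epoch fraction quoted above follow from simple arithmetic. The sketch below reproduces it, assuming a single device and one training sample per sequence (no packing); neither assumption is stated explicitly in this card.

```python
import math

train_samples = 182_822   # MedMCQA training split size
per_device_batch = 2
grad_accum = 8
max_steps = 1_200
num_devices = 1           # assumption: single-GPU run

# Samples consumed per optimizer step.
effective_batch = per_device_batch * grad_accum * num_devices
# Samples seen over the whole run, and the share of one epoch that represents.
samples_seen = effective_batch * max_steps
epoch_fraction = samples_seen / train_samples
# Optimizer steps a full epoch would take at this batch size.
steps_per_epoch = math.ceil(train_samples / effective_batch)

print(f"effective batch: {effective_batch}")
print(f"samples seen:    {samples_seen:,}")
print(f"epoch fraction:  {epoch_fraction:.1%}")
print(f"steps per epoch: {steps_per_epoch:,}")
```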

LoRA Configuration

| Parameter | Value |
|---|---|
| Rank (r) | 16 |
| Alpha (α) | 32 |
| Dropout | 0.05 |
| Target modules | q_proj, k_proj, v_proj, o_proj, out_proj |
| Trainable parameters | 12,894,208 (0.299% of 4.3B) |
| Bias | none |
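The trainable-parameter figure can be sanity-checked by hand, since a LoRA adapter of rank r on a linear layer of shape (d_in, d_out) adds r × (d_in + d_out) weights. The layer shapes below are assumptions taken from the base model's published configuration as best recalled, not from this card; verify them against the model's `config.json` before relying on them. They also assume that `out_proj` matches the attention output projection in the vision tower, while `o_proj` matches the one in the text decoder.

```python
RANK = 16


def lora_params(d_in, d_out, r=RANK):
    # LoRA adds two matrices per adapted layer: A (r x d_in) and B (d_out x r).
    return r * (d_in + d_out)


# Assumed text-decoder shapes (not stated in this card).
TEXT_LAYERS, HIDDEN, Q_HEADS, KV_HEADS, HEAD_DIM = 34, 2560, 8, 4, 256
q_dim = Q_HEADS * HEAD_DIM
kv_dim = KV_HEADS * HEAD_DIM
text_per_layer = (
    lora_params(HIDDEN, q_dim)           # q_proj
    + 2 * lora_params(HIDDEN, kv_dim)    # k_proj and v_proj
    + lora_params(q_dim, HIDDEN)         # o_proj
)

# Assumed vision-tower shapes (not stated in this card).
VISION_LAYERS, VISION_HIDDEN = 27, 1152
# q_proj, k_proj, v_proj and out_proj are all square in the vision tower.
vision_per_layer = 4 * lora_params(VISION_HIDDEN, VISION_HIDDEN)

total = TEXT_LAYERS * text_per_layer + VISION_LAYERS * vision_per_layer
print(f"{total:,}")
```

Under these assumed shapes the total comes to 12,894,208, which agrees with the count reported in the table.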

Hardware & Training Time

  • Hardware: NVIDIA GB10 Grace Blackwell (NVIDIA DGX Spark), 121 GB unified CPU+GPU memory
  • Training duration: ~30–40 minutes (1,200 steps)
  • Framework: PyTorch 2.x, HuggingFace Transformers, PEFT 0.18.1, TRL (SFTTrainer)

Architecture Note

google/gemma-3-4b-it is a standard dense transformer that is fully compatible with BF16 training and standard CUDA kernels, so no architecture-specific patches were required.

Evaluation

Training Loss Progression

| Step | Train Loss | Token Accuracy |
|---|---|---|
| 10 | 5.164 | 48.3% |
| 50 | 2.131 | 61.7% |
| 100 | ~1.39 | ~73% |
| 200 | ~1.27 | ~74% |
| 1200 | 1.27 | 74.4% |

Validation Results (Dev Set, 4,183 samples)

| Checkpoint | Eval Loss | Token Accuracy |
|---|---|---|
| Step 200 | 1.143 | 77.3% |
| Step 400 | 1.125 | 77.5% |
| Step 600 | 1.111 | 77.7% |
| Step 800 | 1.106 | 77.7% |
| Step 1000 | 1.105 | 77.7% |
| Step 1200 (best) | 1.094 | 77.8% |

Eval loss was still decreasing at step 1200 (1.105 at step 1000, 1.094 at step 1200), so the model had not yet fully converged.
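The checkpoint-to-checkpoint trend can be checked directly from the validation numbers. The short sketch below computes the change in eval loss between consecutive checkpoints and, as an extra illustration, the corresponding token-level perplexity exp(loss); the perplexity values are derived here and are not reported by the original run.

```python
import math

# Eval loss per checkpoint, copied from the validation results.
eval_loss = {200: 1.143, 400: 1.125, 600: 1.111, 800: 1.106, 1000: 1.105, 1200: 1.094}

steps = sorted(eval_loss)
# Change in loss relative to the previous checkpoint (negative means improvement).
deltas = [(cur, eval_loss[cur] - eval_loss[prev]) for prev, cur in zip(steps, steps[1:])]

for step, delta in deltas:
    ppl = math.exp(eval_loss[step])
    print(f"step {step:>4}: delta {delta:+.3f}, perplexity {ppl:.3f}")

best_step = min(eval_loss, key=eval_loss.get)
print(f"best checkpoint: step {best_step}")
```

Every delta is negative, which is what "still decreasing" means here, although the improvement between steps 800 and 1000 is only 0.001.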

MCQ Accuracy Comparison (Dev Split, 4,183 samples)

| Model | Accuracy | Notes |
|---|---|---|
| Gemma-3-4B-IT zero-shot | TBD | Baseline evaluation |
| Gemma-3-4B-IT + LoRA (this adapter) | TBD | Evaluation in progress |
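Since the MCQ accuracy numbers are still pending, here is one way such a score could be computed from generated text, assuming the model follows the "Answer: <letter>" output format. The function names are hypothetical and this is not the project's evaluation script; outputs with no parseable answer are counted as incorrect, which is a design choice rather than something stated in this card.

```python
import re

# Matches "Answer: B", "Answer: (B)" or "answer: b." and captures the letter.
ANSWER_RE = re.compile(r"Answer:\s*\(?([A-D])\b", re.IGNORECASE)


def extract_answer(text):
    """Return the last answer letter found in the text, or None if absent."""
    matches = ANSWER_RE.findall(text)
    return matches[-1].upper() if matches else None


def mcq_accuracy(generations, gold_letters):
    """Fraction of generations whose extracted letter equals the gold letter."""
    if len(generations) != len(gold_letters):
        raise ValueError("generations and gold_letters must have the same length")
    if not gold_letters:
        raise ValueError("no examples to score")
    correct = sum(
        extract_answer(gen) == gold for gen, gold in zip(generations, gold_letters)
    )
    return correct / len(gold_letters)
```

Taking the last match rather than the first guards against the model restating the format instructions before giving its final answer.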

Comparison with Other MedMCQA Adapters

| Property | Qwen3.5-9B adapter | Qwen3-14B adapter | This adapter (Gemma-3-4B) |
|---|---|---|---|
| Base model params | 9B | 14B | 4B |
| Architecture | Hybrid (GatedDeltaNet) | Standard transformer | Standard transformer |
| Trainable params | 7.1M (0.079%) | 21.0M (0.142%) | 12.9M (0.299%) |
| Best eval loss | 0.9669 | 0.9649 | 1.094 |
| Best token acc | 78.7% | 79.20% | 77.8% |
| Adapter size | 28 MB | 81 MB | 49 MB |

Citation

If you use this adapter, please cite the MedMCQA dataset:

@inproceedings{pmlr-v174-pal22a,
  title     = {MedMCQA: A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering},
  author    = {Pal, Ankit and Umapathi, Logesh Kumar and Sankarasubbu, Malaikannan},
  booktitle = {Proceedings of the Conference on Health, Inference, and Learning},
  year      = {2022},
  publisher = {PMLR}
}

Framework Versions

  • PEFT 0.18.1
  • Transformers (latest as of March 2026)
  • TRL (SFTTrainer)
  • PyTorch 2.x + CUDA