# Gemma 4 Clinical Trial Endpoint Extractor
A fine-tuned Gemma 4 E4B-it model (LoRA adapter) for extracting structured endpoint information from clinical trial text. The model takes unstructured endpoint descriptions and outputs structured JSON following a predefined schema.
## Model Details
| Property | Value |
|---|---|
| Base Model | google/gemma-4-E4B-it |
| Method | QLoRA (4-bit NF4 quantization + LoRA rank 16) |
| Trainable Parameters | 42.4M (0.85% of 5B total) |
| Training Data | 1,558 clinical trial endpoint samples |
| Data Sources | ClinicalTrials.gov, EU CTR (EudraCT), ChiCTR (Chinese Clinical Trials) |
| Final Eval Loss | 0.3006 |
| Final Token Accuracy | 94.07% |
| Training Time | ~74 minutes on NVIDIA RTX A6000 (48GB) |
| License | Apache 2.0 |
## Training Results
| Epoch | Eval Loss | Token Accuracy |
|---|---|---|
| 1 | 0.3493 | 93.23% |
| 2 | 0.3024 | 94.01% |
| 3 | 0.3006 | 94.07% |
## Output Schema

The model outputs JSON with the following structure:

```json
{
  "endpoints": [
    {
      "endpoint_name_standardized": "string | null",
      "measurement_of": "string | null",
      "measurement_type": "continuous | binary | time-to-event | ordinal | null",
      "metric_type": "string | null",
      "timeframe": "string | null",
      "measurement_method": "string | null",
      "evaluation_criteria": "string | null",
      "unit": "string | null",
      "population": "string | null",
      "is_composite": "boolean",
      "components": "[]"
    }
  ]
}
```
### Field Definitions
- endpoint_name_standardized: Normalized endpoint name (e.g., ORR -> Objective Response Rate)
- measurement_of: Underlying clinical concept being measured
- measurement_type: One of continuous, binary, time-to-event, ordinal
- metric_type: Statistical metric (mean, median, proportion, hazard ratio, etc.)
- timeframe: Assessment window (e.g., "24 weeks", "Up to 5 years")
- measurement_method: Physical tool/technique (CT scan, MRI, blood test) - NOT scoring systems
- evaluation_criteria: Scoring system/guideline (RECIST v1.1, WHO criteria) - NOT measurement tools
- unit: Measurement unit (%, mL, ms, ng/mL, etc.)
- population: Study population if specified (ITT, safety population, etc.)
- is_composite: Whether endpoint combines multiple events
- components: List of component events for composite endpoints
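The schema above can be checked mechanically before downstream use. Below is a minimal validation sketch; the helper names (`validate_endpoint`, `validate_output`) and the rule set are ours, not part of the model:

```python
import json

# Field names and allowed measurement types from the schema above
EXPECTED_FIELDS = {
    "endpoint_name_standardized", "measurement_of", "measurement_type",
    "metric_type", "timeframe", "measurement_method", "evaluation_criteria",
    "unit", "population", "is_composite", "components",
}
ALLOWED_TYPES = {"continuous", "binary", "time-to-event", "ordinal", None}

def validate_endpoint(ep):
    """Return a list of problems found in one endpoint object."""
    problems = []
    missing = EXPECTED_FIELDS - ep.keys()
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    if ep.get("measurement_type") not in ALLOWED_TYPES:
        problems.append(f"unexpected measurement_type: {ep.get('measurement_type')!r}")
    if not isinstance(ep.get("is_composite"), bool):
        problems.append("is_composite must be a boolean")
    if not isinstance(ep.get("components"), list):
        problems.append("components must be a list")
    return problems

def validate_output(raw):
    """Validate a raw {'endpoints': [...]} JSON string from the model."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    endpoints = data.get("endpoints")
    if not isinstance(endpoints, list):
        return ["top-level 'endpoints' must be a list"]
    return [p for ep in endpoints for p in validate_endpoint(ep)]
```

An empty return value means the payload conforms to the schema; otherwise each string describes one violation.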
## Usage

### With PEFT + Transformers
```python
import torch
import torch.nn as nn
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# IMPORTANT: Monkey-patch for Gemma 4 compatibility with PEFT
from transformers.models.gemma4 import modeling_gemma4

class PatchedClippableLinear(nn.Linear):
    def __init__(self, config, in_features, out_features):
        nn.Linear.__init__(self, in_features, out_features, bias=False)
        self.use_clipped_linears = getattr(config, "use_clipped_linears", False)
        if self.use_clipped_linears:
            self.register_buffer("input_min", torch.tensor(-float("inf")))
            self.register_buffer("input_max", torch.tensor(float("inf")))
            self.register_buffer("output_min", torch.tensor(-float("inf")))
            self.register_buffer("output_max", torch.tensor(float("inf")))

    def forward(self, x):
        if self.use_clipped_linears:
            x = torch.clamp(x, self.input_min, self.input_max)
        out = nn.Linear.forward(self, x)
        if self.use_clipped_linears:
            out = torch.clamp(out, self.output_min, self.output_max)
        return out

modeling_gemma4.Gemma4ClippableLinear = PatchedClippableLinear

# Load the base model in 4-bit NF4
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-4-E4B-it",
    quantization_config=quantization_config,
    device_map="auto",
    dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-4-E4B-it")

# Load the LoRA adapter on top of the quantized base model
model = PeftModel.from_pretrained(base_model, "Shubh-0789/gemma4-clinical-endpoint-extractor")
model.eval()

# Build the chat prompt used during fine-tuning
system_prompt = "You are a clinical trial endpoint extraction system. Extract structured endpoint information from clinical trial text. Return ONLY valid JSON."
endpoint_text = "Progression-Free Survival (PFS) assessed by RECIST v1.1 using CT scan at 24 weeks"
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": f"Extract all endpoint fields from this clinical trial text. Return ONLY a JSON.\n\nText:\n{endpoint_text}\n\nOutput (JSON only):"},
]
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

# Greedy decoding keeps the JSON output deterministic
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=1024, do_sample=False)

# Decode only the newly generated tokens
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
```
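Although the prompt asks for JSON only, decoded generations can still arrive with stray prose around the object, so it helps to parse the response defensively. A minimal sketch; `extract_json` is our own helper, not part of the adapter:

```python
import json

def extract_json(response: str) -> dict:
    """Slice the decoded response to its outermost JSON object and parse it."""
    start, end = response.find("{"), response.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    return json.loads(response[start:end + 1])

# Hypothetical decoded response; in practice, pass `response` from above
demo = 'Here is the extraction:\n{"endpoints": [{"endpoint_name_standardized": "Progression-Free Survival"}]}'
for ep in extract_json(demo)["endpoints"]:
    print(ep["endpoint_name_standardized"])
```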
## Example Outputs

### Single Endpoint
Input: Number of participants achieving CR and PR by 4 months as per IWG 2023 criteria | [Time Frame: Through 4 months after starting treatment]
Output:
```json
{
  "endpoints": [
    {
      "endpoint_name_standardized": "Number of participants achieving CR and PR",
      "measurement_of": "Objective Response Rate",
      "measurement_type": "binary",
      "metric_type": "count",
      "timeframe": "Through 4 months after starting treatment",
      "measurement_method": null,
      "evaluation_criteria": "IWG 2023 criteria",
      "unit": "number",
      "population": null,
      "is_composite": true,
      "components": ["CR", "PR"]
    }
  ]
}
```
### Composite Biomarker Endpoint
Input: Biomarkers (Phase II) | Integrated biomarker endpoints include: PTEN immunohistochemistry, estrogen receptor and progesterone receptor, whole exome sequencing, ribonucleic acid sequencing. | [Time Frame: Up to 5 years]
Output:
```json
{
  "endpoints": [
    {
      "endpoint_name_standardized": "Integrated Biomarker Association with PFS",
      "measurement_of": "prognostic association of integrated biomarkers with PFS",
      "measurement_type": "continuous",
      "metric_type": "hazard ratio",
      "timeframe": "Up to 5 years",
      "measurement_method": "Integrated biomarker analysis including PTEN immunohistochemistry, estrogen receptor and progesterone receptor, whole exome sequencing, ribonucleic acid sequencing",
      "evaluation_criteria": "Proportional hazards models",
      "unit": null,
      "population": null,
      "is_composite": true,
      "components": ["PTEN immunohistochemistry", "estrogen receptor", "progesterone receptor", "whole exome sequencing", "ribonucleic acid sequencing"]
    }
  ]
}
```
### Minimal Input
Input: Body height
Output:
```json
{
  "endpoints": [
    {
      "endpoint_name_standardized": "Body height",
      "measurement_of": "height",
      "measurement_type": "continuous",
      "metric_type": null,
      "timeframe": null,
      "measurement_method": "Stadiometer",
      "evaluation_criteria": null,
      "unit": "cm",
      "population": null,
      "is_composite": false,
      "components": []
    }
  ]
}
```
## Training Configuration
| Parameter | Value |
|---|---|
| LoRA Rank (r) | 16 |
| LoRA Alpha | 16 |
| LoRA Dropout | 0.05 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Learning Rate | 5e-5 |
| LR Scheduler | Cosine |
| Epochs | 3 |
| Batch Size | 1 (x4 gradient accumulation) |
| Max Sequence Length | 2048 |
| Optimizer | AdamW 8-bit |
| Quantization | 4-bit NF4 with double quantization |
| Gradient Checkpointing | Enabled |
| Warmup Steps | 50 |
| Total Steps | 1,170 |
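For reference, the table above corresponds roughly to the following `peft`/`bitsandbytes` configuration. This is a sketch, not the exact training script; trainer wiring, optimizer, scheduler, and data collation are omitted:

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization with double quantization (QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA rank 16 applied to all attention and MLP projections
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)
```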
## Training Data

The model was trained on 1,558 clinical trial endpoint samples, with a further 195 held out for validation, sourced from three international registries:
| Source | Endpoints |
|---|---|
| ClinicalTrials.gov (USA) | ~1,100 |
| ChiCTR (China) | ~500 |
| EU CTR (Europe) | ~350 |
Ground-truth labels were generated with Qwen 3.6-plus. The dataset covers:
- Single and multiple endpoint extraction
- Composite endpoint detection (19% of samples)
- Various measurement types (continuous, binary, time-to-event, ordinal)
- Multiple therapeutic areas (oncology, cardiology, neurology, etc.)
## Hardware Requirements
| Setup | VRAM |
|---|---|
| QLoRA inference (4-bit) | ~10 GB |
| QLoRA training | ~28 GB |
## Limitations
- Ground truth labels were generated by an LLM (Qwen 3.6-plus), not manually annotated
- May occasionally hallucinate a second endpoint for single-endpoint inputs
- measurement_method may be inferred even when not explicitly stated in the text
- Primarily trained on English-language endpoint descriptions
## Citation
If you use this model, please cite:
```bibtex
@misc{gemma4-clinical-endpoint-extractor,
  title={Gemma 4 Clinical Trial Endpoint Extractor},
  author={Shubhanshu Yadav},
  year={2026},
  publisher={Hugging Face},
  url={https://huggingface.co/Shubh-0789/gemma4-clinical-endpoint-extractor}
}
```