# Clinical Trial Endpoint Classifier – 0.8B (Qwen3.5-0.8B LoRA)

A LoRA adapter fine-tuned on Qwen3.5-0.8B for extracting and classifying clinical trial endpoints from outcome text. It returns structured JSON with standardized endpoint names, measurement types, methods, and more.

See also: the 4B model achieves lower final loss (0.485 vs. 0.617) and better accuracy on complex endpoints.
## Output Format
```json
{
  "endpoints": [
    {
      "endpoint_name_standardized": "Objective Response Rate",
      "measurement_of": "tumor response",
      "measurement_type": "binary",
      "metric_type": "proportion",
      "timeframe": "Week 24",
      "measurement_method": "RECIST v1.1",
      "evaluation_criteria": "CR or PR",
      "unit": "%",
      "population": null,
      "is_composite": false,
      "components": []
    }
  ]
}
```
## Field Definitions
| Field | Description | Examples |
|---|---|---|
| `endpoint_name_standardized` | Standardized endpoint name | "Overall Survival", "HbA1c", "PASI 75 Response Rate" |
| `measurement_of` | What is being measured | "tumor response", "glycated hemoglobin", "psoriasis severity" |
| `measurement_type` | Type of measurement | `continuous`, `binary`, `ordinal`, `time-to-event` |
| `metric_type` | Statistical metric | `mean`, `proportion`, `hazard ratio`, `change from baseline`, `count` |
| `timeframe` | When the measurement occurs | "Week 12", "baseline to 6 months", "Up to 36 months" |
| `measurement_method` | How it is measured | "blood test", "RECIST v1.1", "12-lead ECG" |
| `evaluation_criteria` | Criteria for evaluation | "PASI 75", "CR or PR", "clinically meaningful" |
| `unit` | Unit of measurement | "%", "mg/dL", "mm", "count" |
| `population` | Specific population, if stated | "adults aged 18-65", "in Italy only" |
| `is_composite` | Whether the endpoint is composite | `true` / `false` |
| `components` | Component measures, if composite | ["blood pressure", "heart rate", "temperature"] |
The model supports extracting multiple endpoints from a single input text.
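Since downstream pipelines typically depend on every field above being present, it can help to validate parsed output against this schema. The following is a minimal sketch of such a check; `validate_endpoints` and `REQUIRED_FIELDS` are hypothetical helpers, not part of this repository:

```python
import json

# Hypothetical helper (not shipped with the model): verify that a parsed
# response contains every field documented in the table above.
REQUIRED_FIELDS = {
    "endpoint_name_standardized", "measurement_of", "measurement_type",
    "metric_type", "timeframe", "measurement_method", "evaluation_criteria",
    "unit", "population", "is_composite", "components",
}

def validate_endpoints(raw: str) -> list[dict]:
    """Parse model output and check each endpoint against the documented schema."""
    data = json.loads(raw)
    endpoints = data["endpoints"]
    for ep in endpoints:
        missing = REQUIRED_FIELDS - ep.keys()
        if missing:
            raise ValueError(f"endpoint missing fields: {sorted(missing)}")
        # A composite endpoint should list its component measures.
        if ep["is_composite"] and not ep["components"]:
            raise ValueError("composite endpoint has empty components list")
    return endpoints
```

A check like this makes malformed generations fail loudly instead of silently propagating incomplete records.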
## Training Details
| Setting | Value |
|---|---|
| Base model | Qwen/Qwen3.5-0.8B |
| Method | LoRA (bf16, rank 16, alpha 16) |
| Training data | 1,948 samples (3,607 endpoints) |
| Epochs | 3 |
| Final loss | 0.617 |
| Training time | 34 min on an RTX 4090 |
| Framework | Unsloth + TRL `SFTTrainer` |
## Usage
```python
import json

import torch
from unsloth import FastLanguageModel

# Load the base model with the LoRA adapter applied, in bf16.
# from_pretrained also returns the matching tokenizer.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Shubh-0789/endpoint-qwen3.5-0.8b-lora",
    max_seq_length=2048,
    load_in_4bit=False,
    load_in_16bit=True,
    dtype=torch.bfloat16,
)
FastLanguageModel.for_inference(model)  # enable Unsloth's fast inference path
model.generation_config.pad_token_id = tokenizer.pad_token_id

clinical_text = "Overall Survival | Time from randomization to death from any cause | [Time Frame: Up to 5 years]"
messages = [
    {"role": "user", "content": f"Extract and classify the clinical trial endpoint from the following text. Return ONLY a JSON.\nText: {clinical_text}"}
]
inputs = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True,
    return_tensors="pt", return_dict=True,
).to(model.device)

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.1, do_sample=True)

# Decode only the newly generated tokens, skipping the prompt.
result = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
endpoints = json.loads(result)
print(json.dumps(endpoints, indent=2))
```
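Small models occasionally wrap their JSON in markdown fences or stray text, in which case a bare `json.loads` fails. Below is a minimal, hypothetical post-processing sketch (`parse_model_json` is not part of the model's API) that strips fences and parses the first balanced `{...}` object; it assumes endpoint strings do not themselves contain braces:

```python
import json
import re

# Hypothetical helper (an assumption, not part of this repo): extract the
# first balanced JSON object from possibly-noisy model output.
def parse_model_json(text: str) -> dict:
    """Strip markdown fences and parse the first {...} object in `text`."""
    text = re.sub(r"```(?:json)?", "", text).strip()
    start = text.find("{")
    if start == -1:
        raise ValueError("no JSON object found in model output")
    depth = 0
    for i, ch in enumerate(text[start:], start):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:  # matching close brace for the first open brace
                return json.loads(text[start : i + 1])
    raise ValueError("unbalanced JSON object in model output")
```

Brace counting is a deliberately simple heuristic; for fully robust parsing you would need a streaming JSON decoder such as `json.JSONDecoder.raw_decode`.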
### With PEFT/Transformers
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model, then attach the LoRA adapter on top.
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3.5-0.8B", torch_dtype="bfloat16", device_map="auto"
)
model = PeftModel.from_pretrained(base_model, "Shubh-0789/endpoint-qwen3.5-0.8b-lora")
tokenizer = AutoTokenizer.from_pretrained("Shubh-0789/endpoint-qwen3.5-0.8b-lora")
```
## Examples
**Single endpoint:**

Input: `"HbA1c change from baseline at Week 24"`

Output:

```json
{"endpoints": [{"endpoint_name_standardized": "HbA1c", "measurement_of": "glycated hemoglobin", "measurement_type": "continuous", "metric_type": "change from baseline", "timeframe": "Week 24", ...}]}
```

**Multiple endpoints:**

Input: `"Safety endpoints: AEs, laboratory safety (hematology, chemistry), vital signs, ECG"`

Output:

```json
{"endpoints": [{"endpoint_name_standardized": "Adverse Events", ...}, {"endpoint_name_standardized": "Laboratory Safety", "is_composite": true, "components": ["hematology", "chemistry"], ...}, ...]}
```
## Model Comparison
| Model | Parameters | Loss | Speed | Link |
|---|---|---|---|---|
| 0.8B | 856M | 0.617 | Fast | This model |
| 4B | 4.6B | 0.485 | Moderate | 4B |
## Limitations
- Trained on English clinical trial text only
- Complex composite endpoints may need manual verification
- Inference requires a GPU with at least 3 GB of VRAM