GPT-Neo 125M Medical Instruction-Tuned Model

Model Overview

This model is an instruction-conditioned medical text generator built on top of EleutherAI/gpt-neo-125M.

It was fine-tuned using LoRA (Low-Rank Adaptation) with prompt-formatted biomedical abstracts from the PubMed Summarization dataset.

Unlike a standard fine-tune, this version was trained on structured prompts (each abstract prefixed with "Medical report:") to improve domain-specific generation quality.


Key Improvements

Compared to the base fine-tuned version:

  • Larger context window (512 tokens)
  • Instruction-style prompt formatting (see the sketch after this list)
  • Enhanced LoRA configuration (r=16)
  • Reduced hallucination via controlled decoding
  • Improved generation coherence for medical narratives
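
For reference, a minimal sketch of the prompt construction. The "Medical report:" prefix is documented in the Dataset section below; pairing it with a free-form instruction is an assumption for illustration, and build_prompt is a hypothetical helper:

# Hypothetical helper: builds an instruction-style prompt using the
# documented "Medical report:" prefix.
def build_prompt(instruction: str) -> str:
    return f"Medical report:\n{instruction}"

print(build_prompt("Explain hypertension"))
# Medical report:
# Explain hypertension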

Intended Use

This model is designed for:

  • Medical text generation
  • Biomedical explanation drafting
  • Research prototyping
  • Educational demonstrations
  • NLP experimentation in healthcare

⚠️ This model is NOT intended for clinical use.


Training Details

Item                 Value
Base Model           EleutherAI/gpt-neo-125M
Dataset              PubMed Summarization
Training Method      LoRA
Prompt Conditioning  Yes
Context Length       512 tokens
LoRA Rank (r)        16
Task                 Instruction-based medical text generation

Dataset

Training utilized biomedical abstracts from:

https://huggingface.co/datasets/ccdv/pubmed-summarization

Each abstract was prefixed with the following prompt before training:

Medical report:

This conditions the model for open-ended generation rather than summarization.
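
A minimal sketch of this preprocessing step, assuming the dataset's "document" configuration with an "abstract" column and the 512-token limit from the table above (the exact pipeline is not published):

from datasets import load_dataset
from transformers import AutoTokenizer

# Load the PubMed summarization corpus (config choice is an assumption).
dataset = load_dataset("ccdv/pubmed-summarization", "document", split="train")

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-125M")
tokenizer.pad_token = tokenizer.eos_token  # GPT-Neo ships without a pad token

def format_example(example):
    # Prefix each abstract with the documented prompt, then truncate to the
    # 512-token context window used during fine-tuning.
    text = "Medical report:\n" + example["abstract"]
    return tokenizer(text, truncation=True, max_length=512)

tokenized = dataset.map(format_example)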


Training Strategy

  • Base model weights frozen
  • LoRA adapters applied to attention layers (see the sketch below)
  • Prompt-based conditioning introduced
  • Controlled decoding parameters used during inference

This enables:

  • Efficient training
  • Low memory footprint
  • Domain-aligned generation

LoRA Paper: https://arxiv.org/abs/2106.09685
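
A minimal sketch of this setup with the peft library. Only r=16 and "attention layers" are documented; the target module names match GPT-Neo's attention projections in transformers, and lora_alpha / lora_dropout are illustrative assumptions:

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Load the base model; its weights stay frozen during LoRA training.
base_model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-125M")

# Rank 16 as documented above; other hyperparameters are assumptions.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "out_proj"],
    task_type="CAUSAL_LM",
)

# get_peft_model freezes the base weights and injects trainable adapters.
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapters are trainable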


Limitations

  • May generate plausible but incorrect medical statements
  • Not trained on clinical decision datasets
  • May struggle with rare diseases
  • No real-time knowledge updates

Users must verify outputs using trusted medical sources.


Ethical Considerations

Allowed Uses:

✔️ Research
✔️ Academic projects
✔️ NLP experimentation

Disallowed Uses:

❌ Clinical decision support
❌ Medical diagnosis
❌ Treatment planning
❌ Emergency healthcare guidance

This model does not replace medical professionals.


Future Work

  • Add ROUGE / BLEU evaluation
  • Compare against BioGPT / ClinicalT5
  • Improve safety alignment
  • Add hallucination detection layer
  • Extend to clinical-style datasets

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM

# Loading this repo may require the peft library to be installed if it
# hosts LoRA adapter weights rather than merged weights.
model = AutoModelForCausalLM.from_pretrained("omniamagdy/gptneo-medical-125m")
tokenizer = AutoTokenizer.from_pretrained("omniamagdy/gptneo-medical-125m")

prompt = "Medical report:\nExplain hypertension"

inputs = tokenizer(prompt, return_tensors="pt")

# Controlled decoding: do_sample=True is required for temperature and top_p
# to take effect; repetition_penalty curbs degenerate loops.
outputs = model.generate(
    **inputs,
    max_new_tokens=120,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
    repetition_penalty=1.2,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
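
If the checkpoint is published as a LoRA adapter rather than a merged model, it can also be loaded explicitly with peft; a sketch under that assumption:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the frozen base model, then attach the adapter weights on top.
base = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-125M")
model = PeftModel.from_pretrained(base, "omniamagdy/gptneo-medical-125m")
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-125M")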