GPT-Neo 125M Medical Instruction-Tuned Model

Model Overview

This model is an instruction-conditioned medical text generator built on top of EleutherAI/gpt-neo-125M.

It was fine-tuned using LoRA (Low-Rank Adaptation) with prompt-formatted biomedical abstracts from the PubMed Summarization dataset.

Unlike a standard fine-tune, this version was trained on structured prompts (each abstract prefixed with "Medical report:") to improve domain-specific generation quality.


Key Improvements

Compared to the base fine-tuned version:

  • Larger context window (512 tokens)
  • Instruction-style prompt formatting (see the sketch after this list)
  • Enhanced LoRA configuration (r=16)
  • Reduced hallucination via controlled decoding
  • Improved generation coherence for medical narratives
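
For reference, a minimal sketch of the prompt construction. The "Medical report:" prefix is documented in the Dataset section below; pairing it with a free-form instruction is an assumption for illustration, and build_prompt is a hypothetical helper:

# Hypothetical helper: builds an instruction-style prompt using the
# documented "Medical report:" prefix.
def build_prompt(instruction: str) -> str:
    return f"Medical report:\n{instruction}"

print(build_prompt("Explain hypertension"))
# Medical report:
# Explain hypertension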

Intended Use

This model is designed for:

  • Medical text generation
  • Biomedical explanation drafting
  • Research prototyping
  • Educational demonstrations
  • NLP experimentation in healthcare

⚠️ This model is NOT intended for clinical use.


Training Details

Item                 Value
Base Model           EleutherAI/gpt-neo-125M
Dataset              PubMed Summarization
Training Method      LoRA
Prompt Conditioning  Yes
Context Length       512 tokens
LoRA Rank (r)        16
Task                 Instruction-based medical text generation

Dataset

Training utilized biomedical abstracts from:

https://huggingface.co/datasets/ccdv/pubmed-summarization

Each abstract was prefixed with the following prompt before training:

Medical report:

This conditions the model for open-ended generation rather than summarization.
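
A minimal sketch of this preprocessing step, assuming the dataset's "document" configuration with an "abstract" column and the 512-token limit from the table above (the exact pipeline is not published):

from datasets import load_dataset
from transformers import AutoTokenizer

# Load the PubMed summarization corpus (config choice is an assumption).
dataset = load_dataset("ccdv/pubmed-summarization", "document", split="train")

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-125M")
tokenizer.pad_token = tokenizer.eos_token  # GPT-Neo ships without a pad token

def format_example(example):
    # Prefix each abstract with the documented prompt, then truncate to the
    # 512-token context window used during fine-tuning.
    text = "Medical report:\n" + example["abstract"]
    return tokenizer(text, truncation=True, max_length=512)

tokenized = dataset.map(format_example)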


Training Strategy

  • Base model weights frozen
  • LoRA adapters applied to attention layers (see the sketch below)
  • Prompt-based conditioning introduced
  • Controlled decoding parameters used during inference

This enables:

  • Efficient training
  • Low memory footprint
  • Domain-aligned generation

LoRA Paper: https://arxiv.org/abs/2106.09685
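
A minimal sketch of this setup with the peft library. Only r=16 and "attention layers" are documented; the target module names match GPT-Neo's attention projections in transformers, and lora_alpha / lora_dropout are illustrative assumptions:

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Load the base model; its weights stay frozen during LoRA training.
base_model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-125M")

# Rank 16 as documented above; other hyperparameters are assumptions.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "out_proj"],
    task_type="CAUSAL_LM",
)

# get_peft_model freezes the base weights and injects trainable adapters.
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapters are trainable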


Limitations

  • May generate plausible but incorrect medical statements
  • Not trained on clinical decision datasets
  • May struggle with rare diseases
  • No real-time knowledge updates

Users must verify outputs using trusted medical sources.


Ethical Considerations

Allowed Uses:

✔️ Research
✔️ Academic projects
✔️ NLP experimentation

Disallowed Uses:

❌ Clinical decision support
❌ Medical diagnosis
❌ Treatment planning
❌ Emergency healthcare guidance

This model does not replace medical professionals.


Future Work

  • Add ROUGE / BLEU evaluation
  • Compare against BioGPT / ClinicalT5
  • Improve safety alignment
  • Add hallucination detection layer
  • Extend to clinical-style datasets

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM

# Loading this repo may require the peft library to be installed if it
# hosts LoRA adapter weights rather than merged weights.
model = AutoModelForCausalLM.from_pretrained("omniamagdy/gptneo-medical-125m")
tokenizer = AutoTokenizer.from_pretrained("omniamagdy/gptneo-medical-125m")

prompt = "Medical report:\nExplain hypertension"

inputs = tokenizer(prompt, return_tensors="pt")

# Controlled decoding: do_sample=True is required for temperature and top_p
# to take effect; repetition_penalty curbs degenerate loops.
outputs = model.generate(
    **inputs,
    max_new_tokens=120,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
    repetition_penalty=1.2,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
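
If the checkpoint is published as a LoRA adapter rather than a merged model, it can also be loaded explicitly with peft; a sketch under that assumption:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the frozen base model, then attach the adapter weights on top.
base = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-125M")
model = PeftModel.from_pretrained(base, "omniamagdy/gptneo-medical-125m")
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-125M")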