Model Card for biogpt-bioqa-8bit-openvino

Model Description

This model, biogpt-bioqa-8bit-openvino, is an 8-bit quantized version of kirubel1738/biogpt-bioqa-lora-merged, optimized for efficient CPU inference with OpenVINO. It inherits the specialized biomedical question-answering capabilities of the original model while offering significantly faster inference and a smaller memory footprint.

The model combines the extensive biomedical knowledge of Microsoft's BioGPT-Large with targeted fine-tuning via Low-Rank Adaptation (LoRA) on a comprehensive biomedical QA dataset. The merged model was then quantized to 8 bits and compiled for the OpenVINO runtime, yielding a production-ready CPU deployment.
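The quantization and export step described above can be sketched with the optimum-intel API; the source checkpoint ID comes from this card, while the output directory and function name are illustrative:

```python
# Sketch of producing an 8-bit OpenVINO export with optimum-intel.
# Assumptions: optimum[openvino] is installed; the output path is illustrative.
def export_8bit_openvino(source_id: str, output_dir: str) -> None:
    # Imported lazily so the recipe can be read without optimum installed.
    from optimum.intel import OVModelForCausalLM, OVWeightQuantizationConfig

    model = OVModelForCausalLM.from_pretrained(
        source_id,
        export=True,  # convert the PyTorch checkpoint to OpenVINO IR
        quantization_config=OVWeightQuantizationConfig(bits=8),  # 8-bit weights
    )
    model.save_pretrained(output_dir)  # writes the IR files plus config

if __name__ == "__main__":
    export_8bit_openvino(
        "kirubel1738/biogpt-bioqa-lora-merged",
        "biogpt-bioqa-8bit-openvino",
    )
```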

  • Developed by: kirubel1738
  • Shared by: kirubel1738
  • Model type: Causal Language Model (Text Generation / Question Answering)
  • Language(s) (NLP): English
  • License: Apache 2.0
  • Finetuned from model: microsoft/BioGPT-Large
  • Quantized from model: kirubel1738/biogpt-bioqa-lora-merged

Uses

Direct Use

This model is optimized for direct use in biomedical question-answering applications where CPU inference is preferred or required. It's ideal for:

  • Biomedical research assistance and literature review
  • Educational tools for biology and medical students
  • Clinical decision support systems (with appropriate validation)
  • Bioinformatics pipelines requiring efficient text generation
  • Resource-constrained environments without GPU access

Downstream Use

The model can be integrated into:

  • Healthcare chatbots and virtual assistants
  • Scientific literature summarization systems
  • Drug discovery and pharmacological research tools
  • Academic research platforms
  • Biomedical tutoring systems

Out-of-Scope Use

  • Real-time clinical diagnosis without human oversight
  • Generating medical advice for individual patients
  • High-stakes decision making without verification
  • Non-biomedical domains
  • Tasks requiring extremely low latency (<100ms) on very old CPUs

Bias, Risks, and Limitations

Limitations:

  • 8-bit quantization may cause minor accuracy degradation compared to the original FP16 model
  • Model may generate verbose or overly technical responses for simple questions
  • Limited to biomedical domain knowledge; performance on general topics is reduced

Risks:

  • Potential for generating plausible but incorrect biomedical information
  • May reflect biases present in the training data
  • Should not be used as a sole source for medical decisions

Recommendations

Users should:

  • Verify critical biomedical information from authoritative sources
  • Use appropriate prompts and temperature settings for desired response style
  • Consider the quantized nature when evaluating response quality
  • Test performance on specific use cases before deployment
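The prompt and temperature recommendation above can be made concrete with a small helper that returns keyword arguments for model.generate(); the preset names and values are illustrative starting points, not tuned defaults:

```python
# Illustrative generation presets; the styles and values are assumptions
# to experiment with, not recommendations validated for this model.
def build_generation_kwargs(style: str = "factual") -> dict:
    """Return kwargs for model.generate(); style presets are examples."""
    presets = {
        # Low temperature keeps answers close to the training distribution.
        "factual": {"do_sample": True, "temperature": 0.2, "top_p": 0.9},
        # Higher temperature yields more varied, exploratory phrasing.
        "exploratory": {"do_sample": True, "temperature": 0.8, "top_p": 0.95},
    }
    kwargs = {"max_new_tokens": 150, "repetition_penalty": 1.1}
    kwargs.update(presets[style])
    return kwargs

# Usage: outputs = model.generate(**inputs, **build_generation_kwargs("factual"))
```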

How to Get Started with the Model

Installation

pip install "optimum[openvino]" transformers

Usage

from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer
import time

MODEL_ID = "kirubel1738/biogpt-bioqa-8bit-openvino"

# Load with lean configuration for minimal memory usage
lean_config = {
    "PERFORMANCE_HINT": "LATENCY",
    "ENABLE_MMAP": "YES",
    "CACHE_DIR": "",
}

model = OVModelForCausalLM.from_pretrained(
    MODEL_ID,
    ov_config=lean_config,
    use_cache=True
)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# Generate response
question = "What is the function of the p53 gene?"
inputs = tokenizer(question, return_tensors="pt")

start_time = time.time()
outputs = model.generate(**inputs, max_new_tokens=100)
end_time = time.time()

answer = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(f"Q: {question}")
print(f"A: {answer[len(question):].strip()}")
print(f"⏱️ Time taken: {end_time - start_time:.2f} seconds")
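Because a causal LM echoes the prompt, the example above slices the decoded text by the question's character length, which assumes the tokenizer round-trips the question exactly. A small helper (hypothetical, not part of any API) strips the echo more defensively:

```python
# Hypothetical helper: remove the echoed prompt from decoded output.
def extract_answer(full_text: str, question: str) -> str:
    """Return the generated continuation, dropping the echoed prompt if present."""
    if full_text.startswith(question):
        return full_text[len(question):].strip()
    # Fall back to the full text if the prompt was not echoed verbatim.
    return full_text.strip()

# Usage: print(f"A: {extract_answer(answer, question)}")
```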