PaliGemma Leprosy Detector v1

This model is a fine-tuned version of PaliGemma 3B (224px), specifically adapted for the binary classification of Leprosy (Hansen's Disease). It was trained using QLoRA to identify leprosy lesions from dermatological images while distinguishing them from common look-alike conditions.

Model Details

Model Description

The model leverages the multimodal capabilities of PaliGemma, which combines a SigLIP vision encoder with a Gemma-2B text decoder. Only the text decoder is fine-tuned, via LoRA adapters, while the vision tower stays frozen. The model therefore retains its general visual understanding while specializing its clinical reasoning for skin disease diagnosis.
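
The split between a frozen vision tower and a LoRA-adapted decoder can be illustrated with a simple name-based filter. This is a minimal sketch, not the training script used for this model: the regular expression, the `is_lora_target` helper, and the example module names are all assumptions, based on the common convention that PaliGemma's decoder modules sit under `language_model` and the SigLIP encoder under `vision_tower`.

```python
import re

# Hypothetical pattern: attention projections inside the text decoder only.
# Assumption: decoder modules contain "language_model" in their dotted name.
DECODER_ATTN = r".*language_model.*\.(q_proj|k_proj|v_proj|o_proj)"


def is_lora_target(module_name: str) -> bool:
    """Return True when a module name fully matches the decoder pattern."""
    return re.fullmatch(DECODER_ATTN, module_name) is not None


# Illustrative module names (assumed, not read from the checkpoint).
names = [
    "language_model.model.layers.0.self_attn.q_proj",
    "language_model.model.layers.0.mlp.up_proj",
    "vision_tower.vision_model.encoder.layers.0.self_attn.q_proj",
    "multi_modal_projector.linear",
]

# Only the decoder attention projection is selected; the vision tower
# and the projector are left untouched (frozen).
targets = [n for n in names if is_lora_target(n)]
print(targets)
```

A pattern of this kind could plausibly be supplied as `target_modules` in a `peft` `LoraConfig`, but the actual module names should first be checked with `model.named_modules()` on the loaded checkpoint.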

Developed by: barqawiz
Model type: Multimodal Vision-Language Model (VLM)
Task: Binary Image Classification (Leprosy vs. Confounders)
Finetuned from model: google/paligemma-3b-mix-224

Uses

Direct Use

This model is intended for research and educational purposes in the field of tropical dermatology. Users provide a skin lesion image together with the standard prompt, and the model responds with a "Yes" or "No" answer.

Prompt Template

For optimal results, use the exact prompt used during training: "<image>Task: Analyze this dermatological image. Is this Leprosy? Answer 'Yes' or 'No'."
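
Because the prompt asks for a one-word answer, downstream code usually needs to turn the generated text into a label. The sketch below shows one way to do that. The `PROMPT` constant repeats the template above; the `parse_answer` helper and its handling of punctuation and unrecognized output are assumptions for illustration, not part of the released model.

```python
from typing import Optional

# The exact training prompt from this model card.
PROMPT = "<image>Task: Analyze this dermatological image. Is this Leprosy? Answer 'Yes' or 'No'."


def parse_answer(generated: str) -> Optional[bool]:
    """Map generated text to True (Yes), False (No), or None (unrecognized)."""
    words = generated.lower().split()
    if not words:
        return None
    # Drop quotes and trailing punctuation around the first word.
    first = words[0].strip("'\".,!")
    if first == "yes":
        return True
    if first == "no":
        return False
    return None


print(parse_answer("Yes"))
print(parse_answer(" no.\n"))
print(parse_answer("unclear"))
```

Returning `None` for anything other than a clear "Yes" or "No" keeps unexpected generations visible instead of silently counting them as negatives.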

How to Get Started with the Model

Use the code below to run inference in a Kaggle or Colab environment.

import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForImageTextToText, BitsAndBytesConfig
from peft import PeftModel

# 1. Load Model and Processor
model_id = "google/paligemma-3b-mix-224"
adapter_id = "barqawiz/paligemma-leprosy-detector-v1"

processor = AutoProcessor.from_pretrained(model_id)
base_model = AutoModelForImageTextToText.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
    # 4-bit loading requires the bitsandbytes package
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.float16,
    ),
)
model = PeftModel.from_pretrained(base_model, adapter_id)

# 2. Prepare Input
image = Image.open("lesion_sample.jpg").convert("RGB")
prompt = "<image>Task: Analyze this dermatological image. Is this Leprosy? Answer 'Yes' or 'No'."

# Note: PaliGemma processor uses the 'suffix' parameter for training, 
# but for inference, we pass the text as 'text'
inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)
prompt_len = inputs["input_ids"].shape[-1]  # number of prompt tokens (image + text)

# 3. Generate
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=15)

# Decode only the newly generated tokens instead of stripping the prompt from the decoded text
clean_result = processor.decode(output[0][prompt_len:], skip_special_tokens=True).strip()
print(f"Is it Leprosy? {clean_result}")
