Qwen2.5-VL-7B — Cell Captioning
Qwen2.5-VL-7B-Instruct fine-tuned to generate biomedical descriptions of cells in fluorescence microscopy images. It is part of the biomech-inference-serving pipeline, an internal research project.
Training
| Item | Details |
|---|---|
| Base model | Qwen/Qwen2.5-VL-7B-Instruct |
| Training data | DnaRnaProteins/cell_seg_labeled |
| Fine-tuning | QLoRA (4-bit, PEFT) via TRL SFTTrainer |
| Evaluation | ROUGE-L on validation split before push |
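For readers unfamiliar with the evaluation metric, ROUGE-L scores a generated caption against a reference by the length of their longest common subsequence (LCS) of tokens. The sketch below is a minimal pure-Python illustration, not the evaluation code used for this model; it assumes lowercase whitespace tokenization, whereas library implementations typically apply their own tokenizer and optional stemming, so scores will differ. The function names `lcs_length` and `rouge_l_f1` are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)  # DP row for the previous token of `a`
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference: str, candidate: str) -> float:
    """ROUGE-L F1 between two strings (assumed: lowercase whitespace tokens)."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)
```

Averaging `rouge_l_f1` over (reference, generated) caption pairs from a validation split would give a single score comparable across checkpoints, as long as the tokenization is kept fixed.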
Usage
import torch
from PIL import Image
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration
from qwen_vl_utils import process_vision_info

model_id = "DnaRnaProteins/qwen2.5-vl-7b-cells-cap"
processor = AutoProcessor.from_pretrained(model_id)
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

image = Image.open("cell_image.png").convert("RGB")
messages = [{
    "role": "user",
    "content": [
        {"type": "image", "image": image},
        {"type": "text", "text": (
            "You are a biomedical imaging expert. Describe what you observe in this "
            "microscopy image of cells. Include cell morphology, density, any visible "
            "structures, and any notable features relevant to biomechanics analysis."
        )},
    ],
}]

# Render the chat prompt and collect the image inputs referenced in `messages`.
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
image_inputs, _ = process_vision_info(messages)
inputs = processor(text=[text], images=image_inputs, return_tensors="pt").to(model.device)

with torch.inference_mode():
    out = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens, skipping the prompt.
caption = processor.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
Via Modal endpoint
import base64

import modal

caption_fn = modal.Function.from_name("biomech-inference-serving", "caption")

# The endpoint takes the image file as a base64-encoded string.
with open("cell_image.png", "rb") as f:
    b64 = base64.b64encode(f.read()).decode()
result = caption_fn.remote(b64)
# {"caption": "The image shows densely packed epithelial cells..."}
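The server side of the endpoint is not shown in this card. As a rough sketch of the payload handling around the call above, the helpers below encode raw image bytes, decode them back, and pull the caption out of the response. They assume the endpoint expects standard base64 of the raw file bytes and returns a dict with a `"caption"` string, as the example comment suggests; the helper names are hypothetical and not part of the deployed app.

```python
import base64


def encode_image_bytes(raw: bytes) -> str:
    """Encode raw image file bytes as an ASCII base64 string for the request."""
    return base64.b64encode(raw).decode("ascii")


def decode_image_bytes(b64: str) -> bytes:
    """Decode a base64 payload back to raw bytes, rejecting invalid characters."""
    return base64.b64decode(b64, validate=True)


def extract_caption(result: dict) -> str:
    """Return the caption from a response dict (assumed shape: {"caption": str})."""
    caption = result.get("caption")
    if not isinstance(caption, str):
        raise ValueError("response has no string 'caption' field")
    return caption.strip()
```

With these in place, `extract_caption(caption_fn.remote(encode_image_bytes(raw)))` would fail loudly on an unexpected response shape instead of passing a malformed result downstream.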
Limitations
- Descriptions are intended as a research aid, not clinical guidance.
- Trained on fluorescence cell images; other imaging modalities are out-of-distribution.