Model Overview
MedSigLIP is a variant of SigLIP (Sigmoid Loss for Language Image Pre-training) that is trained to encode medical images and text into a common embedding space. Developers can use MedSigLIP to accelerate building healthcare-based AI applications. MedSigLIP contains a 400M parameter vision encoder and 400M parameter text encoder, it supports 448x448 image resolution with up to 64 text tokens.
MedSigLIP was trained on a variety of de-identified medical image and text pairs, including chest X-rays, dermatology images, ophthalmology images, histopathology slides, and slices of CT and MRI volumes, along with associated descriptions or reports. This training data was combined with natural (non-medical) image and text pairs to retain MedSigLIP's ability to parse natural images.
MedSigLIP is recommended for medical image interpretation applications without a need for text generation, such as data-efficient classification, zero-shot classification, and semantic image retrieval. For medical applications that require text generation, MedGemma is recommended.
Links
- [MedSigLIP Quickstart Notebook](coming soon..!)
- MedSigLIP API Documentation
- MedSigLIPModel Card
- KerasHub Beginner Guide
- KerasHub Model Publishing Guide
Installation
Keras and KerasHub can be installed with:
pip install -U -q keras-hub
pip install -U -q keras
JAX, TensorFlow, and Torch come pre-installed in Kaggle Notebooks. For instructions on installing them in another environment, see the Keras Getting Started page.
Presets
The following model checkpoints are provided by the Keras team. For each of the presets, we provide code examples in the tab below.
| Preset name | Parameters | Description |
|---|---|---|
medsiglip_900m_448 |
900M | 0.9B billion parameter, MedSigLIP contains a 400M parameter vision encoder and 400M parameter text encoder, it supports 448x448 image resolution with up to 64 text tokens. |
- Downloads last month
- -