---
license: apache-2.0
library_name: peft
pipeline_tag: image-to-text
base_model:
- lmsys/vicuna-7b-v1.5
- liuhaotian/llava-v1.5-7b
tags:
- medical-imaging
- chest-xray
- dermoscopy
- vision-language
- fairness
- lora
- peft
- mimic-cxr
- padchest
- ham10000
datasets:
- physionet/mimic-cxr-jpg
---
# FairLLaVA — Pretrained Checkpoints
Fairness-aware LoRA adapters for medical vision–language models, from the
[FairLLaVA paper](https://arxiv.org/abs/2603.26008).
- **Code**: [github.com/bhosalems/FairLLaVA](https://github.com/bhosalems/FairLLaVA)
- **Paper**: [arxiv.org/abs/2603.26008](https://arxiv.org/abs/2603.26008)

FairLLaVA minimizes the mutual information between the model's visual
features and patient demographic attributes (age, sex, race),
producing demographic-invariant representations while preserving clinical
accuracy. The adapters here plug into a standard LoRA fine-tuning loop and
are released for three medical benchmarks.
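
To make the quantity being minimized concrete, the following is a minimal sketch of a plug-in mutual information estimate between a discretized feature and a demographic group label. It is an illustration only: this README does not describe the estimator or loss used in the paper, and the function name and toy data below are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in nats from paired discrete samples."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and the same length")
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (count / n) * math.log(count * n / (px[x] * py[y]))
    return mi


# Toy data (hypothetical): a binarized feature and a binary group label.
group = [0, 0, 1, 1]
leaky_feature = [0, 0, 1, 1]      # fully determined by the group
invariant_feature = [0, 1, 0, 1]  # independent of the group

print(mutual_information(leaky_feature, group))      # ~0.693 (ln 2)
print(mutual_information(invariant_feature, group))  # 0.0
```

A feature that reveals the group carries ln 2 nats about a balanced binary attribute, while an invariant feature carries none. Driving this quantity toward zero is what "demographic-invariant" means in the paragraph above.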
## Checkpoints
| Subdir | Dataset | Base LLM | Vision Tower | Task |
|---|---|---|---|---|
| [`mimic-cxr/`](https://huggingface.co/mbhosale/FairLLaVA/tree/main/mimic-cxr) | MIMIC-CXR | `lmsys/vicuna-7b-v1.5` | BiomedCLIP-CXR-518 | Chest X-ray report generation |
| [`padchest/`](https://huggingface.co/mbhosale/FairLLaVA/tree/main/padchest) | PadChest | `lmsys/vicuna-7b-v1.5` | BiomedCLIP-CXR-518 | Chest X-ray report generation |
| [`ham10000/`](https://huggingface.co/mbhosale/FairLLaVA/tree/main/ham10000) | HAM10000 | `liuhaotian/llava-v1.5-7b` | CLIP ViT-L/14-336 | Dermoscopy VQA |

Each subdirectory contains the LoRA adapter files (`adapter_model.safetensors`,
`adapter_config.json`, `non_lora_trainables.bin`), the matching multimodal
projector (`mm_projector.bin`), and the tokenizer files, so its path can be
passed directly as `model_path` to
`llava.model.builder.load_pretrained_model`.
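
Before loading, it can help to confirm that a downloaded subdirectory is complete. The sketch below checks for the four files named above; the helper `missing_checkpoint_files` is hypothetical and not part of the FairLLaVA code, and the tokenizer files are skipped because this README does not enumerate them.

```python
from pathlib import Path

# File names taken from the list above. Tokenizer files are not enumerated
# in this README, so they are not checked here.
EXPECTED_FILES = (
    "adapter_model.safetensors",
    "adapter_config.json",
    "non_lora_trainables.bin",
    "mm_projector.bin",
)


def missing_checkpoint_files(checkpoint_dir):
    """Return the expected file names that are absent from checkpoint_dir."""
    root = Path(checkpoint_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

For a complete download, `missing_checkpoint_files("path/to/mimic-cxr")` should return an empty list; any names it does return point to a partial or filtered download.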
## Quick start
```python
from huggingface_hub import snapshot_download
from llava.model.builder import load_pretrained_model

# Download just one dataset's checkpoint
local_dir = snapshot_download(
    repo_id="mbhosale/FairLLaVA",
    allow_patterns="mimic-cxr/*",
)
model_path = f"{local_dir}/mimic-cxr"

# Load the adapter on top of its base LLM
tokenizer, model, image_processor, ctx_len = load_pretrained_model(
    model_path,
    model_base="lmsys/vicuna-7b-v1.5",
    model_name="llavarad",
)
```
```
See the full inference example in [`inference.py`](https://github.com/bhosalems/FairLLaVA/blob/main/inference.py).
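
The quick start covers only `mimic-cxr/`. As a rough sketch, the other checkpoints can be addressed the same way by pairing each subdirectory with the base model from the table. This assumes the "Base LLM" column is the correct `model_base` for `padchest/` and `ham10000/`, which this README does not state explicitly. The helper `download_spec` is hypothetical, and `model_name` is left out because it is only documented for `mimic-cxr/`.

```python
# Subdirectory -> base model, copied from the "Base LLM" column of the table.
BASE_MODELS = {
    "mimic-cxr": "lmsys/vicuna-7b-v1.5",
    "padchest": "lmsys/vicuna-7b-v1.5",
    "ham10000": "liuhaotian/llava-v1.5-7b",
}


def download_spec(subdir):
    """Return download and loading arguments for one checkpoint subdirectory."""
    if subdir not in BASE_MODELS:
        raise KeyError(
            f"unknown checkpoint {subdir!r}; expected one of {sorted(BASE_MODELS)}"
        )
    return {
        "repo_id": "mbhosale/FairLLaVA",
        "allow_patterns": f"{subdir}/*",   # fetch only this subdirectory
        "model_base": BASE_MODELS[subdir],  # assumed from the table
    }


spec = download_spec("ham10000")
print(spec["allow_patterns"], spec["model_base"])
```

The `repo_id` and `allow_patterns` entries are meant for `snapshot_download`, and `model_base` for `load_pretrained_model`, mirroring the quick start.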
## Ethics
These checkpoints are released **for research and educational use only**.
They are **not** approved or validated for clinical or diagnostic use and
must not be used to make medical decisions or to inform patient care. Each
downstream dataset is governed by its own data-use agreement (PhysioNet for
MIMIC-CXR, BIMCV for PadChest, ISIC / Harvard Dataverse for HAM10000).
## Citation
```bibtex
@article{bhosale2026fairllava,
title={FairLLaVA: Fairness-Aware Parameter-Efficient Fine-Tuning for Large Vision-Language Assistants},
author={Bhosale, Mahesh and Wasi, Abdul and Srivastava, Shantam and Latif, Shifa and Luan, Tianyu and Gao, Mingchen and Doermann, David and Gong, Xuan},
journal={arXiv preprint arXiv:2603.26008},
year={2026}
}
```