---
license: apache-2.0
library_name: peft
pipeline_tag: image-to-text
base_model:
- lmsys/vicuna-7b-v1.5
- liuhaotian/llava-v1.5-7b
tags:
- medical-imaging
- chest-xray
- dermoscopy
- vision-language
- fairness
- lora
- peft
- mimic-cxr
- padchest
- ham10000
datasets:
- physionet/mimic-cxr-jpg
---

# FairLLaVA: Pretrained Checkpoints

Fairness-aware LoRA adapters for medical vision-language models, from the
[FairLLaVA paper](https://arxiv.org/abs/2603.26008).

- **Code**: [github.com/bhosalems/FairLLaVA](https://github.com/bhosalems/FairLLaVA)
- **Paper**: [arxiv.org/abs/2603.26008](https://arxiv.org/abs/2603.26008)

FairLLaVA minimizes the mutual information between the model's visual
features and patient demographic attributes (age, sex, race),
producing demographic-invariant representations while preserving clinical
accuracy. The method plugs into a standard LoRA fine-tuning loop, and adapters
are released for three medical benchmarks: MIMIC-CXR, PadChest, and HAM10000.
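
The quantity being minimized can be made concrete with a small worked example. The sketch below computes the count-based (plug-in) estimate of mutual information, I(X; Y) = Σ p(x, y) log[p(x, y) / (p(x) p(y))], between a discretized feature and a demographic label. It only illustrates what mutual information measures: the function name `mutual_information` and the toy data are hypothetical, and the estimator and training objective that FairLLaVA actually uses are defined in the paper and repository, not here.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in nats for two equal-length discrete sequences."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one sample")

    joint = Counter(zip(xs, ys))  # counts of (x, y) pairs
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y

    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) p(y))), written in terms of counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


# Toy data (hypothetical): a binarized feature and a sex label.
leaky = mutual_information([0, 0, 1, 1], ["F", "F", "M", "M"])
invariant = mutual_information([0, 0, 1, 1], ["F", "M", "F", "M"])
print(leaky)      # about 0.693 (ln 2): the feature fully reveals the label
print(invariant)  # 0.0: the feature carries no information about the label
```

A demographic-invariant representation corresponds to the second case, where knowing the feature tells you nothing about the attribute.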

## Checkpoints

| Subdir | Dataset | Base LLM | Vision Tower | Task |
|---|---|---|---|---|
| [`mimic-cxr/`](https://huggingface.co/mbhosale/FairLLaVA/tree/main/mimic-cxr) | MIMIC-CXR | `lmsys/vicuna-7b-v1.5` | BiomedCLIP-CXR-518 | Chest X-ray report generation |
| [`padchest/`](./padchest) | PadChest | `lmsys/vicuna-7b-v1.5` | BiomedCLIP-CXR-518 | Chest X-ray report generation |
| [`ham10000/`](./ham10000) | HAM10000 | `liuhaotian/llava-v1.5-7b` | CLIP ViT-L/14-336 | Dermoscopy VQA |

Each subdirectory contains the LoRA adapter files (`adapter_model.safetensors`,
`adapter_config.json`, `non_lora_trainables.bin`), the matching multimodal
projector (`mm_projector.bin`), and the tokenizer files, so its path can be
passed directly as `model_path` to
`llava.model.builder.load_pretrained_model`.
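
Before loading, it can be useful to confirm that a downloaded subdirectory is complete. The sketch below checks only for the four files named above. The helper name `missing_checkpoint_files` is hypothetical, and tokenizer files are not checked because their exact names are not listed here.

```python
from pathlib import Path

# File names taken from the description above; tokenizer files are omitted
# because their exact names are not specified.
EXPECTED_FILES = (
    "adapter_model.safetensors",
    "adapter_config.json",
    "non_lora_trainables.bin",
    "mm_projector.bin",
)


def missing_checkpoint_files(checkpoint_dir):
    """Return the expected file names that are absent from checkpoint_dir."""
    root = Path(checkpoint_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


# Example usage (the path is a placeholder for a downloaded subdirectory):
# missing = missing_checkpoint_files("/path/to/snapshot/mimic-cxr")
# if missing:
#     raise FileNotFoundError(f"Incomplete checkpoint, missing: {missing}")
```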

## Quick start

```python
from huggingface_hub import snapshot_download
from llava.model.builder import load_pretrained_model

# Download just one dataset's checkpoint
local_dir = snapshot_download(
    repo_id="mbhosale/FairLLaVA",
    allow_patterns="mimic-cxr/*",
)
model_path = f"{local_dir}/mimic-cxr"

tokenizer, model, image_processor, ctx_len = load_pretrained_model(
    model_path,
    model_base="lmsys/vicuna-7b-v1.5",
    model_name="llavarad",
)
```

See the full inference example in [`inference.py`](https://github.com/bhosalems/FairLLaVA/blob/main/inference.py).
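
The quick start hard-codes the MIMIC-CXR checkpoint. A small lookup built from the Checkpoints table can keep the download pattern and base model in sync when switching datasets. This is a sketch under assumptions: the helper name `checkpoint_spec` is hypothetical, and using the table's "Base LLM" entry as `model_base` is confirmed above only for `mimic-cxr`. The appropriate `model_name` for the other subdirectories is not stated here, so it is left out.

```python
# Base model per subdirectory, copied from the Checkpoints table.
BASE_MODELS = {
    "mimic-cxr": "lmsys/vicuna-7b-v1.5",
    "padchest": "lmsys/vicuna-7b-v1.5",
    "ham10000": "liuhaotian/llava-v1.5-7b",
}


def checkpoint_spec(subdir):
    """Return the download pattern and base model for one checkpoint subdirectory."""
    if subdir not in BASE_MODELS:
        raise KeyError(f"unknown checkpoint {subdir!r}; expected one of {sorted(BASE_MODELS)}")
    return {
        "allow_patterns": f"{subdir}/*",     # value for snapshot_download(allow_patterns=...)
        "model_base": BASE_MODELS[subdir],   # assumed value for load_pretrained_model(model_base=...)
    }


spec = checkpoint_spec("padchest")
print(spec["allow_patterns"], spec["model_base"])
```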

## Ethics

These checkpoints are released **for research and educational use only**.
They are **not** approved or validated for clinical or diagnostic use and
must not be used to make medical decisions or to inform patient care. Each
downstream dataset is governed by its own data-use agreement (PhysioNet for
MIMIC-CXR, BIMCV for PadChest, ISIC / Harvard Dataverse for HAM10000).

## Citation

```bibtex
@article{bhosale2026fairllava,
  title={FairLLaVA: Fairness-Aware Parameter-Efficient Fine-Tuning for Large Vision-Language Assistants},
  author={Bhosale, Mahesh and Wasi, Abdul and Srivastava, Shantam and Latif, Shifa and Luan, Tianyu and Gao, Mingchen and Doermann, David and Gong, Xuan},
  journal={arXiv preprint arXiv:2603.26008},
  year={2026}
}
```