---
license: apache-2.0
library_name: peft
pipeline_tag: image-to-text
base_model:
- lmsys/vicuna-7b-v1.5
- liuhaotian/llava-v1.5-7b
tags:
- medical-imaging
- chest-xray
- dermoscopy
- vision-language
- fairness
- lora
- peft
- mimic-cxr
- padchest
- ham10000
datasets:
- physionet/mimic-cxr-jpg
---

# FairLLaVA — Pretrained Checkpoints

Fairness-aware LoRA adapters for medical vision–language models, from the [FairLLaVA paper](https://arxiv.org/abs/2603.26008).

- **Code**: [github.com/bhosalems/FairLLaVA](https://github.com/bhosalems/FairLLaVA)
- **Paper**: [arxiv.org/abs/2603.26008](https://arxiv.org/abs/2603.26008)

FairLLaVA minimizes the mutual information between the model's visual features and patient demographic attributes (age, sex, race), producing demographic-invariant representations while preserving clinical accuracy. The adapters here plug into a standard LoRA fine-tuning loop and are released on three medical benchmarks.

## Checkpoints

| Subdir | Dataset | Base LLM | Vision Tower | Task |
|---|---|---|---|---|
| [`mimic-cxr/`](https://huggingface.co/mbhosale/FairLLaVA/tree/main/mimic-cxr) | MIMIC-CXR | `lmsys/vicuna-7b-v1.5` | BiomedCLIP-CXR-518 | Chest X-ray report generation |
| [`padchest/`](./padchest) | PadChest | `lmsys/vicuna-7b-v1.5` | BiomedCLIP-CXR-518 | Chest X-ray report generation |
| [`ham10000/`](./ham10000) | HAM10000 | `liuhaotian/llava-v1.5-7b` | CLIP ViT-L/14-336 | Dermoscopy VQA |

Each subdirectory contains the LoRA adapter (`adapter_model.safetensors`, `adapter_config.json`, `non_lora_trainables.bin`), the matching multimodal projector (`mm_projector.bin`), and the tokenizer files, so the path can be loaded as `model_path` directly by `llava.model.builder.load_pretrained_model`.

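
Since each subdirectory is expected to hold the files listed above, a downloaded copy can be sanity-checked before it is handed to the loader. The following is a minimal sketch, not part of the FairLLaVA codebase: the helper name `missing_checkpoint_files` and the constant `EXPECTED_FILES` are hypothetical, and only the four explicitly named files are checked because the tokenizer files are not enumerated here.

```python
from pathlib import Path

# File names taken from the checkpoint description above.
# Tokenizer files are not enumerated there, so they are not checked here.
EXPECTED_FILES = (
    "adapter_model.safetensors",
    "adapter_config.json",
    "non_lora_trainables.bin",
    "mm_projector.bin",
)


def missing_checkpoint_files(checkpoint_dir):
    """Return the expected file names that are absent from checkpoint_dir.

    Hypothetical helper for illustration; an empty list means every
    explicitly named file is present.
    """
    root = Path(checkpoint_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

Calling `missing_checkpoint_files` on a local checkpoint subdirectory (for example the `mimic-cxr` folder of a downloaded snapshot) should return an empty list when the download is complete; any names it returns point to files that still need to be fetched.
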
## Quick start

```python
from huggingface_hub import snapshot_download
from llava.model.builder import load_pretrained_model

# Download just one dataset's checkpoint
local_dir = snapshot_download(
    repo_id="mbhosale/FairLLaVA",
    allow_patterns="mimic-cxr/*",
)
model_path = f"{local_dir}/mimic-cxr"

tokenizer, model, image_processor, ctx_len = load_pretrained_model(
    model_path,
    model_base="lmsys/vicuna-7b-v1.5",
    model_name="llavarad",
)
```

See the full inference example in [`inference.py`](https://github.com/bhosalems/FairLLaVA/blob/main/inference.py).

## Ethics

These checkpoints are released **for research and educational use only**. They are **not** approved or validated for clinical or diagnostic use and must not be used to make medical decisions or to inform patient care. Each downstream dataset is governed by its own data-use agreement (PhysioNet for MIMIC-CXR, BIMCV for PadChest, ISIC / Harvard Dataverse for HAM10000).

## Citation

```bibtex
@article{bhosale2026fairllava,
  title={FairLLaVA: Fairness-Aware Parameter-Efficient Fine-Tuning for Large Vision-Language Assistants},
  author={Bhosale, Mahesh and Wasi, Abdul and Srivastava, Shantam and Latif, Shifa and Luan, Tianyu and Gao, Mingchen and Doermann, David and Gong, Xuan},
  journal={arXiv preprint arXiv:2603.26008},
  year={2026}
}
```