mbhosale/FairLLaVA

@@ -1,16 +1,107 @@
 ---
 license: apache-2.0
-tags: [medical-imaging, fairness, lora]
 ---
 # FairLLaVA — Pretrained Checkpoints
 Fairness-aware LoRA adapters for medical vision–language models, from the
 [FairLLaVA paper](https://arxiv.org/abs/2603.26008).
-Code: https://github.com/bhosalems/FairLLaVA
-| Subdir | Dataset | Base LLM | Vision Tower |
-|---|---|---|---|
-| [`mimic-cxr/`](./mimic-cxr) | MIMIC-CXR | `lmsys/vicuna-7b-v1.5` | BiomedCLIP-CXR-518 |
-| [`padchest/`](./padchest)   | PadChest  | `lmsys/vicuna-7b-v1.5` | BiomedCLIP-CXR-518 |
-| [`ham10000/`](./ham10000)   | HAM10000  | `liuhaotian/llava-v1.5-7b` | CLIP ViT-L/14-336 |

 ---
 license: apache-2.0
+library_name: peft
+pipeline_tag: image-to-text
+base_model:
+  - lmsys/vicuna-7b-v1.5
+  - liuhaotian/llava-v1.5-7b
+tags:
+  - medical-imaging
+  - chest-xray
+  - dermoscopy
+  - vision-language
+  - fairness
+  - lora
+  - peft
+  - mimic-cxr
+  - padchest
+  - ham10000
+datasets:
+  - physionet/mimic-cxr-jpg
 ---
 # FairLLaVA — Pretrained Checkpoints
 Fairness-aware LoRA adapters for medical vision–language models, from the
 [FairLLaVA paper](https://arxiv.org/abs/2603.26008).
+- **Code**: [github.com/bhosalems/FairLLaVA](https://github.com/bhosalems/FairLLaVA)
+- **Paper**: [arxiv.org/abs/2603.26008](https://arxiv.org/abs/2603.26008)
+FairLLaVA minimizes the mutual information between the model's visual
+features and patient demographic attributes (age, sex, race / skin type),
+producing demographic-invariant representations while preserving clinical
+accuracy. The method plugs into a standard LoRA fine-tuning loop, and the
+resulting adapters are released for three medical benchmarks.
+## Checkpoints
+| Subdir | Dataset | Base LLM | Vision Tower | Task |
+|---|---|---|---|---|
+| [`mimic-cxr/`](./mimic-cxr) | MIMIC-CXR | `lmsys/vicuna-7b-v1.5`     | BiomedCLIP-CXR-518     | Chest X-ray report generation |
+| [`padchest/`](./padchest)   | PadChest  | `lmsys/vicuna-7b-v1.5`     | BiomedCLIP-CXR-518     | Chest X-ray report generation |
+| [`ham10000/`](./ham10000)   | HAM10000  | `liuhaotian/llava-v1.5-7b` | CLIP ViT-L/14-336      | Dermoscopy VQA |
+Each subdirectory contains the LoRA adapter (`adapter_model.safetensors`,
+`adapter_config.json`, `non_lora_trainables.bin`), the matching multimodal
+projector (`mm_projector.bin`), and the tokenizer files, so its path can be
+passed directly as `model_path` to
+`llava.model.builder.load_pretrained_model`.
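Before loading, it may help to confirm that a downloaded subdirectory actually contains the files listed above. The following is a minimal sketch, not part of the repository: `EXPECTED_FILES` and `missing_checkpoint_files` are hypothetical names, and the tokenizer files are left out because their exact names are not given here.

```python
from pathlib import Path

# File names copied from the checkpoint description; tokenizer files are
# omitted because their exact names are not listed there.
EXPECTED_FILES = (
    "adapter_model.safetensors",
    "adapter_config.json",
    "non_lora_trainables.bin",
    "mm_projector.bin",
)


def missing_checkpoint_files(model_path):
    """Return the expected file names that are absent from model_path."""
    root = Path(model_path)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

An empty list means every listed file is present; a non-empty list usually points to an `allow_patterns` filter that skipped part of the subdirectory or to an interrupted download.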
+## Quick start
+```python
+from huggingface_hub import snapshot_download
+from llava.model.builder import load_pretrained_model
+# Download just one dataset's checkpoint
+local_dir = snapshot_download(
+    repo_id="mbhosale/FairLLaVA",
+    allow_patterns="mimic-cxr/*",
+)
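+# The snapshot keeps the repo layout, so the checkpoint sits in a subfolder
+model_path = f"{local_dir}/mimic-cxr"
+# Load the adapter on top of the base LLM listed for MIMIC-CXR in the table
+tokenizer, model, image_processor, ctx_len = load_pretrained_model(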
+    model_path,
+    model_base="lmsys/vicuna-7b-v1.5",
+    model_name="llavarad",
+)
+```
+See the full inference example in [`inference.py`](https://github.com/bhosalems/FairLLaVA/blob/main/inference.py).
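The quick start covers `mimic-cxr` only. For the other subdirectories, the base LLM has to match the Checkpoints table. A minimal sketch of that lookup follows; `BASE_LLM` and `download_args` are hypothetical helpers, not part of the repository, and the mapping is copied from the table.

```python
# Subdirectory -> base LLM, copied from the Checkpoints table.
BASE_LLM = {
    "mimic-cxr": "lmsys/vicuna-7b-v1.5",
    "padchest": "lmsys/vicuna-7b-v1.5",
    "ham10000": "liuhaotian/llava-v1.5-7b",
}


def download_args(subdir, repo_id="mbhosale/FairLLaVA"):
    """Return (snapshot_download kwargs, base LLM id) for one checkpoint."""
    if subdir not in BASE_LLM:
        raise KeyError(f"unknown checkpoint subdirectory: {subdir!r}")
    kwargs = {"repo_id": repo_id, "allow_patterns": f"{subdir}/*"}
    return kwargs, BASE_LLM[subdir]
```

The returned `kwargs` can be passed as `snapshot_download(**kwargs)`, and the second value as `model_base`. The `model_name` argument for checkpoints other than `mimic-cxr` is not stated here, so the sketch leaves it out.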
+## Ethics
+These checkpoints are released **for research and educational use only**.
+They are **not** approved or validated for clinical or diagnostic use and
+must not be used to make medical decisions or to inform patient care. Each
+downstream dataset is governed by its own data-use agreement (PhysioNet for
+MIMIC-CXR, BIMCV for PadChest, ISIC / Harvard Dataverse for HAM10000).
+## Citation
+```bibtex
+@misc{bhosale2026fairllava,
+  title={FairLLaVA: Fairness-Aware Parameter-Efficient Fine-Tuning for Large Vision-Language Assistants},
+  author={Mahesh Bhosale and Abdul Wasi and Shantam Srivastava and Shifa Latif and Tianyu Luan and Mingchen Gao and David Doermann and Xuan Gong},
+  year={2026},
+  eprint={2603.26008},
+  archivePrefix={arXiv},
+  primaryClass={cs.CV},
+  url={https://arxiv.org/abs/2603.26008}
+}
+@article{ZambranoChaves2025,
+  title={A clinically accessible small multimodal radiology model and evaluation metric for chest X-ray findings},
+  author={Zambrano Chaves, Juan Manuel and others},
+  journal={Nature Communications},
+  year={2025},
+  volume={16},
+  pages={3108},
+  doi={10.1038/s41467-025-58344-x}
+}
+@misc{liu2023improvedllava,
+  title={Improved Baselines with Visual Instruction Tuning},
+  author={Liu, Haotian and Li, Chunyuan and Li, Yuheng and Lee, Yong Jae},
+  year={2023},
+  eprint={2310.03744},
+  archivePrefix={arXiv}
+}
+```