mbhosale committed on
Commit 0a7ec01 · verified · 1 Parent(s): bbb371d

Add top-level model card

Files changed (1)
  1. README.md +98 -7
README.md CHANGED
@@ -1,16 +1,107 @@
  ---
  license: apache-2.0
- tags: [medical-imaging, fairness, lora]
  ---

  # FairLLaVA — Pretrained Checkpoints

  Fairness-aware LoRA adapters for medical vision–language models, from the
  [FairLLaVA paper](https://arxiv.org/abs/2603.26008).
- Code: https://github.com/bhosalems/FairLLaVA

- | Subdir | Dataset | Base LLM | Vision Tower |
- |---|---|---|---|
- | [`mimic-cxr/`](./mimic-cxr) | MIMIC-CXR | `lmsys/vicuna-7b-v1.5` | BiomedCLIP-CXR-518 |
- | [`padchest/`](./padchest) | PadChest | `lmsys/vicuna-7b-v1.5` | BiomedCLIP-CXR-518 |
- | [`ham10000/`](./ham10000) | HAM10000 | `liuhaotian/llava-v1.5-7b` | CLIP ViT-L/14-336 |
  ---
  license: apache-2.0
+ library_name: peft
+ pipeline_tag: image-to-text
+ base_model:
+ - lmsys/vicuna-7b-v1.5
+ - liuhaotian/llava-v1.5-7b
+ tags:
+ - medical-imaging
+ - chest-xray
+ - dermoscopy
+ - vision-language
+ - fairness
+ - lora
+ - peft
+ - mimic-cxr
+ - padchest
+ - ham10000
+ datasets:
+ - physionet/mimic-cxr-jpg
  ---

  # FairLLaVA — Pretrained Checkpoints

  Fairness-aware LoRA adapters for medical vision–language models, from the
  [FairLLaVA paper](https://arxiv.org/abs/2603.26008).

+ - **Code**: [github.com/bhosalems/FairLLaVA](https://github.com/bhosalems/FairLLaVA)
+ - **Paper**: [arxiv.org/abs/2603.26008](https://arxiv.org/abs/2603.26008)
+
+ FairLLaVA minimizes the mutual information between the model's visual
+ features and patient demographic attributes (age, sex, race / skin type),
+ producing demographic-invariant representations while preserving clinical
+ accuracy. The adapters here plug into a standard LoRA fine-tuning loop and
+ are released on three medical benchmarks.
+
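Reviewer note: the added paragraph above names mutual information as the quantity being minimized but does not define it. The sketch below only illustrates that quantity on toy discrete data. It is not the paper's estimator: FairLLaVA works on continuous visual features, and the card does not describe how the mutual information is estimated. The function name `empirical_mutual_information` and the toy inputs are hypothetical.

```python
import math
from collections import Counter


def empirical_mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in nats for two equal-length discrete sequences.

    Illustration only: a real pipeline would need an estimator that
    handles continuous, high-dimensional features.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one sample")

    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y

    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


# Toy "feature codes" that copy a binary attribute leak it completely ...
leaky = empirical_mutual_information([0, 1, 0, 1], [0, 1, 0, 1])
# ... while codes that are independent of the attribute carry no information.
invariant = empirical_mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

In this toy setting, a demographic-invariant representation corresponds to the second case, where the estimate is zero.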
+ ## Checkpoints
+
+ | Subdir | Dataset | Base LLM | Vision Tower | Task |
+ |---|---|---|---|---|
+ | [`mimic-cxr/`](./mimic-cxr) | MIMIC-CXR | `lmsys/vicuna-7b-v1.5` | BiomedCLIP-CXR-518 | Chest X-ray report generation |
+ | [`padchest/`](./padchest) | PadChest | `lmsys/vicuna-7b-v1.5` | BiomedCLIP-CXR-518 | Chest X-ray report generation |
+ | [`ham10000/`](./ham10000) | HAM10000 | `liuhaotian/llava-v1.5-7b` | CLIP ViT-L/14-336 | Dermoscopy VQA |
+
+ Each subdirectory contains the LoRA adapter (`adapter_model.safetensors`,
+ `adapter_config.json`, `non_lora_trainables.bin`), the matching multimodal
+ projector (`mm_projector.bin`), and the tokenizer files, so the path can be
+ loaded as `model_path` directly by
+ `llava.model.builder.load_pretrained_model`.
+
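Reviewer note: since the added paragraph above lists the files each subdirectory should contain, a quick sanity check before loading might look like the sketch below. The helper name `missing_checkpoint_files` is hypothetical and not part of the FairLLaVA code base. The file names are copied from the paragraph above; tokenizer files are not checked because the card does not name them.

```python
from pathlib import Path

# File names as listed in the model card text above.
EXPECTED_FILES = (
    "adapter_model.safetensors",
    "adapter_config.json",
    "non_lora_trainables.bin",
    "mm_projector.bin",
)


def missing_checkpoint_files(model_path):
    """Return the expected files that are absent from model_path.

    Hypothetical helper: an empty list means every file named in the
    card is present in the directory.
    """
    root = Path(model_path)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

Calling `missing_checkpoint_files(model_path)` on a downloaded subdirectory and getting a non-empty list would suggest an incomplete download, for example because `allow_patterns` was too narrow.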
+ ## Quick start
+
+ ```python
+ from huggingface_hub import snapshot_download
+ from llava.model.builder import load_pretrained_model
+
+ # Download just one dataset's checkpoint
+ local_dir = snapshot_download(
+     repo_id="mbhosale/FairLLaVA",
+     allow_patterns="mimic-cxr/*",
+ )
+ model_path = f"{local_dir}/mimic-cxr"
+
+ tokenizer, model, image_processor, ctx_len = load_pretrained_model(
+     model_path,
+     model_base="lmsys/vicuna-7b-v1.5",
+     model_name="llavarad",
+ )
+ ```
+
+ See the full inference example in [`inference.py`](https://github.com/bhosalems/FairLLaVA/blob/main/inference.py).
+
+ ## Ethics
+
+ These checkpoints are released **for research and educational use only**.
+ They are **not** approved or validated for clinical or diagnostic use and
+ must not be used to make medical decisions or to inform patient care. Each
+ downstream dataset is governed by its own data-use agreement (PhysioNet for
+ MIMIC-CXR, BIMCV for PadChest, ISIC / Harvard Dataverse for HAM10000).
+
+ ## Citation
+
+ ```bibtex
+ @misc{bhosale2026fairllava,
+   title={FairLLaVA: Fairness-Aware Parameter-Efficient Fine-Tuning for Large Vision-Language Assistants},
+   author={Mahesh Bhosale and Abdul Wasi and Shantam Srivastava and Shifa Latif and Tianyu Luan and Mingchen Gao and David Doermann and Xuan Gong},
+   year={2026},
+   eprint={2603.26008},
+   archivePrefix={arXiv},
+   primaryClass={cs.CV},
+   url={https://arxiv.org/abs/2603.26008}
+ }
+
+ @article{ZambranoChaves2025,
+   title={A clinically accessible small multimodal radiology model and evaluation metric for chest X-ray findings},
+   author={Zambrano Chaves, Juan Manuel and others},
+   journal={Nature Communications}, year={2025}, volume={16}, pages={3108},
+   doi={10.1038/s41467-025-58344-x}
+ }
+
+ @misc{liu2023improvedllava,
+   title={Improved Baselines with Visual Instruction Tuning},
+   author={Liu, Haotian and Li, Chunyuan and Li, Yuheng and Lee, Yong Jae},
+   publisher={arXiv:2310.03744},
+   year={2023}
+ }
+ ```