Envision Eye Imaging Classifier
SetFit binary classifier for identifying eye imaging datasets from scientific metadata.
Developed by: FAIR Data Innovations Hub in collaboration with the EyeACT Study
Model Description
Uses sentence-transformers/all-mpnet-base-v2 as the backbone, with a binary classification head that assigns one of two labels:
- EYE_IMAGING (1): Actual ophthalmic imaging datasets (fundus, OCT, OCTA, cornea)
- NEGATIVE (0): Everything else (software, non-imaging eye data, unrelated records)
Validation
Spot-check (33 expert-verified Zenodo records)
| Metric | Score |
|---|---|
| Accuracy | 0.939 (31/33) |
| Macro F1 | 0.923 |
| EYE_IMAGING F1 | 0.889 (P=0.889, R=0.889) |
| NEGATIVE F1 | 0.958 (P=0.958, R=0.958) |
Held-out test set (20% stratified split)
| Metric | Score |
|---|---|
| Accuracy | 0.940 |
| Macro F1 | 0.936 |
| EYE_IMAGING F1 | 0.922 (P=0.887, R=0.959) |
| NEGATIVE F1 | 0.951 (P=0.975, R=0.929) |
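As an illustration of the "20% stratified split" named in this subsection's heading, here is a minimal standard-library sketch. It is not the project's actual split code: the function name `stratified_split`, the seed, and the toy label list are all hypothetical, and the source does not state which pool of records was split.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices), holding out test_fraction of each class."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy data (hypothetical): 10 EYE_IMAGING (1) and 20 NEGATIVE (0) examples.
toy_labels = [1] * 10 + [0] * 20
train_idx, test_idx = stratified_split(toy_labels)
```

Because the hold-out is taken per class, the test set keeps the same class ratio as the full set, which keeps per-class precision and recall comparable between the training and test portions.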
Multi-repository spot-check (6,833 records across 6 sources)
| Source | Records | EYE_IMAGING F1 | Precision | Recall |
|---|---|---|---|---|
| Zenodo | 514 | 0.677 | 0.537 | 0.917 |
| DataCite | 1,836 | 0.866 | 0.858 | 0.874 |
| Figshare | 2,000 | 0.833 | 0.788 | 0.884 |
| Kaggle | 732 | 0.739 | 0.939 | 0.610 |
| Dryad | 89 | 0.764 | 0.750 | 0.778 |
| NEI | 1,662 | 0.814 | 0.931 | 0.724 |
| Overall | 6,833 | 0.822 | 0.845 | 0.800 |
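The F1 column in the table above is the harmonic mean of precision and recall, so it can be recomputed from the other two columns. The sketch below does that for every row and also lists the sources where precision is lower than recall, meaning most errors there are false positives. The variable and function names are illustrative only; the numbers are copied from the table.

```python
# (records, reported EYE_IMAGING F1, precision, recall), copied from the table above.
ROWS = {
    "Zenodo": (514, 0.677, 0.537, 0.917),
    "DataCite": (1836, 0.866, 0.858, 0.874),
    "Figshare": (2000, 0.833, 0.788, 0.884),
    "Kaggle": (732, 0.739, 0.939, 0.610),
    "Dryad": (89, 0.764, 0.750, 0.778),
    "NEI": (1662, 0.814, 0.931, 0.724),
}
OVERALL = (6833, 0.822, 0.845, 0.800)


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Recompute F1 per source from the precision and recall columns.
recomputed = {name: f1_score(p, r) for name, (_, _, p, r) in ROWS.items()}
recomputed_overall = f1_score(OVERALL[2], OVERALL[3])

# Sources where precision < recall: errors are mostly false positives.
over_predicting = [name for name, (_, _, p, r) in ROWS.items() if p < r]

# The per-source record counts should add up to the "Overall" row.
total_records = sum(n for n, _, _, _ in ROWS.values())
```

The recomputed values agree with the reported F1 column to within rounding of the three-decimal inputs. Kaggle and NEI show the opposite pattern from the other sources: high precision but lower recall, so errors there are mostly missed imaging datasets.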
Training
- Base model: sentence-transformers/all-mpnet-base-v2 (768-dimensional)
- Training data: 994 examples (365 EYE_IMAGING, 629 NEGATIVE) from multi-repository sources (Zenodo, Figshare, Dryad, Kaggle, NEI)
- Dataset: fairdataihub/envision-eye-imaging-training-data
- Epochs: up to 10 (early stopping, patience=3)
- Batch size: 16
- Learning rate: 2e-5 (default)
- Scheduler: linear with 10% warmup
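To make the "linear with 10% warmup" schedule concrete, here is a small sketch of the learning-rate curve it implies: a linear ramp from 0 to the 2e-5 peak over the first 10% of steps, then a linear decay back to 0. The function name and the `total_steps=1000` figure are assumptions for illustration; the real step count depends on how many training pairs SetFit generates, which is not stated here.

```python
def linear_warmup_lr(step, total_steps, base_lr=2e-5, warmup_fraction=0.10):
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: fall linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical run of 1000 optimizer steps (100 warmup steps).
TOTAL_STEPS = 1000
schedule = [linear_warmup_lr(s, TOTAL_STEPS) for s in range(TOTAL_STEPS + 1)]
```

With these assumed numbers, the rate reaches its 2e-5 peak at step 100 and returns to zero at step 1000.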
Usage
from setfit import SetFitModel
model = SetFitModel.from_pretrained("fairdataihub/envision-eye-imaging-classifier")
predictions = model.predict(["Retinal OCT dataset for diabetic retinopathy"])
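The call above returns class predictions rather than label names. A minimal sketch of mapping them back to the two labels from the Model Description follows. It assumes the predictions come back as integer class ids (0 or 1); if the model is configured to return label strings instead, the mapping step is unnecessary. The helper names `label_names` and `classify` are hypothetical and not part of SetFit.

```python
# Label ids as listed in the Model Description.
ID2LABEL = {0: "NEGATIVE", 1: "EYE_IMAGING"}


def label_names(predictions):
    """Map integer class ids (ints, NumPy scalars, or tensor elements) to label names."""
    return [ID2LABEL[int(p)] for p in predictions]


def classify(texts, model_id="fairdataihub/envision-eye-imaging-classifier"):
    """Load the classifier and return one label name per input text."""
    # Imported lazily so label_names() can be used without setfit installed.
    from setfit import SetFitModel

    model = SetFitModel.from_pretrained(model_id)
    return label_names(model.predict(texts))
```

For example, `classify(["Retinal OCT dataset for diabetic retinopathy"])` would download the model and return a one-element list of label names.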
Citation
- EyeACT Envision project
- FAIR Data Innovations Hub (fairdataihub.org)
- sentence-transformers/all-mpnet-base-v2
Contact
EyeACT team: eyeactstudy.org