Update README.md
README.md CHANGED
@@ -32,7 +32,7 @@ The checkpoints correspond to the `DataComp-DFN (130M)` results reported in the
- Code: `https://github.com/MingliangLiang3/DynamiCS`
- Implementation base: `https://github.com/mlfoundations/open_clip`
- - Paper title: `Dynamic Cluster Data Sampling for Efficient and Long-Tail-Aware Vision-Language Pre-training`
+ - Paper title: `Dynamic Cluster Data Sampling for Efficient and Long-Tail-Aware Vision-Language Pre-training.`

## Intended Uses

@@ -44,18 +44,10 @@ These checkpoints are intended for:
- image and text embedding extraction within the OpenCLIP framework
- benchmarking on long-tail evaluation datasets such as Let It Wag!

- ### Out-of-scope use
-
- These checkpoints are not intended for:
-
- - safety-critical or high-risk decision making
- - surveillance or biometric identification
- - medical, legal, or financial decisions
- - production use without additional evaluation, monitoring, and risk assessment

## How to Use

- These files are stored as **training checkpoints**
+ These files are stored as **training checkpoints**, not as Hub-native exported `open_clip_pytorch_model.bin` weights. They can be loaded with the DynamiCS/OpenCLIP codebase using `open_clip.load_checkpoint`, which extracts the `state_dict` automatically when needed.

```python
import open_clip
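The README's own snippet is cut off by the hunk boundary after `import open_clip`. As a complement, here is a minimal loading-and-embedding sketch of the workflow the new paragraph describes; the `ViT-B-32` architecture name, the checkpoint filename `epoch_latest.pt`, and the sample inputs are illustrative assumptions, not values taken from the repository.

```python
import torch
from PIL import Image
import open_clip

# NOTE: "ViT-B-32" and the file names below are illustrative assumptions;
# substitute the architecture and checkpoint the repository actually provides.
model, _, preprocess = open_clip.create_model_and_transforms("ViT-B-32")
tokenizer = open_clip.get_tokenizer("ViT-B-32")

# load_checkpoint accepts raw training checkpoints, unwrapping the
# {"state_dict": ...} payload (and any "module." prefixes) when present.
open_clip.load_checkpoint(model, "epoch_latest.pt")
model.eval()

# Image and text embedding extraction, per the "Intended Uses" section.
image = preprocess(Image.open("example.jpg")).unsqueeze(0)
text = tokenizer(["a photo of a dog", "a photo of a cat"])

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    image_features = image_features / image_features.norm(dim=-1, keepdim=True)
    text_features = text_features / text_features.norm(dim=-1, keepdim=True)

similarity = image_features @ text_features.T  # cosine similarities
```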
@@ -120,17 +112,6 @@ The primary reported metrics for these checkpoints are zero-shot top-1 classific

These results are taken from the project repository and accompanying paper draft.

- ## Limitations and Biases
-
- Like other CLIP-style models trained on large-scale web data, these checkpoints may:
-
- - reflect social, geographic, cultural, and language biases present in web-scale image-text corpora
- - underperform on domains that differ substantially from the training distribution
- - produce incorrect or overconfident predictions for rare, ambiguous, or sensitive concepts
- - improve long-tail benchmark performance without guaranteeing fairness or robustness across all subpopulations
-
- Users should evaluate the checkpoints carefully on their own tasks before any downstream use.
-
## License

The underlying code repository is released under the MIT License. Model users are responsible for ensuring that their use and any redistribution of checkpoints comply with the terms, restrictions, and policies associated with the underlying training data and their deployment context.
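The hunk header above cites zero-shot top-1 classification as the primary reported metric. For orientation, a minimal zero-shot evaluation sketch follows; the model name, checkpoint path, class list, and prompt template are illustrative assumptions, not the repository's actual evaluation harness.

```python
import torch
import open_clip

# Illustrative assumptions: architecture, checkpoint path, and label set
# are placeholders, not values taken from the repository.
model, _, preprocess = open_clip.create_model_and_transforms("ViT-B-32")
open_clip.load_checkpoint(model, "epoch_latest.pt")
tokenizer = open_clip.get_tokenizer("ViT-B-32")
model.eval()

classnames = ["golden retriever", "tabby cat", "sparrow"]  # benchmark label set
prompts = [f"a photo of a {name}" for name in classnames]

with torch.no_grad():
    text_features = model.encode_text(tokenizer(prompts))
    text_features = text_features / text_features.norm(dim=-1, keepdim=True)

def zero_shot_top1(images: torch.Tensor) -> torch.Tensor:
    """Predict a class index for each preprocessed image in the batch."""
    with torch.no_grad():
        image_features = model.encode_image(images)
        image_features = image_features / image_features.norm(dim=-1, keepdim=True)
        return (image_features @ text_features.T).argmax(dim=-1)

# Top-1 accuracy is then the fraction of predictions that match the
# ground-truth labels over the evaluation set.
```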