Add pipeline tag and improve model card
Hi! I'm Niels, part of the community science team at Hugging Face.
I've opened this PR to improve the model card for RADD:
- Added `pipeline_tag: text-generation` to the metadata to improve discoverability on the Hub.
- Added links to the official paper and GitHub repository.
- Included a sample usage snippet for loading the model, as documented in your repository.
- Added the BibTeX citation for researchers.
README.md
CHANGED
---
pipeline_tag: text-generation
---

# RADD Small (lambda-dce)

This repository contains the small model checkpoint for **RADD (Reparameterized Absorbing Discrete Diffusion)**, trained with the $\lambda$-DCE loss for 400k iterations.

RADD is a discrete diffusion model for language modeling whose denoiser parameterizes time-independent conditional probabilities of the clean data. This property enables sampling acceleration via caching strategies and unifies absorbing discrete diffusion with any-order autoregressive models (AO-ARMs).

- **Paper:** [Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data](https://huggingface.co/papers/2406.03736)
- **GitHub Repository:** [ML-GSAI/RADD](https://github.com/ML-GSAI/RADD)
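The caching idea can be sketched with a toy absorbing-diffusion sampler. Everything below is illustrative, not RADD's actual code: `conditionals` is a hypothetical stand-in for the denoising network, returning uniform per-position distributions. The property being demonstrated is that its output depends only on the partially masked sequence, not on the diffusion time, so it can be reused across sampling steps in which no token is revealed.

```python
import random

random.seed(0)

MASK = "?"
VOCAB = ["a", "b", "c"]

calls = 0  # number of simulated network evaluations

def conditionals(x):
    """Hypothetical stand-in for the denoiser: per-position conditionals
    p(x_i | unmasked tokens). Uniform here for illustration; the relevant
    property is that the output depends only on x, not on the time step."""
    global calls
    calls += 1
    return [{tok: 1.0 / len(VOCAB) for tok in VOCAB} for _ in x]

def sample(length=8, steps=32, p_reveal=0.3):
    x = [MASK] * length
    cache_key, cache_val = None, None
    for _ in range(steps):
        # Because the conditionals are time-independent, the network is
        # re-evaluated only when the sequence actually changed; steps that
        # reveal nothing reuse the cached output.
        if tuple(x) != cache_key:
            cache_key, cache_val = tuple(x), conditionals(x)
        # Each masked position is independently revealed with prob p_reveal.
        for i, tok in enumerate(x):
            if tok == MASK and random.random() < p_reveal:
                dist = cache_val[i]
                x[i] = random.choices(list(dist), weights=list(dist.values()))[0]
    return x

seq = sample()
print("sample:", "".join(seq), "| network calls:", calls, "of 32 steps")
```

With a time-dependent denoiser, every one of the 32 steps would require a network call; here the call count is bounded by the number of times the sequence changes (at most one initial call plus one per revealed token).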
## Usage

To use this model, load it with the utility provided in the [official repository](https://github.com/ML-GSAI/RADD):

```python
from load_model import load_model

# Load the model and noise schedule
model, noise = load_model('JingyangOu/radd-lambda-dce', device='cuda')
```

For more details on sampling (e.g., using the `DiffusionSampler` or `OrderedSampler`), please refer to the scripts in the GitHub repository.

## Citation

```bibtex
@misc{ou2024absorbingdiscretediffusionsecretly,
      title={Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data},
      author={Jingyang Ou and Shen Nie and Kaiwen Xue and Fengqi Zhu and Jiacheng Sun and Zhenguo Li and Chongxuan Li},
      year={2024},
      eprint={2406.03736},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
}
```