Add library name and update model card

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +15 -16
README.md CHANGED
@@ -1,11 +1,13 @@
1
  ---
2
- license: apache-2.0
3
- language:
4
- - en
5
  base_model:
6
  - black-forest-labs/FLUX.2-klein-4B
 
 
 
7
  pipeline_tag: text-to-image
 
8
  ---
 
9
  <div align="center">
10
 
11
  <img width="70%" height="70%" alt="logo" src="https://cdn-uploads.huggingface.co/production/uploads/64b500fdf460afaefc5c64b3/l1JM1Si5PDCgvJR5SSiqf.png" />
@@ -14,7 +16,7 @@ pipeline_tag: text-to-image
14
 
15
  <p><b>Reward-Tilted DMD &nbsp;·&nbsp; Ambient-Consistent Distillation &nbsp;·&nbsp; Hybrid Policy Gradient</b></p>
16
 
17
- [![Paper](https://img.shields.io/badge/paper-arXiv-A42C25?style=for-the-badge&logo=arxiv&logoColor=white)](https://arxiv.org/abs/2605.26108)
18
  [![Github](https://img.shields.io/badge/Harahan%2FRTDMD-000000?style=for-the-badge&logo=github&logoColor=white)](https://github.com/Harahan/RTDMD)
19
  [![Hugging Face Collection](https://img.shields.io/badge/RTDMD_Collection-fcd022?style=for-the-badge&logo=huggingface&logoColor=000)](https://huggingface.co/collections/Harahan/rtdmd)
20
 
@@ -50,6 +52,8 @@ With **4 NFE** RTDMD reaches new SOTA on SD3-M / SD3.5-M / FLUX.2 4B; the
50
  distilled FLUX.2 4B even beats the full FLUX.2 9B teacher (50 NFE) on most
51
  rewards.
52
 
 
 
53
  <table align="center">
54
  <tr>
55
  <td align="center" width="50%">
@@ -78,8 +82,11 @@ rewards.
78
  For the generator $G_\theta$, the reward-tilted KL objective decomposes as
79
 
80
  $$
81
- \nabla_\theta D_{\text{KL}}(p_\theta \| \tilde{p}_\psi) =
82
- \underbrace{\nabla_\theta D_{\text{KL}}(p_\theta \| p_\psi)}_{\text{distribution matching}} - \beta\underbrace{\nabla_\theta \mathbb{E}_{\hat{\mathbf{x}}_0 \sim p_\theta}[r(\hat{\mathbf{x}}_0)]}_{\text{reward maximization}}.
 
 
 
83
  $$
84
 
85
  The two terms map directly to the two trainers exposed by the CLI:
@@ -139,6 +146,7 @@ python inference.py configs/inference/flux2_4b.yaml \
139
  import torch
140
  from diffusers import Flux2KleinPipeline, Flux2Transformer2DModel
141
  from huggingface_hub import hf_hub_download
 
142
 
143
  base = "black-forest-labs/FLUX.2-klein-4B"
144
  pipe = Flux2KleinPipeline.from_pretrained(base, torch_dtype=torch.bfloat16).to("cuda")
@@ -183,13 +191,4 @@ pipe(prompt="a cute cat sitting on a windowsill",
183
  primaryClass={cs.CV},
184
  url={https://arxiv.org/abs/2605.26108},
185
  }
186
- ```
187
-
188
- ---
189
-
190
- ## ⚖️ License
191
-
192
- Apache 2.0 — same as the upstream
193
- [RTDMD](https://github.com/Harahan/RTDMD) repo. The base model
194
- [`black-forest-labs/FLUX.2-klein-4B`](https://huggingface.co/black-forest-labs/FLUX.2-klein-4B)
195
- is governed by its own license; please review and comply with it separately.
 
1
  ---
 
 
 
2
  base_model:
3
  - black-forest-labs/FLUX.2-klein-4B
4
+ language:
5
+ - en
6
+ license: apache-2.0
7
  pipeline_tag: text-to-image
8
+ library_name: diffusers
9
  ---
10
+
11
  <div align="center">
12
 
13
  <img width="70%" height="70%" alt="logo" src="https://cdn-uploads.huggingface.co/production/uploads/64b500fdf460afaefc5c64b3/l1JM1Si5PDCgvJR5SSiqf.png" />
 
16
 
17
  <p><b>Reward-Tilted DMD &nbsp;·&nbsp; Ambient-Consistent Distillation &nbsp;·&nbsp; Hybrid Policy Gradient</b></p>
18
 
19
+ [![Paper](https://img.shields.io/badge/paper-arXiv-A42C25?style=for-the-badge&logo=arxiv&logoColor=white)](https://huggingface.co/papers/2605.26108)
20
  [![Github](https://img.shields.io/badge/Harahan%2FRTDMD-000000?style=for-the-badge&logo=github&logoColor=white)](https://github.com/Harahan/RTDMD)
21
  [![Hugging Face Collection](https://img.shields.io/badge/RTDMD_Collection-fcd022?style=for-the-badge&logo=huggingface&logoColor=000)](https://huggingface.co/collections/Harahan/rtdmd)
22
 
 
52
  distilled FLUX.2 4B even beats the full FLUX.2 9B teacher (50 NFE) on most
53
  rewards.
54
 
55
+ More details can be found in the paper: [Reinforcing Few-step Generators via Reward-Tilted Distribution Matching](https://huggingface.co/papers/2605.26108).
56
+
57
  <table align="center">
58
  <tr>
59
  <td align="center" width="50%">
 
82
  For the generator $G_\theta$, the reward-tilted KL objective decomposes as
83
 
84
  $$
85
+
86
+ \nabla_\theta D_{\text{KL}}(p_\theta \| \tilde{p}_\psi) =
87
+ \underbrace{
88
+ \nabla_\theta D_{\text{KL}}(p_\theta \| p_\psi)}_{\text{distribution matching}} - \beta\underbrace{
89
+ \nabla_\theta \mathbb{E}_{\hat{\mathbf{x}}_0 \sim p_\theta}[r(\hat{\mathbf{x}}_0)]}_{\text{reward maximization}}.
90
  $$
91
 
92
  The two terms map directly to the two trainers exposed by the CLI:
 
146
  import torch
147
  from diffusers import Flux2KleinPipeline, Flux2Transformer2DModel
148
  from huggingface_hub import hf_hub_download
149
+ from peft import LoraConfig
150
 
151
  base = "black-forest-labs/FLUX.2-klein-4B"
152
  pipe = Flux2KleinPipeline.from_pretrained(base, torch_dtype=torch.bfloat16).to("cuda")
 
191
  primaryClass={cs.CV},
192
  url={https://arxiv.org/abs/2605.26108},
193
  }
194
+ ```
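
Review note: the gradient decomposition in the model card can be sanity-checked numerically on a toy discrete distribution. The sketch below is not taken from the RTDMD code. It assumes the tilted target has the usual exponential form, $\tilde{p}_\psi(x) \propto p_\psi(x)\exp(\beta\, r(x))$, which this diff does not spell out, and every name in it (`softmax`, `tilt`, `finite_diff_grad`, the toy logits and rewards) is hypothetical. Under that assumption the tilted KL equals the plain KL minus $\beta$ times the expected reward plus a constant $\log Z$, so the gradients on both sides should agree.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl(p, q):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def tilt(p_ref, rewards, beta):
    """Assumed tilted target: p_ref(x) * exp(beta * r(x)), normalized by Z."""
    weights = [p * math.exp(beta * r) for p, r in zip(p_ref, rewards)]
    z = sum(weights)
    return [w / z for w in weights], z

def finite_diff_grad(f, theta, eps=1e-6):
    """Central finite-difference gradient of f at theta."""
    grad = []
    for i in range(len(theta)):
        up = list(theta)
        dn = list(theta)
        up[i] += eps
        dn[i] -= eps
        grad.append((f(up) - f(dn)) / (2 * eps))
    return grad

# Toy setup (all values are illustrative, not from the paper).
theta = [0.3, -1.2, 0.8, 0.1]            # generator logits, p_theta = softmax(theta)
p_ref = softmax([0.0, 0.5, -0.5, 1.0])   # stands in for the teacher p_psi
rewards = [1.0, 0.2, -0.4, 0.7]          # r(x) for each outcome
beta = 0.5
p_tilde, z = tilt(p_ref, rewards, beta)

def tilted_kl(t):
    return kl(softmax(t), p_tilde)

def plain_kl(t):
    return kl(softmax(t), p_ref)

def expected_reward(t):
    return sum(p * r for p, r in zip(softmax(t), rewards))

# Left-hand side: gradient of the reward-tilted KL.
lhs = finite_diff_grad(tilted_kl, theta)
# Right-hand side: distribution matching minus beta times reward maximization.
rhs = [
    a - beta * b
    for a, b in zip(
        finite_diff_grad(plain_kl, theta),
        finite_diff_grad(expected_reward, theta),
    )
]
max_gap = max(abs(a - b) for a, b in zip(lhs, rhs))
print("largest gradient mismatch:", max_gap)
```

The two sides differ only by floating-point noise here because $\log Z$ does not depend on $\theta$. This checks the algebra only; it says nothing about how the two trainers estimate each term in practice.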