Harahan/FLUX2-4B-RTDMD

Text-to-Image

English

Add library name and update model card

by nielsr HF Staff - opened about 8 hours ago

base: refs/heads/main ← from: refs/pr/1

Files changed (1)

README.md +15 -16

@@ -1,11 +1,13 @@
 ---
-license: apache-2.0
-language:
-- en
 base_model:
 - black-forest-labs/FLUX.2-klein-4B
 pipeline_tag: text-to-image
 ---
 <div align="center">
 <img width="70%" height="70%" alt="logo" src="https://cdn-uploads.huggingface.co/production/uploads/64b500fdf460afaefc5c64b3/l1JM1Si5PDCgvJR5SSiqf.png" />
@@ -14,7 +16,7 @@ pipeline_tag: text-to-image
 <p><b>Reward-Tilted DMD &nbsp;·&nbsp; Ambient-Consistent Distillation &nbsp;·&nbsp; Hybrid Policy Gradient</b></p>
-[![Paper](https://img.shields.io/badge/paper-arXiv-A42C25?style=for-the-badge&logo=arxiv&logoColor=white)](https://arxiv.org/abs/2605.26108)
 [![Github](https://img.shields.io/badge/Harahan%2FRTDMD-000000?style=for-the-badge&logo=github&logoColor=white)](https://github.com/Harahan/RTDMD)
 [![Hugging Face Collection](https://img.shields.io/badge/RTDMD_Collection-fcd022?style=for-the-badge&logo=huggingface&logoColor=000)](https://huggingface.co/collections/Harahan/rtdmd)
@@ -50,6 +52,8 @@ With **4 NFE** RTDMD reaches new SOTA on SD3-M / SD3.5-M / FLUX.2 4B; the
 distilled FLUX.2 4B even beats the full FLUX.2 9B teacher (50 NFE) on most
 rewards.
 <table align="center">
   <tr>
     <td align="center" width="50%">
@@ -78,8 +82,11 @@ rewards.
 For the generator $G_\theta$, the reward-tilted KL objective decomposes as
 $$
-\nabla_\theta D_{\text{KL}}(p_\theta \| \tilde{p}_\psi) =
-\underbrace{\nabla_\theta D_{\text{KL}}(p_\theta \| p_\psi)}_{\text{distribution matching}} - \beta\underbrace{\nabla_\theta \mathbb{E}_{\hat{\mathbf{x}}_0 \sim p_\theta}[r(\hat{\mathbf{x}}_0)]}_{\text{reward maximization}}.
 $$
 The two terms map directly to the two trainers exposed by the CLI:
@@ -139,6 +146,7 @@ python inference.py configs/inference/flux2_4b.yaml \
 import torch
 from diffusers import Flux2KleinPipeline, Flux2Transformer2DModel
 from huggingface_hub import hf_hub_download
 base = "black-forest-labs/FLUX.2-klein-4B"
 pipe = Flux2KleinPipeline.from_pretrained(base, torch_dtype=torch.bfloat16).to("cuda")
@@ -183,13 +191,4 @@ pipe(prompt="a cute cat sitting on a windowsill",
       primaryClass={cs.CV},
       url={https://arxiv.org/abs/2605.26108},
 }
-```
----
-## ⚖️ License
-Apache 2.0 — same as the upstream
-[RTDMD](https://github.com/Harahan/RTDMD) repo. The base model
-[`black-forest-labs/FLUX.2-klein-4B`](https://huggingface.co/black-forest-labs/FLUX.2-klein-4B)
-is governed by its own license; please review and comply with it separately.

 ---
 base_model:
 - black-forest-labs/FLUX.2-klein-4B
+language:
+- en
+license: apache-2.0
 pipeline_tag: text-to-image
+library_name: diffusers
 ---
 <div align="center">
 <img width="70%" height="70%" alt="logo" src="https://cdn-uploads.huggingface.co/production/uploads/64b500fdf460afaefc5c64b3/l1JM1Si5PDCgvJR5SSiqf.png" />
 <p><b>Reward-Tilted DMD &nbsp;·&nbsp; Ambient-Consistent Distillation &nbsp;·&nbsp; Hybrid Policy Gradient</b></p>
+[![Paper](https://img.shields.io/badge/paper-arXiv-A42C25?style=for-the-badge&logo=arxiv&logoColor=white)](https://huggingface.co/papers/2605.26108)
 [![Github](https://img.shields.io/badge/Harahan%2FRTDMD-000000?style=for-the-badge&logo=github&logoColor=white)](https://github.com/Harahan/RTDMD)
 [![Hugging Face Collection](https://img.shields.io/badge/RTDMD_Collection-fcd022?style=for-the-badge&logo=huggingface&logoColor=000)](https://huggingface.co/collections/Harahan/rtdmd)
 distilled FLUX.2 4B even beats the full FLUX.2 9B teacher (50 NFE) on most
 rewards.
+More details can be found in the paper: [Reinforcing Few-step Generators via Reward-Tilted Distribution Matching](https://huggingface.co/papers/2605.26108).
 <table align="center">
   <tr>
     <td align="center" width="50%">
 For the generator $G_\theta$, the reward-tilted KL objective decomposes as
 $$
+\nabla_\theta D_{\text{KL}}(p_\theta \| \tilde{p}_\psi) =
+\underbrace{\nabla_\theta D_{\text{KL}}(p_\theta \| p_\psi)}_{\text{distribution matching}} - \beta\underbrace{\nabla_\theta \mathbb{E}_{\hat{\mathbf{x}}_0 \sim p_\theta}[r(\hat{\mathbf{x}}_0)]}_{\text{reward maximization}}.
 $$
 The two terms map directly to the two trainers exposed by the CLI:
 import torch
 from diffusers import Flux2KleinPipeline, Flux2Transformer2DModel
 from huggingface_hub import hf_hub_download
+from peft import LoraConfig
 base = "black-forest-labs/FLUX.2-klein-4B"
 pipe = Flux2KleinPipeline.from_pretrained(base, torch_dtype=torch.bfloat16).to("cuda")
       primaryClass={cs.CV},
       url={https://arxiv.org/abs/2605.26108},
 }
+```
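
The reward-tilted KL decomposition quoted in the README can be sanity-checked numerically on a toy problem. The sketch below is only an illustration, not the RTDMD trainers. It assumes the tilted target is defined as $\tilde{p}_\psi \propto p_\psi \exp(\beta r)$, which the diff does not state explicitly. It also replaces the generator with a small categorical distribution `softmax(theta)`. All names (`tilt`, `grad`, `p_psi`, `reward`) are hypothetical and chosen for this example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl(p, q):
    """KL(p || q) for two categorical distributions on the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def expected_reward(p, r):
    """E_{x ~ p}[r(x)] for a categorical distribution."""
    return sum(pi * ri for pi, ri in zip(p, r))


def tilt(p_ref, r, beta):
    """Assumed tilted target: p_ref * exp(beta * r), renormalized.

    Returns the tilted distribution and log Z, the log normalizer.
    """
    weights = [pi * math.exp(beta * ri) for pi, ri in zip(p_ref, r)]
    z = sum(weights)
    return [w / z for w in weights], math.log(z)


def grad(f, theta, eps=1e-6):
    """Central finite-difference gradient of f at theta."""
    g = []
    for i in range(len(theta)):
        up = list(theta)
        dn = list(theta)
        up[i] += eps
        dn[i] -= eps
        g.append((f(up) - f(dn)) / (2 * eps))
    return g


# Toy stand-ins: theta parameterizes p_theta, p_psi plays the teacher.
theta = [0.3, -1.2, 0.8, 0.1]
p_psi = softmax([0.0, 0.5, -0.5, 1.0])
reward = [1.0, 0.2, -0.4, 0.7]
beta = 0.5

p_tilted, log_z = tilt(p_psi, reward, beta)

# Left-hand side: gradient of KL(p_theta || tilted target).
lhs = grad(lambda t: kl(softmax(t), p_tilted), theta)

# Right-hand side: distribution matching minus beta * reward maximization.
g_dm = grad(lambda t: kl(softmax(t), p_psi), theta)
g_rw = grad(lambda t: expected_reward(softmax(t), reward), theta)
rhs = [a - beta * b for a, b in zip(g_dm, g_rw)]

max_gap = max(abs(a - b) for a, b in zip(lhs, rhs))
print(f"max gradient gap: {max_gap:.2e}")
```

Under the assumed definition, the two objectives differ only by the constant $\log Z$, which does not depend on $\theta$, so the gradients agree up to finite-difference noise.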