internlm/ETCHR-FLUX.2-klein-9B

Diffusion Single File

Flux2KleinPipeline

image-generation

yuhangzang committed 4 days ago

Commit ed62caa (verified) · 1 parent: f75d711

Update README.md

Files changed (1)

README.md +3 -3

@@ -16,14 +16,14 @@ library_name: diffusers
 ---
 # ETCHR-FLUX.2-klein-9B
-  <p align="center" style="font-size: 1.2em; margin-top: 0.5em">
-    📖<a href="https://arxiv.org/abs/">Paper</a>
   | 🏠<a href="https://github.com/InternLM/ETCHR">Homepage</a >
   | 🤗<a href="https://huggingface.co/internlm/ETCHR-FLUX.2-klein-9B">ETCHR-FLUX.2-klein-9B Model</a >
   | 🤗<a href="https://huggingface.co/datasets/internlm/ETCHR-SFT-400K">ETCHR SFT-400K Dataset</a >
   | 🤗<a href="https://huggingface.co/datasets/internlm/ETCHR-GRPO-10K">ETCHR GRPO-10K Dataset</a >
   | 🤗<a href="https://huggingface.co/datasets/internlm/DL3DV-2k">DL3DV-2K Benchmark</a >
-  </p >
 ETCHR-FLUX.2-klein-9B is a novel question-conditioned, reasoning-aware image editor designed to serve as a decoupled visual reasoning assistant for Multimodal Large Language Models. By decoupling the specialized image editor from the downstream understanding model, ETCHR bridges the critical bottleneck where a purely textual chain of thought fails in fine-grained focus or complex spatial transformations.

 ---
 # ETCHR-FLUX.2-klein-9B
+  📖<a href="https://arxiv.org/abs/">Paper</a>
   | 🏠<a href="https://github.com/InternLM/ETCHR">Homepage</a >
   | 🤗<a href="https://huggingface.co/internlm/ETCHR-FLUX.2-klein-9B">ETCHR-FLUX.2-klein-9B Model</a >
   | 🤗<a href="https://huggingface.co/datasets/internlm/ETCHR-SFT-400K">ETCHR SFT-400K Dataset</a >
   | 🤗<a href="https://huggingface.co/datasets/internlm/ETCHR-GRPO-10K">ETCHR GRPO-10K Dataset</a >
   | 🤗<a href="https://huggingface.co/datasets/internlm/DL3DV-2k">DL3DV-2K Benchmark</a >
 ETCHR-FLUX.2-klein-9B is a novel question-conditioned, reasoning-aware image editor designed to serve as a decoupled visual reasoning assistant for Multimodal Large Language Models. By decoupling the specialized image editor from the downstream understanding model, ETCHR bridges the critical bottleneck where a purely textual chain of thought fails in fine-grained focus or complex spatial transformations.