-
  - Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis
    Paper • 2401.09048 • Published • 10
  - Improving fine-grained understanding in image-text pre-training
    Paper • 2401.09865 • Published • 18
  - Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
    Paper • 2401.10891 • Published • 62
  - Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild
    Paper • 2401.13627 • Published • 78
Collections
Discover the best community collections!
Collections including paper arxiv:2503.09641
-
  - SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation
    Paper • 2503.09641 • Published • 42
  - Decoupled DMD: CFG Augmentation as the Spear, Distribution Matching as the Shield
    Paper • 2511.22677 • Published • 35
  - One-step Diffusion with Distribution Matching Distillation
    Paper • 2311.18828 • Published • 3
  - Improved Distribution Matching Distillation for Fast Image Synthesis
    Paper • 2405.14867 • Published • 15
-
  - SanaSprint
    415 • Ultra fast high quality image generation
  - SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation
    Paper • 2503.09641 • Published • 42
  - Efficient-Large-Model/Sana_Sprint_1.6B_1024px
    Text-to-Image • Updated • 21 • 17
  - Efficient-Large-Model/Sana_Sprint_0.6B_1024px
    Text-to-Image • Updated • 20 • 7
-
  - CineMaster: A 3D-Aware and Controllable Framework for Cinematic Text-to-Video Generation
    Paper • 2502.08639 • Published • 43
  - TransMLA: Multi-head Latent Attention Is All You Need
    Paper • 2502.07864 • Published • 69
  - Next Block Prediction: Video Generation via Semi-Autoregressive Modeling
    Paper • 2502.07737 • Published • 9
  - Enhance-A-Video: Better Generated Video for Free
    Paper • 2502.07508 • Published • 21
-
  - One-step Diffusion Models with f-Divergence Distribution Matching
    Paper • 2502.15681 • Published • 8
  - Adversarial Diffusion Distillation
    Paper • 2311.17042 • Published • 3
  - Consistency Models
    Paper • 2303.01469 • Published • 8
  - SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis
    Paper • 2307.01952 • Published • 92
-
  - Tongyi-MAI/Z-Image-Turbo
    Text-to-Image • Updated • 1.17M • 4.48k
  - SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation
    Paper • 2503.09641 • Published • 42
  - SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers
    Paper • 2410.10629 • Published • 13
  - Efficient Distillation of Classifier-Free Guidance using Adapters
    Paper • 2503.07274 • Published • 4
-
  - DeepMesh: Auto-Regressive Artist-mesh Creation with Reinforcement Learning
    Paper • 2503.15265 • Published • 46
  - What matters when building vision-language models?
    Paper • 2405.02246 • Published • 104
  - SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation
    Paper • 2503.09641 • Published • 42
-
  - SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer
    Paper • 2501.18427 • Published • 25
  - Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation
    Paper • 2502.20388 • Published • 16
  - SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation
    Paper • 2503.09641 • Published • 42
  - Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation
    Paper • 2503.16430 • Published • 34