- The Script is All You Need: An Agentic Framework for Long-Horizon Dialogue-to-Cinematic Video Generation
  Paper • 2601.17737 • Published • 56
- Advancing Open-source World Models
  Paper • 2601.20540 • Published • 135
- OpenVision 3: A Family of Unified Visual Encoder for Both Understanding and Generation
  Paper • 2601.15369 • Published • 21
- Video-As-Prompt: Unified Semantic Control for Video Generation
  Paper • 2510.20888 • Published • 50
Collections
Collections including paper arxiv:2602.02437
- yandex/stable-diffusion-3.5-medium-alchemist
  Text-to-Image • Updated • 16 • 7
- Ovis-U1 Technical Report
  Paper • 2506.23044 • Published • 61
- FreeMorph: Tuning-Free Generalized Image Morphing with Diffusion Model
  Paper • 2507.01953 • Published • 18
- LongAnimation: Long Animation Generation with Dynamic Global-Local Memory
  Paper • 2507.01945 • Published • 76

- BitNet: Scaling 1-bit Transformers for Large Language Models
  Paper • 2310.11453 • Published • 107
- Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
  Paper • 2310.11511 • Published • 80
- In-Context Learning Creates Task Vectors
  Paper • 2310.15916 • Published • 43
- Matryoshka Diffusion Models
  Paper • 2310.15111 • Published • 45

- OneIG-Bench: Omni-dimensional Nuanced Evaluation for Image Generation
  Paper • 2506.07977 • Published • 40
- Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers
  Paper • 2506.07986 • Published • 19
- STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis
  Paper • 2506.06276 • Published • 26
- Aligning Latent Spaces with Flow Priors
  Paper • 2506.05240 • Published • 27

- BrushEdit: All-In-One Image Inpainting and Editing
  Paper • 2412.10316 • Published • 36
- ColorFlow: Retrieval-Augmented Image Sequence Colorization
  Paper • 2412.11815 • Published • 26
- FluxSpace: Disentangled Semantic Editing in Rectified Flow Transformers
  Paper • 2412.09611 • Published • 11
- FireFlow: Fast Inversion of Rectified Flow for Image Semantic Editing
  Paper • 2412.07517 • Published • 11