Models
Datasets
Spaces
Buckets new
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2511.22677

Read But Not Implemented

TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times

Paper • 2512.16093 • Published Dec 18, 2025 • 97
Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer

Paper • 2511.22699 • Published Nov 27, 2025 • 245
DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI

Paper • 2512.16676 • Published Dec 18, 2025 • 222
Sharp Monocular View Synthesis in Less Than a Second

Paper • 2512.10685 • Published Dec 11, 2025 • 29

Interesting papers regarding Diffusion architectures

Decoupled DMD: CFG Augmentation as the Spear, Distribution Matching as the Shield

Paper • 2511.22677 • Published Nov 27, 2025 • 35
DiP: Taming Diffusion Models in Pixel Space

Paper • 2511.18822 • Published Nov 24, 2025 • 29
What about gravity in video generation? Post-Training Newton's Laws with Verifiable Rewards

Paper • 2512.00425 • Published Nov 29, 2025 • 53
Learning Eigenstructures of Unstructured Data Manifolds

Paper • 2512.01103 • Published Nov 30, 2025 • 6

SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation

Paper • 2503.09641 • Published Mar 12, 2025 • 42
Decoupled DMD: CFG Augmentation as the Spear, Distribution Matching as the Shield

Paper • 2511.22677 • Published Nov 27, 2025 • 35
One-step Diffusion with Distribution Matching Distillation

Paper • 2311.18828 • Published Nov 30, 2023 • 3
Improved Distribution Matching Distillation for Fast Image Synthesis

Paper • 2405.14867 • Published May 23, 2024 • 15

Running

Agents

9

Hot Or Not

🏢

9

Evaluate hotness, beauty, and attractiveness of an image
Runtime error

Agents

Featured

816

Audioldm Text To Audio Generation

🔊

816

Generate audio from text descriptions
openai/whisper-large-v3

Automatic Speech Recognition • 2B • Updated Aug 12, 2024 • 4.86M • • 5.6k
Running on Zero

MCP

Featured

827

Whisper Large V3

🤫

827

Transcribe or translate audio and YouTube videos to text

Visual (image/video) Generation

Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer

Paper • 2511.22699 • Published Nov 27, 2025 • 245
REASONEDIT: Towards Reasoning-Enhanced Image Editing Models

Paper • 2511.22625 • Published Nov 27, 2025 • 48
Decoupled DMD: CFG Augmentation as the Spear, Distribution Matching as the Shield

Paper • 2511.22677 • Published Nov 27, 2025 • 35
Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length

Paper • 2512.04677 • Published Dec 4, 2025 • 177

FastVLM: Efficient Vision Encoding for Vision Language Models

Paper • 2412.13303 • Published Dec 17, 2024 • 75
rStar2-Agent: Agentic Reasoning Technical Report

Paper • 2508.20722 • Published Aug 28, 2025 • 118
AgentScope 1.0: A Developer-Centric Framework for Building Agentic Applications

Paper • 2508.16279 • Published Aug 22, 2025 • 61
OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling

Paper • 2509.12201 • Published Sep 15, 2025 • 107

Drivable 3D Gaussian Avatars

Paper • 2311.08581 • Published Nov 14, 2023 • 47
Single-Image 3D Human Digitization with Shape-Guided Diffusion

Paper • 2311.09221 • Published Nov 15, 2023 • 22
One-2-3-45++: Fast Single Image to 3D Objects with Consistent Multi-View Generation and 3D Diffusion

Paper • 2311.07885 • Published Nov 14, 2023 • 40
Decoupled DMD: CFG Augmentation as the Spear, Distribution Matching as the Shield

Paper • 2511.22677 • Published Nov 27, 2025 • 35

Read But Not Implemented

TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times

Paper • 2512.16093 • Published Dec 18, 2025 • 97
Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer

Paper • 2511.22699 • Published Nov 27, 2025 • 245
DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI

Paper • 2512.16676 • Published Dec 18, 2025 • 222
Sharp Monocular View Synthesis in Less Than a Second

Paper • 2512.10685 • Published Dec 11, 2025 • 29

Visual (image/video) Generation

Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer

Paper • 2511.22699 • Published Nov 27, 2025 • 245
REASONEDIT: Towards Reasoning-Enhanced Image Editing Models

Paper • 2511.22625 • Published Nov 27, 2025 • 48
Decoupled DMD: CFG Augmentation as the Spear, Distribution Matching as the Shield

Paper • 2511.22677 • Published Nov 27, 2025 • 35
Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length

Paper • 2512.04677 • Published Dec 4, 2025 • 177

Interesting papers regarding Diffusion architectures

Decoupled DMD: CFG Augmentation as the Spear, Distribution Matching as the Shield

Paper • 2511.22677 • Published Nov 27, 2025 • 35
DiP: Taming Diffusion Models in Pixel Space

Paper • 2511.18822 • Published Nov 24, 2025 • 29
What about gravity in video generation? Post-Training Newton's Laws with Verifiable Rewards

Paper • 2512.00425 • Published Nov 29, 2025 • 53
Learning Eigenstructures of Unstructured Data Manifolds

Paper • 2512.01103 • Published Nov 30, 2025 • 6

FastVLM: Efficient Vision Encoding for Vision Language Models

Paper • 2412.13303 • Published Dec 17, 2024 • 75
rStar2-Agent: Agentic Reasoning Technical Report

Paper • 2508.20722 • Published Aug 28, 2025 • 118
AgentScope 1.0: A Developer-Centric Framework for Building Agentic Applications

Paper • 2508.16279 • Published Aug 22, 2025 • 61
OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling

Paper • 2509.12201 • Published Sep 15, 2025 • 107

SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation

Paper • 2503.09641 • Published Mar 12, 2025 • 42
Decoupled DMD: CFG Augmentation as the Spear, Distribution Matching as the Shield

Paper • 2511.22677 • Published Nov 27, 2025 • 35
One-step Diffusion with Distribution Matching Distillation

Paper • 2311.18828 • Published Nov 30, 2023 • 3
Improved Distribution Matching Distillation for Fast Image Synthesis

Paper • 2405.14867 • Published May 23, 2024 • 15

Drivable 3D Gaussian Avatars

Paper • 2311.08581 • Published Nov 14, 2023 • 47
Single-Image 3D Human Digitization with Shape-Guided Diffusion

Paper • 2311.09221 • Published Nov 15, 2023 • 22
One-2-3-45++: Fast Single Image to 3D Objects with Consistent Multi-View Generation and 3D Diffusion

Paper • 2311.07885 • Published Nov 14, 2023 • 40
Decoupled DMD: CFG Augmentation as the Spear, Distribution Matching as the Shield

Paper • 2511.22677 • Published Nov 27, 2025 • 35

Running

Agents

9

Hot Or Not

🏢

9

Evaluate hotness, beauty, and attractiveness of an image
Runtime error

Agents

Featured

816

Audioldm Text To Audio Generation

🔊

816

Generate audio from text descriptions
openai/whisper-large-v3

Automatic Speech Recognition • 2B • Updated Aug 12, 2024 • 4.86M • • 5.6k
Running on Zero

MCP

Featured

827

Whisper Large V3

🤫

827

Transcribe or translate audio and YouTube videos to text

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs