Collections
Discover the best community collections!
Collections including paper arxiv:2511.10555

- Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis
  Paper • 2401.09048 • Published • 10
- Improving fine-grained understanding in image-text pre-training
  Paper • 2401.09865 • Published • 18
- Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
  Paper • 2401.10891 • Published • 62
- Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild
  Paper • 2401.13627 • Published • 78

- Arbitrary-steps Image Super-resolution via Diffusion Inversion
  Paper • 2412.09013 • Published • 13
- Deep Researcher with Test-Time Diffusion
  Paper • 2507.16075 • Published • 68
- ∇NABLA: Neighborhood Adaptive Block-Level Attention
  Paper • 2507.13546 • Published • 126
- Yume: An Interactive World Generation Model
  Paper • 2507.17744 • Published • 92

- EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
  Paper • 2402.04252 • Published • 30
- Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
  Paper • 2402.03749 • Published • 15
- ScreenAI: A Vision-Language Model for UI and Infographics Understanding
  Paper • 2402.04615 • Published • 44
- EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
  Paper • 2402.05008 • Published • 23

- Boosting Latent Diffusion Models via Disentangled Representation Alignment
  Paper • 2601.05823 • Published • 17
- VQRAE: Representation Quantization Autoencoders for Multimodal Understanding, Generation and Reconstruction
  Paper • 2511.23386 • Published • 16
- Visual Generation Tuning
  Paper • 2511.23469 • Published • 16
- A Style is Worth One Code: Unlocking Code-to-Style Image Generation with Discrete Style Space
  Paper • 2511.10555 • Published • 63

- LoRACLR: Contrastive Adaptation for Customization of Diffusion Models
  Paper • 2412.09622 • Published • 8
- AnyDressing: Customizable Multi-Garment Virtual Dressing via Latent Diffusion Models
  Paper • 2412.04146 • Published • 23
- Learning Flow Fields in Attention for Controllable Person Image Generation
  Paper • 2412.08486 • Published • 36
- LoRA.rar: Learning to Merge LoRAs via Hypernetworks for Subject-Style Conditioned Image Generation
  Paper • 2412.05148 • Published • 12