- Arbitrary-steps Image Super-resolution via Diffusion Inversion
  Paper • 2412.09013 • Published • 13
- Deep Researcher with Test-Time Diffusion
  Paper • 2507.16075 • Published • 68
- ∇NABLA: Neighborhood Adaptive Block-Level Attention
  Paper • 2507.13546 • Published • 126
- Yume: An Interactive World Generation Model
  Paper • 2507.17744 • Published • 92
Collections including paper arxiv:2412.09013
- Arbitrary-steps Image Super-resolution via Diffusion Inversion
  Paper • 2412.09013 • Published • 13
- SeedVR: Seeding Infinity in Diffusion Transformer Towards Generic Video Restoration
  Paper • 2501.01320 • Published • 12
- V-Bridge: Bridging Video Generative Priors to Versatile Few-shot Image Restoration
  Paper • 2603.13089 • Published • 13
- EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
  Paper • 2402.04252 • Published • 30
- Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
  Paper • 2402.03749 • Published • 15
- ScreenAI: A Vision-Language Model for UI and Infographics Understanding
  Paper • 2402.04615 • Published • 44
- EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
  Paper • 2402.05008 • Published • 23