paper maybe useful
Light-A-Video: Training-free Video Relighting via Progressive Light Fusion
Paper
• 2502.08590
• Published • 42
Distillation Scaling Laws
Paper
• 2502.08606
• Published • 47
Soundwave: Less is More for Speech-Text Alignment in LLMs
Paper
• 2502.12900
• Published • 86
Alias-Free Latent Diffusion Models: Improving Fractional Shift Equivariance of Diffusion Latent Space
Paper
• 2503.09419
• Published • 6
ReCamMaster: Camera-Controlled Generative Rendering from A Single Video
Paper
• 2503.11647
• Published • 148
Can Vision-Language Models Answer Face to Face Questions in the Real-World?
Paper
• 2503.19356
• Published • 2
Self-Supervised Learning of Motion Concepts by Optimizing Counterfactuals
Paper
• 2503.19953
• Published • 3
World Modeling Makes a Better Planner: Dual Preference Optimization for Embodied Task Planning
Paper
• 2503.10480
• Published • 57
TrajectoryCrafter: Redirecting Camera Trajectory for Monocular Videos via Diffusion Models
Paper
• 2503.05638
• Published • 20
Video-R1: Reinforcing Video Reasoning in MLLMs
Paper
• 2503.21776
• Published • 79
Segment Any Motion in Videos
Paper
• 2503.22268
• Published • 19
Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models
Paper
• 2504.17789
• Published • 23
Reinforcement Pre-Training
Paper
• 2506.08007
• Published • 265
Dreamland: Controllable World Creation with Simulator and Generative Models
Paper
• 2506.08006
• Published • 7
Seeing Voices: Generating A-Roll Video from Audio with Mirage
Paper
• 2506.08279
• Published • 27
PlayerOne: Egocentric World Simulator
Paper
• 2506.09995
• Published • 34
Video models are zero-shot learners and reasoners
Paper
• 2509.20328
• Published • 100