One-Minute Video Generation with Test-Time Training
Paper
• 2504.05298
• Published • 110
MoCha: Towards Movie-Grade Talking Character Synthesis
Paper
• 2503.23307
• Published • 141
Towards Understanding Camera Motions in Any Video
Paper
• 2504.15376
• Published • 157
Antidistillation Sampling
Paper
• 2504.13146
• Published • 59
TokenHSI: Unified Synthesis of Physical Human-Scene Interactions through Task Tokenization
Paper
• 2503.19901
• Published • 41
DreamActor-M1: Holistic, Expressive and Robust Human Image Animation with Hybrid Guidance
Paper
• 2504.01724
• Published • 68
Long Video Diffusion Generation with Segmented Cross-Attention and Content-Rich Video Data Curation
Paper
• 2412.01316
• Published • 10
STIV: Scalable Text and Image Conditioned Video Generation
Paper
• 2412.07730
• Published • 74
VidGen-1M: A Large-Scale Dataset for Text-to-video Generation
Paper
• 2408.02629
• Published • 15
VideoUFO: A Million-Scale User-Focused Dataset for Text-to-Video Generation
Paper
• 2503.01739
• Published • 9
Video-T1: Test-Time Scaling for Video Generation
Paper
• 2503.18942
• Published • 90
VideoGuide: Improving Video Diffusion Models without Training Through a Teacher's Guide
Paper
• 2410.04364
• Published • 29
Improving Video Generation with Human Feedback
Paper
• 2501.13918
• Published • 53
Training-free Long Video Generation with Chain of Diffusion Model Experts
Paper
• 2408.13423
• Published • 23
VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models
Paper
• 2502.02492
• Published • 66
OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models
Paper
• 2502.01061
• Published • 225
Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps
Paper
• 2501.09732
• Published • 72
LTX-Video: Realtime Video Latent Diffusion
Paper
• 2501.00103
• Published • 50
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling
Paper
• 2412.05271
• Published • 160
VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM
Paper
• 2501.00599
• Published • 46
Shifting AI Efficiency From Model-Centric to Data-Centric Compression
Paper
• 2505.19147
• Published • 145
AttentionInfluence: Adopting Attention Head Influence for Weak-to-Strong Pretraining Data Selection
Paper
• 2505.07293
• Published • 28
Alchemist: Turning Public Text-to-Image Data into Generative Gold
Paper
• 2505.19297
• Published • 84
Predictive Data Selection: The Data That Predicts Is the Data That Teaches
Paper
• 2503.00808
• Published • 57
R&B: Domain Regrouping and Data Mixture Balancing for Efficient Foundation Model Training
Paper
• 2505.00358
• Published • 26
ICon: In-Context Contribution for Automatic Data Selection
Paper
• 2505.05327
• Published • 12
SWE-smith: Scaling Data for Software Engineering Agents
Paper
• 2504.21798
• Published • 15
MoDoMoDo: Multi-Domain Data Mixtures for Multimodal LLM Reinforcement Learning
Paper
• 2505.24871
• Published • 23
Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale
Paper
• 2409.17115
• Published • 64