LPM 1.0: Video-based Character Performance Model Paper • 2604.07823 • Published 7 days ago • 68
TriAttention: Efficient Long Reasoning with Trigonometric KV Compression Paper • 2604.04921 • Published 10 days ago • 107
SpatialEdit: Benchmarking Fine-Grained Image Spatial Editing Paper • 2604.04911 • Published 10 days ago • 35
Learning to Reason in 4D: Dynamic Spatial Understanding for Vision Language Models Paper • 2512.20557 • Published Dec 23, 2025 • 51
The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding Paper • 2512.19693 • Published Dec 22, 2025 • 67
Running on Zero Featured 496 Qwen Image Layered 🚀 496 Decompose images into editable layers and download them
Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance Paper • 2512.08765 • Published Dec 9, 2025 • 134
LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling Paper • 2511.20785 • Published Nov 25, 2025 • 189
Diffusion Language Models are Super Data Learners Paper • 2511.03276 • Published Nov 5, 2025 • 132
Running on Zero Featured 419 LBM Relighting ✨ 419 Fast image relighting using Latent Bridge Matching
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs Paper • 2510.11696 • Published Oct 13, 2025 • 182
LongLive: Real-time Interactive Long Video Generation Paper • 2509.22622 • Published Sep 26, 2025 • 189
Grounded SAM: Assembling Open-World Models for Diverse Visual Tasks Paper • 2401.14159 • Published Jan 25, 2024 • 6
Emerging Properties in Unified Multimodal Pretraining Paper • 2505.14683 • Published May 20, 2025 • 134