Gen-Searcher: Reinforcing Agentic Search for Image Generation Paper • 2603.28767 • Published 17 days ago • 57
UniGRPO: Unified Policy Optimization for Reasoning-Driven Visual Generation Paper • 2603.23500 • Published 23 days ago • 35
Manifold-Aware Exploration for Reinforcement Learning in Video Generation Paper • 2603.21872 • Published 24 days ago • 33
iFSQ: Improving FSQ for Image Generation with 1 Line of Code Paper • 2601.17124 • Published Jan 23 • 33
iFSQ: Improving FSQ for Image Generation with 1 Line of Code Paper • 2601.17124 • Published Jan 23 • 33
JarvisEvo: Towards a Self-Evolving Photo Editing Agent with Synergistic Editor-Evaluator Optimization Paper • 2511.23002 • Published Nov 28, 2025 • 26
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published Dec 2, 2025 • 265
MultiShotMaster: A Controllable Multi-Shot Video Generation Framework Paper • 2512.03041 • Published Dec 2, 2025 • 65
AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information? Paper • 2412.02611 • Published Dec 3, 2024 • 25
Iterative Length-Regularized Direct Preference Optimization: A Case Study on Improving 7B Language Models to GPT-4 Level Paper • 2406.11817 • Published Jun 17, 2024 • 13
Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities Paper • 2401.14405 • Published Jan 25, 2024 • 13
Text-to-3D Generation with Bidirectional Diffusion using both 2D and 3D priors Paper • 2312.04963 • Published Dec 7, 2023 • 17
OneLLM: One Framework to Align All Modalities with Language Paper • 2312.03700 • Published Dec 6, 2023 • 24