Models
Datasets
Spaces
Buckets new
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2604.02176

about 18 hours ago

ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling

Paper • 2603.25746 • Published 24 days ago • 155
TAPS: Task Aware Proposal Distributions for Speculative Sampling

Paper • 2603.27027 • Published 22 days ago • 142
Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models

Paper • 2603.25716 • Published 24 days ago • 154
LongCat-Next: Lexicalizing Modalities as Discrete Tokens

Paper • 2603.27538 • Published 21 days ago • 143

mHC: Manifold-Constrained Hyper-Connections

Paper • 2512.24880 • Published Dec 31, 2025 • 322
Fantastic Reasoning Behaviors and Where to Find Them: Unsupervised Discovery of the Reasoning Process

Paper • 2512.23988 • Published Dec 30, 2025 • 19
SpaceTimePilot: Generative Rendering of Dynamic Scenes Across Space and Time

Paper • 2512.25075 • Published Dec 31, 2025 • 15
Guiding a Diffusion Transformer with the Internal Dynamics of Itself

Paper • 2512.24176 • Published Dec 30, 2025 • 8

Canvas-to-Image: Compositional Image Generation with Multimodal Controls

Paper • 2511.21691 • Published Nov 26, 2025 • 36
Agentic Learner with Grow-and-Refine Multimodal Semantic Memory

Paper • 2511.21678 • Published Nov 26, 2025 • 12
ENACT: Evaluating Embodied Cognition with World Modeling of Egocentric Interaction

Paper • 2511.20937 • Published Nov 26, 2025 • 16
Are We Ready for RL in Text-to-3D Generation? A Progressive Investigation

Paper • 2512.10949 • Published Dec 11, 2025 • 47

BitNet: Scaling 1-bit Transformers for Large Language Models

Paper • 2310.11453 • Published Oct 17, 2023 • 107
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection

Paper • 2310.11511 • Published Oct 17, 2023 • 80
In-Context Learning Creates Task Vectors

Paper • 2310.15916 • Published Oct 24, 2023 • 43
Matryoshka Diffusion Models

Paper • 2310.15111 • Published Oct 23, 2023 • 45

dLLM: Simple Diffusion Language Modeling

Paper • 2602.22661 • Published Feb 26 • 152
OpenSeeker: Democratizing Frontier Search Agents by Fully Open-Sourcing Training Data

Paper • 2603.15594 • Published Mar 16 • 149
Qianfan-OCR: A Unified End-to-End Model for Document Intelligence

Paper • 2603.13398 • Published Mar 11 • 153
Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders

Paper • 2603.06569 • Published Mar 6 • 119

The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain

Paper • 2509.26507 • Published Sep 30, 2025 • 550
mHC: Manifold-Constrained Hyper-Connections

Paper • 2512.24880 • Published Dec 31, 2025 • 322
NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos

Paper • 2601.00393 • Published Jan 1 • 133
LTX-2: Efficient Joint Audio-Visual Foundation Model

Paper • 2601.03233 • Published Jan 6 • 176

about 3 hours ago

Visual Spatial Tuning

Paper • 2511.05491 • Published Nov 7, 2025 • 53
Adam's Law: Textual Frequency Law on Large Language Models

Paper • 2604.02176 • Published 17 days ago • 480
Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation

Paper • 2604.10098 • Published 8 days ago • 74
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Paper • 2604.13016 • Published 5 days ago • 77

about 18 hours ago

ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling

Paper • 2603.25746 • Published 24 days ago • 155
TAPS: Task Aware Proposal Distributions for Speculative Sampling

Paper • 2603.27027 • Published 22 days ago • 142
Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models

Paper • 2603.25716 • Published 24 days ago • 154
LongCat-Next: Lexicalizing Modalities as Discrete Tokens

Paper • 2603.27538 • Published 21 days ago • 143

dLLM: Simple Diffusion Language Modeling

Paper • 2602.22661 • Published Feb 26 • 152
OpenSeeker: Democratizing Frontier Search Agents by Fully Open-Sourcing Training Data

Paper • 2603.15594 • Published Mar 16 • 149
Qianfan-OCR: A Unified End-to-End Model for Document Intelligence

Paper • 2603.13398 • Published Mar 11 • 153
Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders

Paper • 2603.06569 • Published Mar 6 • 119

mHC: Manifold-Constrained Hyper-Connections

Paper • 2512.24880 • Published Dec 31, 2025 • 322
Fantastic Reasoning Behaviors and Where to Find Them: Unsupervised Discovery of the Reasoning Process

Paper • 2512.23988 • Published Dec 30, 2025 • 19
SpaceTimePilot: Generative Rendering of Dynamic Scenes Across Space and Time

Paper • 2512.25075 • Published Dec 31, 2025 • 15
Guiding a Diffusion Transformer with the Internal Dynamics of Itself

Paper • 2512.24176 • Published Dec 30, 2025 • 8

The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain

Paper • 2509.26507 • Published Sep 30, 2025 • 550
mHC: Manifold-Constrained Hyper-Connections

Paper • 2512.24880 • Published Dec 31, 2025 • 322
NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos

Paper • 2601.00393 • Published Jan 1 • 133
LTX-2: Efficient Joint Audio-Visual Foundation Model

Paper • 2601.03233 • Published Jan 6 • 176

Canvas-to-Image: Compositional Image Generation with Multimodal Controls

Paper • 2511.21691 • Published Nov 26, 2025 • 36
Agentic Learner with Grow-and-Refine Multimodal Semantic Memory

Paper • 2511.21678 • Published Nov 26, 2025 • 12
ENACT: Evaluating Embodied Cognition with World Modeling of Egocentric Interaction

Paper • 2511.20937 • Published Nov 26, 2025 • 16
Are We Ready for RL in Text-to-3D Generation? A Progressive Investigation

Paper • 2512.10949 • Published Dec 11, 2025 • 47

about 3 hours ago

Visual Spatial Tuning

Paper • 2511.05491 • Published Nov 7, 2025 • 53
Adam's Law: Textual Frequency Law on Large Language Models

Paper • 2604.02176 • Published 17 days ago • 480
Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation

Paper • 2604.10098 • Published 8 days ago • 74
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Paper • 2604.13016 • Published 5 days ago • 77

BitNet: Scaling 1-bit Transformers for Large Language Models

Paper • 2310.11453 • Published Oct 17, 2023 • 107
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection

Paper • 2310.11511 • Published Oct 17, 2023 • 80
In-Context Learning Creates Task Vectors

Paper • 2310.15916 • Published Oct 24, 2023 • 43
Matryoshka Diffusion Models

Paper • 2310.15111 • Published Oct 23, 2023 • 45

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs