Collections including paper arxiv:2602.12675

- LTX-2: Efficient Joint Audio-Visual Foundation Model
  Paper • 2601.03233 • Published • 176
- MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head
  Paper • 2601.07832 • Published • 52
- Motion Attribution for Video Generation
  Paper • 2601.08828 • Published • 72
- Post-LayerNorm Is Back: Stable, Expressive, and Deep
  Paper • 2601.19895 • Published • 27

- Sparse VideoGen2: Accelerate Video Generation with Sparse Attention via Semantic-Aware Permutation
  Paper • 2505.18875 • Published • 42
- PAROAttention: Pattern-Aware ReOrdering for Efficient Sparse and Quantized Attention in Visual Generation Models
  Paper • 2506.16054 • Published • 60
- SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration
  Paper • 2410.02367 • Published • 50
- Radial Attention: O(n log n) Sparse Attention with Energy Decay for Long Video Generation
  Paper • 2506.19852 • Published • 42

- SageAttention2 Technical Report: Accurate 4 Bit Attention for Plug-and-play Inference Acceleration
  Paper • 2411.10958 • Published • 57
- SpargeAttn: Accurate Sparse Attention Accelerating Any Model Inference
  Paper • 2502.18137 • Published • 60
- SageAttention3: Microscaling FP4 Attention for Inference and An Exploration of 8-Bit Training
  Paper • 2505.11594 • Published • 75
- SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration
  Paper • 2410.02367 • Published • 50

- Towards Scalable Pre-training of Visual Tokenizers for Generation
  Paper • 2512.13687 • Published • 106
- MMGR: Multi-Modal Generative Reasoning
  Paper • 2512.14691 • Published • 121
- Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss
  Paper • 2512.23447 • Published • 99
- LiveTalk: Real-Time Multimodal Interactive Video Diffusion via Improved On-Policy Distillation
  Paper • 2512.23576 • Published • 66

- MiniMax-01: Scaling Foundation Models with Lightning Attention
  Paper • 2501.08313 • Published • 302
- Lizard: An Efficient Linearization Framework for Large Language Models
  Paper • 2507.09025 • Published • 19
- On the Expressiveness of Softmax Attention: A Recurrent Neural Network Perspective
  Paper • 2507.23632 • Published • 6
- Causal Attention with Lookahead Keys
  Paper • 2509.07301 • Published • 21

- Can Large Language Models Understand Context?
  Paper • 2402.00858 • Published • 24
- OLMo: Accelerating the Science of Language Models
  Paper • 2402.00838 • Published • 85
- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 153
- SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
  Paper • 2401.17072 • Published • 25