- Error-Free Linear Attention is a Free Lunch: Exact Solution from Continuous-Time Dynamics
  Paper • 2512.12602 • Published • 44
- Diffusion Language Models are Super Data Learners
  Paper • 2511.03276 • Published • 132
- DoPE: Denoising Rotary Position Embedding
  Paper • 2511.09146 • Published • 98
- INT v.s. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats
  Paper • 2510.25602 • Published • 80
Collections including paper arxiv:2512.12602
- YOLO-Master: MOE-Accelerated with Specialized Transformers for Enhanced Real-time Detection
  Paper • 2512.23273 • Published • 15
- A 58-Addition, Rank-23 Scheme for General 3x3 Matrix Multiplication
  Paper • 2512.21980 • Published • 3
- Step-DeepResearch Technical Report
  Paper • 2512.20491 • Published • 87
- SAM Audio: Segment Anything in Audio
  Paper • 2512.18099 • Published • 24

- Nuclear Norm Regularization for Deep Learning
  Paper • 2405.14544 • Published • 1
- Token embeddings violate the manifold hypothesis
  Paper • 2504.01002 • Published • 1
- Approximate Nullspace Augmented Finetuning for Robust Vision Transformers
  Paper • 2403.10476 • Published • 1
- ElaLoRA: Elastic & Learnable Low-Rank Adaptation for Efficient Model Fine-Tuning
  Paper • 2504.00254 • Published • 1