Lightning OPD: Efficient Post-Training for Large Reasoning Models with Offline On-Policy Distillation • Paper • 2604.13010 • Published 3 days ago • 8
Toward Autonomous Long-Horizon Engineering for ML Research • Paper • 2604.13018 • Published 3 days ago • 28
ScheMatiQ: From Research Question to Structured Data through Interactive Schema Discovery • Paper • 2604.09237 • Published 7 days ago • 9 • 3
The Master Key Hypothesis: Unlocking Cross-Model Capability Transfer via Linear Subspace Alignment • Paper • 2604.06377 • Published 9 days ago • 7
Flux Attention: Context-Aware Hybrid Attention for Efficient LLMs Inference • Paper • 2604.07394 • Published 9 days ago • 16
Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability • Paper • 2604.06628 • Published 9 days ago • 310
Generation Models Know Space: Unleashing Implicit 3D Priors for Scene Understanding • Paper • 2603.19235 • Published 28 days ago • 95
Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment • Paper • 2502.16894 • Published Feb 24, 2025 • 33
The Depth Ceiling: On the Limits of Large Language Models in Discovering Latent Planning • Paper • 2604.06427 • Published 10 days ago • 11
TC-AE: Unlocking Token Capacity for Deep Compression Autoencoders • Paper • 2604.07340 • Published 9 days ago • 16
MARS: Enabling Autoregressive Models Multi-Token Generation • Paper • 2604.07023 • Published 9 days ago • 38