stuff i never have time to read
• CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training (arXiv:2504.13161)
• Hebbian Learning based Orthogonal Projection for Continual Learning of Spiking Neural Networks (arXiv:2402.11984)
• BlackGoose Rimer: Harnessing RWKV-7 as a Simple yet Superior Replacement for Transformers in Large-Scale Time Series Modeling (arXiv:2503.06121)
• Timer: Transformers for Time Series Analysis at Scale (arXiv:2402.02368)
• Timer-XL: Long-Context Transformers for Unified Time Series Forecasting (arXiv:2410.04803)
• Time-MoE: Billion-Scale Time Series Foundation Models with Mixture of Experts (arXiv:2409.16040)
• Packing Input Frame Context in Next-Frame Prediction Models for Video Generation (arXiv:2504.12626)
• One RL to See Them All: Visual Triple Unified Reinforcement Learning (arXiv:2505.18129)
• Ming-Omni: A Unified Multimodal Model for Perception and Generation (arXiv:2506.09344)
• APE: Faster and Longer Context-Augmented Generation via Adaptive Parallel Encoding (arXiv:2502.05431)
• Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities (arXiv:2507.06261)
• UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning (arXiv:2509.02544)
• Qwen3-ASR Technical Report (arXiv:2601.21337)