Less is More: Recursive Reasoning with Tiny Networks
Paper
• 2510.04871
• Published • 513
Cache-to-Cache: Direct Semantic Communication Between Large Language Models
Paper
• 2510.03215
• Published • 99
When Thoughts Meet Facts: Reusable Reasoning for Long-Context LMs
Paper
• 2510.07499
• Published • 49
StreamingVLM: Real-Time Understanding for Infinite Video Streams
Paper
• 2510.09608
• Published • 52
LiteStage: Latency-aware Layer Skipping for Multi-stage Reasoning
Paper
• 2510.14211
• Published • 9
Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning
Paper
• 2510.19338
• Published • 117
LightMem: Lightweight and Efficient Memory-Augmented Generation
Paper
• 2510.18866
• Published • 115
Glyph: Scaling Context Windows via Visual-Text Compression
Paper
• 2510.17800
• Published • 69
DeepSeek-OCR: Contexts Optical Compression
Paper
• 2510.18234
• Published • 93
Deep Self-Evolving Reasoning
Paper
• 2510.17498
• Published • 12
Latent Sketchpad: Sketching Visual Thoughts to Elicit Multimodal Reasoning in MLLMs
Paper
• 2510.24514
• Published • 22
The End of Manual Decoding: Towards Truly End-to-End Language Models
Paper
• 2510.26697
• Published • 119
Exploring Conditions for Diffusion models in Robotic Control
Paper
• 2510.15510
• Published • 40
Kimi Linear: An Expressive, Efficient Attention Architecture
Paper
• 2510.26692
• Published • 132
Delta Attention: Fast and Accurate Sparse Attention Inference by Delta Correction
Paper
• 2505.11254
• Published • 49