ProactiveBench: Benchmarking Proactiveness in Multimodal Large Language Models Paper • 2603.19466 • Published 29 days ago • 41
Data Repetition Beats Data Scaling in Long-CoT Supervised Fine-Tuning Paper • 2602.11149 • Published Feb 11 • 17
Franca: Nested Matryoshka Clustering for Scalable Visual Representation Learning Paper • 2507.14137 • Published Jul 18, 2025 • 36
KV Cache Steering for Inducing Reasoning in Small Language Models Paper • 2507.08799 • Published Jul 11, 2025 • 40
NeCo: Improving DINOv2's spatial representations in 19 GPU hours with Patch Neighbor Consistency Paper • 2408.11054 • Published Aug 20, 2024 • 14