LongCat-Next: Lexicalizing Modalities as Discrete Tokens Paper • 2603.27538 • Published 16 days ago • 137
Article How I contributed a new model to the Transformers library using Codex 14 days ago • 45
Loc3R-VLM: Language-based Localization and 3D Reasoning with Vision-Language Models Paper • 2603.18002 • Published 26 days ago • 13
GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning Paper • 2602.12099 • Published Feb 12 • 61
Article Nemotron ColEmbed V2: Raising the Bar for Multimodal Retrieval with ViDoRe V3’s Top Model Feb 4 • 28
DynamicVLA: A Vision-Language-Action Model for Dynamic Object Manipulation Paper • 2601.22153 • Published Jan 29 • 74
NextFlow: Unified Sequential Modeling Activates Multimodal Understanding and Generation Paper • 2601.02204 • Published Jan 5 • 63
Emergent temporal abstractions in autoregressive models enable hierarchical reinforcement learning Paper • 2512.20605 • Published Dec 23, 2025 • 62