InternVL3.5 Collection This collection includes all released checkpoints of InternVL3.5, covering different training stages (e.g., Pretraining, SFT, MPO, Cascade RL). • 45 items • Updated Mar 2 • 107
SAM2S: Segment Anything in Surgical Videos via Semantic Long-term Tracking Paper • 2511.16618 • Published Nov 20, 2025 • 9
Experience is the Best Teacher: Grounding VLMs for Robotics through Self-Generated Memory Paper • 2507.16713 • Published Jul 22, 2025 • 21
ThinkAct: Vision-Language-Action Reasoning via Reinforced Visual Latent Planning Paper • 2507.16815 • Published Jul 22, 2025 • 42
Pixels, Patterns, but No Poetry: To See The World like Humans Paper • 2507.16863 • Published Jul 21, 2025 • 69
nablaNABLA: Neighborhood Adaptive Block-Level Attention Paper • 2507.13546 • Published Jul 17, 2025 • 126
VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning Paper • 2503.13444 • Published Mar 17, 2025 • 20
view article Article GaLore: Advancing Large Model Training on Consumer-grade Hardware +7 Mar 20, 2024 • 32
FLM-101B: An Open LLM and How to Train It with $100K Budget Paper • 2309.03852 • Published Sep 7, 2023 • 45
CityDreamer: Compositional Generative Model of Unbounded 3D Cities Paper • 2309.00610 • Published Sep 1, 2023 • 21
Large Content And Behavior Models To Understand, Simulate, And Optimize Content And Behavior Paper • 2309.00359 • Published Sep 1, 2023 • 23
RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback Paper • 2309.00267 • Published Sep 1, 2023 • 53
ModelScope-Agent: Building Your Customizable Agent System with Open-source Large Language Models Paper • 2309.00986 • Published Sep 2, 2023 • 22