Collections
Collections including paper arxiv:2601.21468

- mHC: Manifold-Constrained Hyper-Connections
  Paper • 2512.24880 • Published • 322
- Fantastic Reasoning Behaviors and Where to Find Them: Unsupervised Discovery of the Reasoning Process
  Paper • 2512.23988 • Published • 19
- SpaceTimePilot: Generative Rendering of Dynamic Scenes Across Space and Time
  Paper • 2512.25075 • Published • 15
- Guiding a Diffusion Transformer with the Internal Dynamics of Itself
  Paper • 2512.24176 • Published • 8

- PubTables-1M: Towards comprehensive table extraction from unstructured documents
  Paper • 2110.00061 • Published • 3
- Optimized Table Tokenization for Table Structure Recognition
  Paper • 2305.03393 • Published • 1
- Qwen3-VL Technical Report
  Paper • 2511.21631 • Published • 161
- PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model
  Paper • 2510.14528 • Published • 124

- LongRoPE2: Near-Lossless LLM Context Window Scaling
  Paper • 2502.20082 • Published • 36
- MemOCR: Layout-Aware Visual Memory for Efficient Long-Horizon Reasoning
  Paper • 2601.21468 • Published • 25
- Look Back to Reason Forward: Revisitable Memory for Long-Context LLM Agents
  Paper • 2509.23040 • Published • 12

- Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models
  Paper • 2602.12036 • Published • 93
- Reinforcement Learning for Self-Improving Agent with Skill Library
  Paper • 2512.17102 • Published • 42
- Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation
  Paper • 2512.23705 • Published • 45
- Schoenfeld's Anatomy of Mathematical Reasoning by Language Models
  Paper • 2512.19995 • Published • 16

- Self-Distillation Enables Continual Learning
  Paper • 2601.19897 • Published • 29
- MemOCR: Layout-Aware Visual Memory for Efficient Long-Horizon Reasoning
  Paper • 2601.21468 • Published • 25
- Look Back to Reason Forward: Revisitable Memory for Long-Context LLM Agents
  Paper • 2509.23040 • Published • 12

- MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding
  Paper • 2503.13964 • Published • 20
- RLinf-VLA: A Unified and Efficient Framework for VLA+RL Training
  Paper • 2510.06710 • Published • 43
- VIDEOP2R: Video Understanding from Perception to Reasoning
  Paper • 2511.11113 • Published • 112
- Aligned but Stereotypical? The Hidden Influence of System Prompts on Social Bias in LVLM-Based Text-to-Image Models
  Paper • 2512.04981 • Published • 9