Collections
Collections including paper arxiv:2605.27365
-
Fast-SAM3D: 3Dfy Anything in Images but Faster
Paper • 2602.05293 • Published • 2
Stroke of Surprise: Progressive Semantic Illusions in Vector Sketching
Paper • 2602.12280 • Published • 34
CADEvolve: Creating Realistic CAD via Program Evolution
Paper • 2602.16317 • Published • 30
SketchDynamics: Exploring Free-Form Sketches for Dynamic Intent Expression in Animation Generation
Paper • 2601.20622 • Published • 2
-
LLM Pruning and Distillation in Practice: The Minitron Approach
Paper • 2408.11796 • Published • 61
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering
Paper • 2408.09174 • Published • 53
To Code, or Not To Code? Exploring Impact of Code in Pre-training
Paper • 2408.10914 • Published • 45
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications
Paper • 2408.11878 • Published • 64
-
FAN: Fourier Analysis Networks
Paper • 2410.02675 • Published • 29
Tensor Product Attention Is All You Need
Paper • 2501.06425 • Published • 91
Scalable-Softmax Is Superior for Attention
Paper • 2501.19399 • Published • 25
EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling
Paper • 2502.09509 • Published • 9
-
Warp-as-History: Generalizable Camera-Controlled Video Generation from One Training Video
Paper • 2605.15182 • Published • 39
STALE: Can LLM Agents Know When Their Memories Are No Longer Valid?
Paper • 2605.06527 • Published • 44
Learning to Build the Environment: Self-Evolving Reasoning RL via Verifiable Environment Synthesis
Paper • 2605.14392 • Published • 8
World Action Models: The Next Frontier in Embodied AI
Paper • 2605.12090 • Published • 66
-
MiCo: Multi-image Contrast for Reinforcement Visual Reasoning
Paper • 2506.22434 • Published • 10
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning
Paper • 2507.13348 • Published • 79
RewardDance: Reward Scaling in Visual Generation
Paper • 2509.08826 • Published • 73
Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs
Paper • 2510.18876 • Published • 37