Collections
Discover the best community collections!
Collections including paper arxiv:2601.10477
-
moonshotai/Kimi-K2-Instruct-0905
Text Generation • 1T • Updated • 368k • 698 -
Xxx Wacth Videos Fb Id 1000908070605040302010
👁 1 • Watch videos from a specific Facebook ID
-
InferenceSupport
💥 504 • Discussions about the Inference Providers feature on the Hub
-
facebook/sam3
Mask Generation • 0.9B • Updated • 2.14M • 1.89k
-
WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent
Paper • 2508.05748 • Published • 142 -
Agent Lightning: Train ANY AI Agents with Reinforcement Learning
Paper • 2508.03680 • Published • 140 -
SpatialLM: Training Large Language Models for Structured Indoor Modeling
Paper • 2506.07491 • Published • 51 -
LongSplat: Robust Unposed 3D Gaussian Splatting for Casual Long Videos
Paper • 2508.14041 • Published • 59
-
InfiR : Crafting Effective Small Language Models and Multimodal Small Language Models in Reasoning
Paper • 2502.11573 • Published • 9 -
Boosting Multimodal Reasoning with MCTS-Automated Structured Thinking
Paper • 2502.02339 • Published • 23 -
video-SALMONN-o1: Reasoning-enhanced Audio-visual Large Language Model
Paper • 2502.11775 • Published • 9 -
Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search
Paper • 2412.18319 • Published • 39
-
THINKSAFE: Self-Generated Safety Alignment for Reasoning Models
Paper • 2601.23143 • Published • 39 -
PaperBanana: Automating Academic Illustration for AI Scientists
Paper • 2601.23265 • Published • 223 -
Agentic Reasoning for Large Language Models
Paper • 2601.12538 • Published • 204 -
BabyVision: Visual Reasoning Beyond Language
Paper • 2601.06521 • Published • 201
-
VQ-Seg: Vector-Quantized Token Perturbation for Semi-Supervised Medical Image Segmentation
Paper • 2601.10124 • Published • 4 -
Urban Socio-Semantic Segmentation with Vision-Language Reasoning
Paper • 2601.10477 • Published • 156 -
Medical SAM3: A Foundation Model for Universal Prompt-Driven Medical Image Segmentation
Paper • 2601.10880 • Published • 15 -
SAMTok: Representing Any Mask with Two Words
Paper • 2601.16093 • Published • 43
-
ReXGroundingCT: A 3D Chest CT Dataset for Segmentation of Findings from Free-Text Reports
Paper • 2507.22030 • Published • 4 -
Unlocking the Potential of MLLMs in Referring Expression Segmentation via a Light-weight Mask Decode
Paper • 2508.04107 • Published • 4 -
Phrase-grounded Fact-checking for Automatically Generated Chest X-ray Reports
Paper • 2509.21356 • Published -
Learning Segmentation from Radiology Reports
Paper • 2507.05582 • Published • 1
-
M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding
Paper • 2411.04952 • Published • 29 -
Diff-2-in-1: Bridging Generation and Dense Perception with Diffusion Models
Paper • 2411.05005 • Published • 13 -
M3SciQA: A Multi-Modal Multi-Document Scientific QA Benchmark for Evaluating Foundation Models
Paper • 2411.04075 • Published • 16 -
Self-Consistency Preference Optimization
Paper • 2411.04109 • Published • 19