Collections

Collections including paper arxiv:2604.04746

Contrastive Decoding Improves Reasoning in Large Language Models

Paper • 2309.09117 • Published Sep 17, 2023 • 40
Prometheus: Inducing Fine-grained Evaluation Capability in Language Models

Paper • 2310.08491 • Published Oct 12, 2023 • 57
Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding

Paper • 2411.04282 • Published Nov 6, 2024 • 37
Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models

Paper • 2411.14432 • Published Nov 21, 2024 • 25

about 11 hours ago

Think in Strokes, Not Pixels: Process-Driven Image Generation via Interleaved Reasoning

Paper • 2604.04746 • Published 5 days ago • 61

MiCo: Multi-image Contrast for Reinforcement Visual Reasoning

Paper • 2506.22434 • Published Jun 27, 2025 • 10
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning

Paper • 2507.13348 • Published Jul 17, 2025 • 79
RewardDance: Reward Scaling in Visual Generation

Paper • 2509.08826 • Published Sep 10, 2025 • 73
Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs

Paper • 2510.18876 • Published Oct 21, 2025 • 37

AgentConductor: Topology Evolution for Multi-Agent Competition-Level Code Generation

Paper • 2602.17100 • Published Feb 19 • 4
GroupGPT: A Token-efficient and Privacy-preserving Agentic Framework for Multi-User Chat Assistant

Paper • 2603.01059 • Published Mar 1 • 1
Multi-Domain Riemannian Graph Gluing for Building Graph Foundation Models

Paper • 2603.00618 • Published Feb 28
Heterogeneous Agent Collaborative Reinforcement Learning

Paper • 2603.02604 • Published Mar 3 • 193

Stuff I'm going to read

LTX-2: Efficient Joint Audio-Visual Foundation Model

Paper • 2601.03233 • Published Jan 6 • 176
MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head

Paper • 2601.07832 • Published Jan 12 • 52
Motion Attribution for Video Generation

Paper • 2601.08828 • Published Jan 13 • 71
Post-LayerNorm Is Back: Stable, ExpressivE, and Deep

Paper • 2601.19895 • Published Jan 27 • 26

CoLLM: A Large Language Model for Composed Image Retrieval

Paper • 2503.19910 • Published Mar 25, 2025 • 15
LOCATEdit: Graph Laplacian Optimized Cross Attention for Localized Text-Guided Image Editing

Paper • 2503.21541 • Published Mar 27, 2025 • 1
HumanDreamer-X: Photorealistic Single-image Human Avatars Reconstruction via Gaussian Restoration

Paper • 2504.03536 • Published Apr 4, 2025 • 13
FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis

Paper • 2504.04842 • Published Apr 7, 2025 • 35
