Collections
Collections including paper arxiv:2603.20278
-
Efficient Agents: Building Effective Agents While Reducing Cost
Paper • 2508.02694 • Published • 86 -
Agentic AI Frameworks: Architectures, Protocols, and Design Challenges
Paper • 2508.10146 • Published -
Kimi K2.5: Visual Agentic Intelligence
Paper • 2602.02276 • Published • 264 -
ASTRA: Automated Synthesis of agentic Trajectories and Reinforcement Arenas
Paper • 2601.21558 • Published • 60
-
openai/gpt-oss-120b
Text Generation • 120B • Updated • 3.49M • 4.71k -
Emergent temporal abstractions in autoregressive models enable hierarchical reinforcement learning
Paper • 2512.20605 • Published • 62 -
Nested Browser-Use Learning for Agentic Information Seeking
Paper • 2512.23647 • Published • 19 -
TimeBill: Time-Budgeted Inference for Large Language Models
Paper • 2512.21859 • Published • 25
-
lusxvr/nanoVLM-222M
Image-Text-to-Text • 0.2B • Updated • 208 • 99 -
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
Paper • 2503.09516 • Published • 39 -
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
Paper • 2505.24863 • Published • 97 -
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
Paper • 2505.17667 • Published • 88
-
OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis
Paper • 2603.20278 • Published • 94 -
Rethinking Token-Level Policy Optimization for Multimodal Chain-of-Thought
Paper • 2603.22847 • Published • 26 -
Omni-SimpleMem: Autoresearch-Guided Discovery of Lifelong Multimodal Agent Memory
Paper • 2604.01007 • Published • 31
-
FlowRL: Matching Reward Distributions for LLM Reasoning
Paper • 2509.15207 • Published • 118 -
Kwaipilot/KAT-Dev-72B-Exp
Text Generation • 73B • Updated • 27 • 157 -
Agentic Entropy-Balanced Policy Optimization
Paper • 2510.14545 • Published • 108 -
Multi-Agent Deep Research: Training Multi-Agent Systems with M-GRPO
Paper • 2511.13288 • Published • 19
-
ComfyUI-R1: Exploring Reasoning Models for Workflow Generation
Paper • 2506.09790 • Published • 53 -
Saffron-1: Towards an Inference Scaling Paradigm for LLM Safety Assurance
Paper • 2506.06444 • Published • 73 -
DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents
Paper • 2506.11763 • Published • 74 -
Agentic Reasoning: Reasoning LLMs with Tools for the Deep Research
Paper • 2502.04644 • Published • 4
-
UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models
Paper • 2410.14059 • Published • 63 -
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching
Paper • 2503.05179 • Published • 46 -
Token-Efficient Long Video Understanding for Multimodal LLMs
Paper • 2503.04130 • Published • 96 -
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing
Paper • 2503.10639 • Published • 53