Collections
Collections including paper arxiv:2604.01658
-
Rank-GRPO: Training LLM-based Conversational Recommender Systems with Reinforcement Learning
Paper • 2510.20150 • Published • 6
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B
Paper • 2511.06221 • Published • 134
We-Math 2.0: A Versatile MathBook System for Incentivizing Visual Mathematical Reasoning
Paper • 2508.10433 • Published • 146
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices
Paper • 2512.01374 • Published • 106
-
Agent Learning via Early Experience
Paper • 2510.08558 • Published • 277
Learning on the Job: An Experience-Driven Self-Evolving Agent for Long-Horizon Tasks
Paper • 2510.08002 • Published • 24
Self-Improving LLM Agents at Test-Time
Paper • 2510.07841 • Published • 10
The Denario project: Deep knowledge AI agents for scientific discovery
Paper • 2510.26887 • Published • 8
-
lusxvr/nanoVLM-222M
Image-Text-to-Text • 0.2B • Updated • 259 • 99
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
Paper • 2503.09516 • Published • 39
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
Paper • 2505.24863 • Published • 97
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
Paper • 2505.17667 • Published • 88
-
ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling
Paper • 2603.25746 • Published • 155
TAPS: Task Aware Proposal Distributions for Speculative Sampling
Paper • 2603.27027 • Published • 142
Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models
Paper • 2603.25716 • Published • 154
LongCat-Next: Lexicalizing Modalities as Discrete Tokens
Paper • 2603.27538 • Published • 143
-
Tongyi DeepResearch Technical Report
Paper • 2510.24701 • Published • 103
Kimi Linear: An Expressive, Efficient Attention Architecture
Paper • 2510.26692 • Published • 132
Natural-Language Agent Harnesses
Paper • 2603.25723 • Published • 25
CORAL: Towards Autonomous Multi-Agent Evolution for Open-Ended Discovery
Paper • 2604.01658 • Published • 54
-
StockBench: Can LLM Agents Trade Stocks Profitably In Real-world Markets?
Paper • 2510.02209 • Published • 57
MM-DREX: Multimodal-Driven Dynamic Routing of LLM Experts for Financial Trading
Paper • 2509.05080 • Published
TradingGroup: A Multi-Agent Trading System with Self-Reflection and Data-Synthesis
Paper • 2508.17565 • Published • 1
QTMRL: An Agent for Quantitative Trading Decision-Making Based on Multi-Indicator Guided Reinforcement Learning
Paper • 2508.20467 • Published