Models
Datasets
Spaces
Buckets new
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2604.01658

CORAL: Towards Autonomous Multi-Agent Evolution for Open-Ended Discovery

Paper • 2604.01658 • Published 17 days ago • 54

Rank-GRPO: Training LLM-based Conversational Recommender Systems with Reinforcement Learning

Paper • 2510.20150 • Published Oct 23, 2025 • 6
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published Nov 9, 2025 • 134
We-Math 2.0: A Versatile MathBook System for Incentivizing Visual Mathematical Reasoning

Paper • 2508.10433 • Published Aug 14, 2025 • 146
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices

Paper • 2512.01374 • Published Dec 1, 2025 • 106

Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9, 2025 • 277
Learning on the Job: An Experience-Driven Self-Evolving Agent for Long-Horizon Tasks

Paper • 2510.08002 • Published Oct 9, 2025 • 24
Self-Improving LLM Agents at Test-Time

Paper • 2510.07841 • Published Oct 9, 2025 • 10
The Denario project: Deep knowledge AI agents for scientific discovery

Paper • 2510.26887 • Published Oct 30, 2025 • 8

lusxvr/nanoVLM-222M

Image-Text-to-Text • 0.2B • Updated May 8, 2025 • 259 • 99
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12, 2025 • 39
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time

Paper • 2505.24863 • Published May 30, 2025 • 97
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23, 2025 • 88

about 19 hours ago

ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling

Paper • 2603.25746 • Published 24 days ago • 155
TAPS: Task Aware Proposal Distributions for Speculative Sampling

Paper • 2603.27027 • Published 22 days ago • 142
Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models

Paper • 2603.25716 • Published 24 days ago • 154
LongCat-Next: Lexicalizing Modalities as Discrete Tokens

Paper • 2603.27538 • Published 21 days ago • 143

Agentic AI Training and Tuning

Tongyi DeepResearch Technical Report

Paper • 2510.24701 • Published Oct 28, 2025 • 103
Kimi Linear: An Expressive, Efficient Attention Architecture

Paper • 2510.26692 • Published Oct 30, 2025 • 132
Natural-Language Agent Harnesses

Paper • 2603.25723 • Published 24 days ago • 25
CORAL: Towards Autonomous Multi-Agent Evolution for Open-Ended Discovery

Paper • 2604.01658 • Published 17 days ago • 54

StockBench: Can LLM Agents Trade Stocks Profitably In Real-world Markets?

Paper • 2510.02209 • Published Oct 2, 2025 • 57
MM-DREX: Multimodal-Driven Dynamic Routing of LLM Experts for Financial Trading

Paper • 2509.05080 • Published Sep 5, 2025
TradingGroup: A Multi-Agent Trading System with Self-Reflection and Data-Synthesis

Paper • 2508.17565 • Published Aug 25, 2025 • 1
QTMRL: An Agent for Quantitative Trading Decision-Making Based on Multi-Indicator Guided Reinforcement Learning

Paper • 2508.20467 • Published Aug 28, 2025

CORAL: Towards Autonomous Multi-Agent Evolution for Open-Ended Discovery

Paper • 2604.01658 • Published 17 days ago • 54

about 19 hours ago

ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling

Paper • 2603.25746 • Published 24 days ago • 155
TAPS: Task Aware Proposal Distributions for Speculative Sampling

Paper • 2603.27027 • Published 22 days ago • 142
Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models

Paper • 2603.25716 • Published 24 days ago • 154
LongCat-Next: Lexicalizing Modalities as Discrete Tokens

Paper • 2603.27538 • Published 21 days ago • 143

Rank-GRPO: Training LLM-based Conversational Recommender Systems with Reinforcement Learning

Paper • 2510.20150 • Published Oct 23, 2025 • 6
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published Nov 9, 2025 • 134
We-Math 2.0: A Versatile MathBook System for Incentivizing Visual Mathematical Reasoning

Paper • 2508.10433 • Published Aug 14, 2025 • 146
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices

Paper • 2512.01374 • Published Dec 1, 2025 • 106

Agentic AI Training and Tuning

Tongyi DeepResearch Technical Report

Paper • 2510.24701 • Published Oct 28, 2025 • 103
Kimi Linear: An Expressive, Efficient Attention Architecture

Paper • 2510.26692 • Published Oct 30, 2025 • 132
Natural-Language Agent Harnesses

Paper • 2603.25723 • Published 24 days ago • 25
CORAL: Towards Autonomous Multi-Agent Evolution for Open-Ended Discovery

Paper • 2604.01658 • Published 17 days ago • 54

Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9, 2025 • 277
Learning on the Job: An Experience-Driven Self-Evolving Agent for Long-Horizon Tasks

Paper • 2510.08002 • Published Oct 9, 2025 • 24
Self-Improving LLM Agents at Test-Time

Paper • 2510.07841 • Published Oct 9, 2025 • 10
The Denario project: Deep knowledge AI agents for scientific discovery

Paper • 2510.26887 • Published Oct 30, 2025 • 8

StockBench: Can LLM Agents Trade Stocks Profitably In Real-world Markets?

Paper • 2510.02209 • Published Oct 2, 2025 • 57
MM-DREX: Multimodal-Driven Dynamic Routing of LLM Experts for Financial Trading

Paper • 2509.05080 • Published Sep 5, 2025
TradingGroup: A Multi-Agent Trading System with Self-Reflection and Data-Synthesis

Paper • 2508.17565 • Published Aug 25, 2025 • 1
QTMRL: An Agent for Quantitative Trading Decision-Making Based on Multi-Indicator Guided Reinforcement Learning

Paper • 2508.20467 • Published Aug 28, 2025

lusxvr/nanoVLM-222M

Image-Text-to-Text • 0.2B • Updated May 8, 2025 • 259 • 99
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12, 2025 • 39
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time

Paper • 2505.24863 • Published May 30, 2025 • 97
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23, 2025 • 88

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs