Models
Datasets
Spaces
Buckets new
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2602.08234

Exploratory Memory-Augmented LLM Agent via Hybrid On- and Off-Policy Optimization

Paper • 2602.23008 • Published Feb 26 • 37
SELAUR: Self Evolving LLM Agent via Uncertainty-aware Rewards

Paper • 2602.21158 • Published Feb 24 • 1
MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild

Paper • 2603.17187 • Published Mar 17 • 138
SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning

Paper • 2602.08234 • Published Feb 9 • 74

Agent Knowledge

SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning

Paper • 2602.08234 • Published Feb 9 • 74
When to Memorize and When to Stop: Gated Recurrent Memory for Long-Context Reasoning

Paper • 2602.10560 • Published Feb 11 • 31
SimpleMem: Efficient Lifelong Memory for LLM Agents

Paper • 2601.02553 • Published Jan 5 • 37
Beyond RAG for Agent Memory: Retrieval by Decoupling and Aggregation

Paper • 2602.02007 • Published Feb 2 • 18

Data and other things

MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval

Paper • 2412.14475 • Published Dec 19, 2024 • 58
How to Synthesize Text Data without Model Collapse?

Paper • 2412.14689 • Published Dec 19, 2024 • 53
Token-Budget-Aware LLM Reasoning

Paper • 2412.18547 • Published Dec 24, 2024 • 46
WavePulse: Real-time Content Analytics of Radio Livestreams

Paper • 2412.17998 • Published Dec 23, 2024 • 11

GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning

Paper • 2602.12099 • Published Feb 12 • 61
When to Memorize and When to Stop: Gated Recurrent Memory for Long-Context Reasoning

Paper • 2602.10560 • Published Feb 11 • 31
G-LNS: Generative Large Neighborhood Search for LLM-Based Automatic Heuristic Design

Paper • 2602.08253 • Published Feb 9 • 26
ROCKET: Rapid Optimization via Calibration-guided Knapsack Enhanced Truncation for Efficient Model Compression

Paper • 2602.11008 • Published Feb 11 • 18

STEP3-VL-10B Technical Report

Paper • 2601.09668 • Published Jan 14 • 195
Advancing Open-source World Models

Paper • 2601.20540 • Published Jan 28 • 135
Youtu-Agent: Scaling Agent Productivity with Automated Generation and Hybrid Policy Optimization

Paper • 2512.24615 • Published Dec 31, 2025 • 119
SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning

Paper • 2602.08234 • Published Feb 9 • 74

BitNet: Scaling 1-bit Transformers for Large Language Models

Paper • 2310.11453 • Published Oct 17, 2023 • 107
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection

Paper • 2310.11511 • Published Oct 17, 2023 • 80
In-Context Learning Creates Task Vectors

Paper • 2310.15916 • Published Oct 24, 2023 • 43
Matryoshka Diffusion Models

Paper • 2310.15111 • Published Oct 23, 2023 • 45

Exploratory Memory-Augmented LLM Agent via Hybrid On- and Off-Policy Optimization

Paper • 2602.23008 • Published Feb 26 • 37
SELAUR: Self Evolving LLM Agent via Uncertainty-aware Rewards

Paper • 2602.21158 • Published Feb 24 • 1
MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild

Paper • 2603.17187 • Published Mar 17 • 138
SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning

Paper • 2602.08234 • Published Feb 9 • 74

GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning

Paper • 2602.12099 • Published Feb 12 • 61
When to Memorize and When to Stop: Gated Recurrent Memory for Long-Context Reasoning

Paper • 2602.10560 • Published Feb 11 • 31
G-LNS: Generative Large Neighborhood Search for LLM-Based Automatic Heuristic Design

Paper • 2602.08253 • Published Feb 9 • 26
ROCKET: Rapid Optimization via Calibration-guided Knapsack Enhanced Truncation for Efficient Model Compression

Paper • 2602.11008 • Published Feb 11 • 18

Agent Knowledge

SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning

Paper • 2602.08234 • Published Feb 9 • 74
When to Memorize and When to Stop: Gated Recurrent Memory for Long-Context Reasoning

Paper • 2602.10560 • Published Feb 11 • 31
SimpleMem: Efficient Lifelong Memory for LLM Agents

Paper • 2601.02553 • Published Jan 5 • 37
Beyond RAG for Agent Memory: Retrieval by Decoupling and Aggregation

Paper • 2602.02007 • Published Feb 2 • 18

STEP3-VL-10B Technical Report

Paper • 2601.09668 • Published Jan 14 • 195
Advancing Open-source World Models

Paper • 2601.20540 • Published Jan 28 • 135
Youtu-Agent: Scaling Agent Productivity with Automated Generation and Hybrid Policy Optimization

Paper • 2512.24615 • Published Dec 31, 2025 • 119
SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning

Paper • 2602.08234 • Published Feb 9 • 74

Data and other things

MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval

Paper • 2412.14475 • Published Dec 19, 2024 • 58
How to Synthesize Text Data without Model Collapse?

Paper • 2412.14689 • Published Dec 19, 2024 • 53
Token-Budget-Aware LLM Reasoning

Paper • 2412.18547 • Published Dec 24, 2024 • 46
WavePulse: Real-time Content Analytics of Radio Livestreams

Paper • 2412.17998 • Published Dec 23, 2024 • 11

BitNet: Scaling 1-bit Transformers for Large Language Models

Paper • 2310.11453 • Published Oct 17, 2023 • 107
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection

Paper • 2310.11511 • Published Oct 17, 2023 • 80
In-Context Learning Creates Task Vectors

Paper • 2310.15916 • Published Oct 24, 2023 • 43
Matryoshka Diffusion Models

Paper • 2310.15111 • Published Oct 23, 2023 • 45

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs