
Collections

Discover the best community collections!

Collections including paper arxiv:2601.16443

Agent Loops, Character, Work Ethics & Behavior

Close the Loop: Synthesizing Infinite Tool-Use Data via Multi-Agent Role-Playing

Paper • 2512.23611 • Published Dec 29, 2025 • 6
Context as a Tool: Context Management for Long-Horizon SWE-Agents

Paper • 2512.22087 • Published Dec 26, 2025 • 3
AgentScope 1.0: A Developer-Centric Framework for Building Agentic Applications

Paper • 2508.16279 • Published Aug 22, 2025 • 61
Very Large-Scale Multi-Agent Simulation in AgentScope

Paper • 2407.17789 • Published Jul 25, 2024 • 41

Endless Terminals: Scaling RL Environments for Terminal Agents

Paper • 2601.16443 • Published Jan 23 • 18
Linear representations in language models can change dramatically over a conversation

Paper • 2601.20834 • Published Jan 28 • 21
Scaling Embeddings Outperforms Scaling Experts in Language Models

Paper • 2601.21204 • Published Jan 29 • 102
Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability

Paper • 2601.18778 • Published Jan 26 • 42

Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models

Paper • 2512.24618 • Published Dec 31, 2025 • 154
Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem

Paper • 2512.24873 • Published Dec 31, 2025 • 108
AI Meets Brain: Memory Systems from Cognitive Neuroscience to Autonomous Agents

Paper • 2512.23343 • Published Dec 29, 2025 • 30
Figure It Out: Improving the Frontier of Reasoning with Active Visual Thinking

Paper • 2512.24297 • Published Dec 30, 2025 • 6

Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward

Paper • 2510.03222 • Published Oct 3, 2025 • 76
In-the-Flow Agentic System Optimization for Effective Planning and Tool Use

Paper • 2510.05592 • Published Oct 7, 2025 • 110
Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6, 2025 • 513
Multi-Agent Tool-Integrated Policy Optimization

Paper • 2510.04678 • Published Oct 6, 2025 • 31

Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models

Paper • 2602.12036 • Published Feb 12 • 93
Reinforcement Learning for Self-Improving Agent with Skill Library

Paper • 2512.17102 • Published Dec 18, 2025 • 42
Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation

Paper • 2512.23705 • Published Dec 29, 2025 • 45
Schoenfeld's Anatomy of Mathematical Reasoning by Language Models

Paper • 2512.19995 • Published Dec 23, 2025 • 16

Endless Terminals

obiwan96/endless-terminals

Updated Jan 24 • 28.8k
obiwan96/qwen3-8b-openthinker-sft-endless-terminals

Text Generation • 8B • Updated Dec 29, 2025 • 11
obiwan96/qwen-2.5-7b-instruct-endless-terminals

Text Generation • 8B • Updated Dec 17, 2025 • 5
Endless Terminals: Scaling RL Environments for Terminal Agents

Paper • 2601.16443 • Published Jan 23 • 18

The Art of Scaling Reinforcement Learning Compute for LLMs

Paper • 2510.13786 • Published Oct 15, 2025 • 33
Attention Is All You Need for KV Cache in Diffusion LLMs

Paper • 2510.14973 • Published Oct 16, 2025 • 42
BitNet Distillation

Paper • 2510.13998 • Published Oct 15, 2025 • 59
GigaBrain-0: A World Model-Powered Vision-Language-Action Model

Paper • 2510.19430 • Published Oct 22, 2025 • 53
