-
Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models
Paper • 2602.12036 • Published • 93 -
Reinforcement Learning for Self-Improving Agent with Skill Library
Paper • 2512.17102 • Published • 42 -
Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation
Paper • 2512.23705 • Published • 45 -
Schoenfeld's Anatomy of Mathematical Reasoning by Language Models
Paper • 2512.19995 • Published • 16
Collections
Discover the best community collections!
Collections including paper arxiv:2602.12036
-
Endless Terminals: Scaling RL Environments for Terminal Agents
Paper • 2601.16443 • Published • 18 -
Linear representations in language models can change dramatically over a conversation
Paper • 2601.20834 • Published • 21 -
Scaling Embeddings Outperforms Scaling Experts in Language Models
Paper • 2601.21204 • Published • 102 -
Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability
Paper • 2601.18778 • Published • 42
-
MinerU Document Extraction Tools
📚582Easy converting PDF and Office docs into Markdown and JSON
-
RAG-Anything: All-in-One RAG Framework
Paper • 2510.12323 • Published • 73 -
Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models
Paper • 2602.12036 • Published • 93 -
Chain of Mindset: Reasoning with Adaptive Cognitive Modes
Paper • 2602.10063 • Published • 75
-
Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning
Paper • 2407.20798 • Published • 24 -
Offline Reinforcement Learning for LLM Multi-Step Reasoning
Paper • 2412.16145 • Published • 38 -
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models
Paper • 2501.03262 • Published • 104 -
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution
Paper • 2502.18449 • Published • 75
-
Demystifying Reinforcement Learning in Agentic Reasoning
Paper • 2510.11701 • Published • 33 -
LoongRL:Reinforcement Learning for Advanced Reasoning over Long Contexts
Paper • 2510.19363 • Published • 63 -
Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning
Paper • 2510.25992 • Published • 48 -
Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence
Paper • 2511.07384 • Published • 19
-
UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models
Paper • 2410.14059 • Published • 63 -
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching
Paper • 2503.05179 • Published • 46 -
Token-Efficient Long Video Understanding for Multimodal LLMs
Paper • 2503.04130 • Published • 96 -
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing
Paper • 2503.10639 • Published • 53
-
ControlLLM: Augment Language Models with Tools by Searching on Graphs
Paper • 2310.17796 • Published • 18 -
Fast Chain-of-Thought: A Glance of Future from Parallel Decoding Leads to Answers Faster
Paper • 2311.08263 • Published • 16 -
Kimi k1.5: Scaling Reinforcement Learning with LLMs
Paper • 2501.12599 • Published • 128 -
ARR: Question Answering with Large Language Models via Analyzing, Retrieving, and Reasoning
Paper • 2502.04689 • Published • 9
-
Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models
Paper • 2602.12036 • Published • 93 -
Reinforcement Learning for Self-Improving Agent with Skill Library
Paper • 2512.17102 • Published • 42 -
Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation
Paper • 2512.23705 • Published • 45 -
Schoenfeld's Anatomy of Mathematical Reasoning by Language Models
Paper • 2512.19995 • Published • 16
-
Endless Terminals: Scaling RL Environments for Terminal Agents
Paper • 2601.16443 • Published • 18 -
Linear representations in language models can change dramatically over a conversation
Paper • 2601.20834 • Published • 21 -
Scaling Embeddings Outperforms Scaling Experts in Language Models
Paper • 2601.21204 • Published • 102 -
Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability
Paper • 2601.18778 • Published • 42
-
Demystifying Reinforcement Learning in Agentic Reasoning
Paper • 2510.11701 • Published • 33 -
LoongRL:Reinforcement Learning for Advanced Reasoning over Long Contexts
Paper • 2510.19363 • Published • 63 -
Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning
Paper • 2510.25992 • Published • 48 -
Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence
Paper • 2511.07384 • Published • 19
-
MinerU Document Extraction Tools
📚582Easy converting PDF and Office docs into Markdown and JSON
-
RAG-Anything: All-in-One RAG Framework
Paper • 2510.12323 • Published • 73 -
Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models
Paper • 2602.12036 • Published • 93 -
Chain of Mindset: Reasoning with Adaptive Cognitive Modes
Paper • 2602.10063 • Published • 75
-
UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models
Paper • 2410.14059 • Published • 63 -
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching
Paper • 2503.05179 • Published • 46 -
Token-Efficient Long Video Understanding for Multimodal LLMs
Paper • 2503.04130 • Published • 96 -
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing
Paper • 2503.10639 • Published • 53
-
Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning
Paper • 2407.20798 • Published • 24 -
Offline Reinforcement Learning for LLM Multi-Step Reasoning
Paper • 2412.16145 • Published • 38 -
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models
Paper • 2501.03262 • Published • 104 -
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution
Paper • 2502.18449 • Published • 75
-
ControlLLM: Augment Language Models with Tools by Searching on Graphs
Paper • 2310.17796 • Published • 18 -
Fast Chain-of-Thought: A Glance of Future from Parallel Decoding Leads to Answers Faster
Paper • 2311.08263 • Published • 16 -
Kimi k1.5: Scaling Reinforcement Learning with LLMs
Paper • 2501.12599 • Published • 128 -
ARR: Question Answering with Large Language Models via Analyzing, Retrieving, and Reasoning
Paper • 2502.04689 • Published • 9