-
Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy
Paper • 2507.01352 • Published • 60 -
A Data-Centric Framework for Addressing Phonetic and Prosodic Challenges in Russian Speech Generative Models
Paper • 2507.13563 • Published • 53 -
Scaling Laws for Optimal Data Mixtures
Paper • 2507.09404 • Published • 38 -
Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation
Paper • 2511.14993 • Published • 233
Collections
Discover the best community collections!
Collections including paper arxiv:2508.21148
-
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning
Paper • 2505.24726 • Published • 282 -
Reinforcement Pre-Training
Paper • 2506.08007 • Published • 265 -
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
Paper • 2507.01006 • Published • 254 -
A Survey of Context Engineering for Large Language Models
Paper • 2507.13334 • Published • 263
-
A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers
Paper • 2508.21148 • Published • 142 -
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
Paper • 2504.10479 • Published • 308 -
Fara-7B: An Efficient Agentic Model for Computer Use
Paper • 2511.19663 • Published • 17
-
Describe What You See with Multimodal Large Language Models to Enhance Video Recommendations
Paper • 2508.09789 • Published • 5 -
MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents
Paper • 2508.13186 • Published • 20 -
ZARA: Zero-shot Motion Time-Series Analysis via Knowledge and Retrieval Driven LLM Agents
Paper • 2508.04038 • Published • 1 -
Prompt Orchestration Markup Language
Paper • 2508.13948 • Published • 48
-
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper • 2509.02547 • Published • 238 -
A Survey of Reinforcement Learning for Large Reasoning Models
Paper • 2509.08827 • Published • 193 -
A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers
Paper • 2508.21148 • Published • 142
-
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper • 2402.17764 • Published • 628 -
MiniMax-01: Scaling Foundation Models with Lightning Attention
Paper • 2501.08313 • Published • 302 -
Group Sequence Policy Optimization
Paper • 2507.18071 • Published • 320 -
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth
Paper • 2509.03867 • Published • 213
-
A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers
Paper • 2508.21148 • Published • 142 -
A Survey of Reinforcement Learning for Large Reasoning Models
Paper • 2509.08827 • Published • 193 -
Architecture Decoupling Is Not All You Need For Unified Multimodal Model
Paper • 2511.22663 • Published • 29
-
Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning
Paper • 2508.08221 • Published • 50 -
Don't Overthink It: A Survey of Efficient R1-style Large Reasoning Models
Paper • 2508.02120 • Published • 20 -
Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers
Paper • 2506.23918 • Published • 90 -
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper • 2509.02547 • Published • 238
-
Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy
Paper • 2507.01352 • Published • 60 -
A Data-Centric Framework for Addressing Phonetic and Prosodic Challenges in Russian Speech Generative Models
Paper • 2507.13563 • Published • 53 -
Scaling Laws for Optimal Data Mixtures
Paper • 2507.09404 • Published • 38 -
Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation
Paper • 2511.14993 • Published • 233
-
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning
Paper • 2505.24726 • Published • 282 -
Reinforcement Pre-Training
Paper • 2506.08007 • Published • 265 -
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
Paper • 2507.01006 • Published • 254 -
A Survey of Context Engineering for Large Language Models
Paper • 2507.13334 • Published • 263
-
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper • 2509.02547 • Published • 238 -
A Survey of Reinforcement Learning for Large Reasoning Models
Paper • 2509.08827 • Published • 193 -
A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers
Paper • 2508.21148 • Published • 142
-
A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers
Paper • 2508.21148 • Published • 142 -
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
Paper • 2504.10479 • Published • 308 -
Fara-7B: An Efficient Agentic Model for Computer Use
Paper • 2511.19663 • Published • 17
-
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper • 2402.17764 • Published • 628 -
MiniMax-01: Scaling Foundation Models with Lightning Attention
Paper • 2501.08313 • Published • 302 -
Group Sequence Policy Optimization
Paper • 2507.18071 • Published • 320 -
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth
Paper • 2509.03867 • Published • 213
-
A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers
Paper • 2508.21148 • Published • 142 -
A Survey of Reinforcement Learning for Large Reasoning Models
Paper • 2509.08827 • Published • 193 -
Architecture Decoupling Is Not All You Need For Unified Multimodal Model
Paper • 2511.22663 • Published • 29
-
Describe What You See with Multimodal Large Language Models to Enhance Video Recommendations
Paper • 2508.09789 • Published • 5 -
MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents
Paper • 2508.13186 • Published • 20 -
ZARA: Zero-shot Motion Time-Series Analysis via Knowledge and Retrieval Driven LLM Agents
Paper • 2508.04038 • Published • 1 -
Prompt Orchestration Markup Language
Paper • 2508.13948 • Published • 48
-
Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning
Paper • 2508.08221 • Published • 50 -
Don't Overthink It: A Survey of Efficient R1-style Large Reasoning Models
Paper • 2508.02120 • Published • 20 -
Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers
Paper • 2506.23918 • Published • 90 -
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper • 2509.02547 • Published • 238