Models
Datasets
Spaces
Buckets new
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2509.19284

Large Reasoning Models Learn Better Alignment from Flawed Thinking

Paper • 2510.00938 • Published Oct 1, 2025 • 60
What Characterizes Effective Reasoning? Revisiting Length, Review, and Structure of CoT

Paper • 2509.19284 • Published Sep 23, 2025 • 23
Learning to Reason as Action Abstractions with Scalable Mid-Training RL

Paper • 2509.25810 • Published Sep 30, 2025 • 6
Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9, 2025 • 277

Reasoning Evaluation

What Characterizes Effective Reasoning? Revisiting Length, Review, and Structure of CoT

Paper • 2509.19284 • Published Sep 23, 2025 • 23

A Tale of Tails: Model Collapse as a Change of Scaling Laws

Paper • 2402.07043 • Published Feb 10, 2024 • 15
What Characterizes Effective Reasoning? Revisiting Length, Review, and Structure of CoT

Paper • 2509.19284 • Published Sep 23, 2025 • 23
OnePiece: Bringing Context Engineering and Reasoning to Industrial Cascade Ranking System

Paper • 2509.18091 • Published Sep 22, 2025 • 34
Strategic Dishonesty Can Undermine AI Safety Evaluations of Frontier LLM

Paper • 2509.18058 • Published Sep 22, 2025 • 12

What Characterizes Effective Reasoning? Revisiting Length, Review, and Structure of CoT

Paper • 2509.19284 • Published Sep 23, 2025 • 23
OnePiece: Bringing Context Engineering and Reasoning to Industrial Cascade Ranking System

Paper • 2509.18091 • Published Sep 22, 2025 • 34

Describe What You See with Multimodal Large Language Models to Enhance Video Recommendations

Paper • 2508.09789 • Published Aug 13, 2025 • 5
MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents

Paper • 2508.13186 • Published Aug 14, 2025 • 20
ZARA: Zero-shot Motion Time-Series Analysis via Knowledge and Retrieval Driven LLM Agents

Paper • 2508.04038 • Published Aug 6, 2025 • 1
Prompt Orchestration Markup Language

Paper • 2508.13948 • Published Aug 19, 2025 • 48

What Characterizes Effective Reasoning? Revisiting Length, Review, and Structure of CoT

Paper • 2509.19284 • Published Sep 23, 2025 • 23
Soft Tokens, Hard Truths

Paper • 2509.19170 • Published Sep 23, 2025 • 16
CompLLM: Compression for Long Context Q&A

Paper • 2509.19228 • Published Sep 23, 2025 • 10
Test-Time Scaling in Reasoning Models Is Not Effective for Knowledge-Intensive Tasks Yet

Paper • 2509.06861 • Published Sep 8, 2025 • 9

What Characterizes Effective Reasoning? Revisiting Length, Review, and Structure of CoT

Paper • 2509.19284 • Published Sep 23, 2025 • 23
Verbalized Sampling: How to Mitigate Mode Collapse and Unlock LLM Diversity

Paper • 2510.01171 • Published Oct 1, 2025 • 19

Interesting reads

What Characterizes Effective Reasoning? Revisiting Length, Review, and Structure of CoT

Paper • 2509.19284 • Published Sep 23, 2025 • 23

Open Data Synthesis For Deep Research

Paper • 2509.00375 • Published Aug 30, 2025 • 72
Beyond Correctness: Harmonizing Process and Outcome Rewards through RL Training

Paper • 2509.03403 • Published Sep 3, 2025 • 23
LMEnt: A Suite for Analyzing Knowledge in Language Models from Pretraining Data to Representations

Paper • 2509.03405 • Published Sep 3, 2025 • 24
SATQuest: A Verifier for Logical Reasoning Evaluation and Reinforcement Fine-Tuning of LLMs

Paper • 2509.00930 • Published Aug 31, 2025 • 5

Snowflake/Arctic-Text2SQL-R1-7B

8B • Updated May 29, 2025 • 8.12k • 70
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning

Paper • 2505.24726 • Published May 30, 2025 • 282
Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9, 2025 • 265
Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights

Paper • 2506.16406 • Published Jun 19, 2025 • 133

Large Reasoning Models Learn Better Alignment from Flawed Thinking

Paper • 2510.00938 • Published Oct 1, 2025 • 60
What Characterizes Effective Reasoning? Revisiting Length, Review, and Structure of CoT

Paper • 2509.19284 • Published Sep 23, 2025 • 23
Learning to Reason as Action Abstractions with Scalable Mid-Training RL

Paper • 2509.25810 • Published Sep 30, 2025 • 6
Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9, 2025 • 277

What Characterizes Effective Reasoning? Revisiting Length, Review, and Structure of CoT

Paper • 2509.19284 • Published Sep 23, 2025 • 23
Soft Tokens, Hard Truths

Paper • 2509.19170 • Published Sep 23, 2025 • 16
CompLLM: Compression for Long Context Q&A

Paper • 2509.19228 • Published Sep 23, 2025 • 10
Test-Time Scaling in Reasoning Models Is Not Effective for Knowledge-Intensive Tasks Yet

Paper • 2509.06861 • Published Sep 8, 2025 • 9

Reasoning Evaluation

What Characterizes Effective Reasoning? Revisiting Length, Review, and Structure of CoT

Paper • 2509.19284 • Published Sep 23, 2025 • 23

What Characterizes Effective Reasoning? Revisiting Length, Review, and Structure of CoT

Paper • 2509.19284 • Published Sep 23, 2025 • 23
Verbalized Sampling: How to Mitigate Mode Collapse and Unlock LLM Diversity

Paper • 2510.01171 • Published Oct 1, 2025 • 19

A Tale of Tails: Model Collapse as a Change of Scaling Laws

Paper • 2402.07043 • Published Feb 10, 2024 • 15
What Characterizes Effective Reasoning? Revisiting Length, Review, and Structure of CoT

Paper • 2509.19284 • Published Sep 23, 2025 • 23
OnePiece: Bringing Context Engineering and Reasoning to Industrial Cascade Ranking System

Paper • 2509.18091 • Published Sep 22, 2025 • 34
Strategic Dishonesty Can Undermine AI Safety Evaluations of Frontier LLM

Paper • 2509.18058 • Published Sep 22, 2025 • 12

Interesting reads

What Characterizes Effective Reasoning? Revisiting Length, Review, and Structure of CoT

Paper • 2509.19284 • Published Sep 23, 2025 • 23

What Characterizes Effective Reasoning? Revisiting Length, Review, and Structure of CoT

Paper • 2509.19284 • Published Sep 23, 2025 • 23
OnePiece: Bringing Context Engineering and Reasoning to Industrial Cascade Ranking System

Paper • 2509.18091 • Published Sep 22, 2025 • 34

Open Data Synthesis For Deep Research

Paper • 2509.00375 • Published Aug 30, 2025 • 72
Beyond Correctness: Harmonizing Process and Outcome Rewards through RL Training

Paper • 2509.03403 • Published Sep 3, 2025 • 23
LMEnt: A Suite for Analyzing Knowledge in Language Models from Pretraining Data to Representations

Paper • 2509.03405 • Published Sep 3, 2025 • 24
SATQuest: A Verifier for Logical Reasoning Evaluation and Reinforcement Fine-Tuning of LLMs

Paper • 2509.00930 • Published Aug 31, 2025 • 5

Describe What You See with Multimodal Large Language Models to Enhance Video Recommendations

Paper • 2508.09789 • Published Aug 13, 2025 • 5
MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents

Paper • 2508.13186 • Published Aug 14, 2025 • 20
ZARA: Zero-shot Motion Time-Series Analysis via Knowledge and Retrieval Driven LLM Agents

Paper • 2508.04038 • Published Aug 6, 2025 • 1
Prompt Orchestration Markup Language

Paper • 2508.13948 • Published Aug 19, 2025 • 48

Snowflake/Arctic-Text2SQL-R1-7B

8B • Updated May 29, 2025 • 8.12k • 70
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning

Paper • 2505.24726 • Published May 30, 2025 • 282
Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9, 2025 • 265
Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights

Paper • 2506.16406 • Published Jun 19, 2025 • 133

Previous
1
2
Next

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs