Models
Datasets
Spaces
Buckets new
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2602.06717

WTF GENIUS PAPERS

Papers that made me appreciate my major and my life a little more. obs=Observation, innov=Innovation. Most papers are abt improving tiny models.

Diffusion Language Models Know the Answer Before Decoding

Paper • 2508.19982 • Published Aug 27, 2025 • 27
ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding

Paper • 2512.13586 • Published Dec 15, 2025 • 93
LSRIF: Logic-Structured Reinforcement Learning for Instruction Following

Paper • 2601.06431 • Published Jan 10 • 12
Distribution-Aligned Sequence Distillation for Superior Long-CoT Reasoning

Paper • 2601.09088 • Published Jan 14 • 63

F-GRPO: Don't Let Your Policy Learn the Obvious and Forget the Rare

Paper • 2602.06717 • Published Feb 6 • 74

Endless Terminals: Scaling RL Environments for Terminal Agents

Paper • 2601.16443 • Published Jan 23 • 18
Linear representations in language models can change dramatically over a conversation

Paper • 2601.20834 • Published Jan 28 • 21
Scaling Embeddings Outperforms Scaling Experts in Language Models

Paper • 2601.21204 • Published Jan 29 • 102
Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability

Paper • 2601.18778 • Published Jan 26 • 42

DSDR: Dual-Scale Diversity Regularization for Exploration in LLM Reasoning

Paper • 2602.19895 • Published Feb 23 • 14
SetPO: Set-Level Policy Optimization for Diversity-Preserving LLM Reasoning

Paper • 2602.01062 • Published Feb 1
F-GRPO: Don't Let Your Policy Learn the Obvious and Forget the Rare

Paper • 2602.06717 • Published Feb 6 • 74
MC-GRPO: Median-Centered Group Relative Policy Optimization for Small-Rollout Reinforcement Learning

Paper • 2601.22582 • Published Jan 30

F-GRPO: Don't Let Your Policy Learn the Obvious and Forget the Rare

Paper • 2602.06717 • Published Feb 6 • 74

WTF GENIUS PAPERS

Papers that made me appreciate my major and my life a little more. obs=Observation, innov=Innovation. Most papers are abt improving tiny models.

Diffusion Language Models Know the Answer Before Decoding

Paper • 2508.19982 • Published Aug 27, 2025 • 27
ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding

Paper • 2512.13586 • Published Dec 15, 2025 • 93
LSRIF: Logic-Structured Reinforcement Learning for Instruction Following

Paper • 2601.06431 • Published Jan 10 • 12
Distribution-Aligned Sequence Distillation for Superior Long-CoT Reasoning

Paper • 2601.09088 • Published Jan 14 • 63

DSDR: Dual-Scale Diversity Regularization for Exploration in LLM Reasoning

Paper • 2602.19895 • Published Feb 23 • 14
SetPO: Set-Level Policy Optimization for Diversity-Preserving LLM Reasoning

Paper • 2602.01062 • Published Feb 1
F-GRPO: Don't Let Your Policy Learn the Obvious and Forget the Rare

Paper • 2602.06717 • Published Feb 6 • 74
MC-GRPO: Median-Centered Group Relative Policy Optimization for Small-Rollout Reinforcement Learning

Paper • 2601.22582 • Published Jan 30

F-GRPO: Don't Let Your Policy Learn the Obvious and Forget the Rare

Paper • 2602.06717 • Published Feb 6 • 74

F-GRPO: Don't Let Your Policy Learn the Obvious and Forget the Rare

Paper • 2602.06717 • Published Feb 6 • 74

Endless Terminals: Scaling RL Environments for Terminal Agents

Paper • 2601.16443 • Published Jan 23 • 18
Linear representations in language models can change dramatically over a conversation

Paper • 2601.20834 • Published Jan 28 • 21
Scaling Embeddings Outperforms Scaling Experts in Language Models

Paper • 2601.21204 • Published Jan 29 • 102
Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability

Paper • 2601.18778 • Published Jan 26 • 42

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs