Shorter but not Worse: Frugal Reasoning via Easy Samples as Length Regularizers in Math RLVR
Paper • 2511.01937
AI models for language, speech and beyond.
YaPO: Learnable Sparse Activation Steering Vectors for Domain Adaptation