Meta-Harness: End-to-End Optimization of Model Harnesses Paper • 2603.28052 • Published 16 days ago • 18
Flipping the Dialogue: Training and Evaluating User Language Models Paper • 2510.06552 • Published Oct 8, 2025 • 1
REASONING GYM: Reasoning Environments for Reinforcement Learning with Verifiable Rewards Paper • 2505.24760 • Published May 30, 2025 • 74
Implicit Actor Critic Coupling via a Supervised Learning Framework for RLVR Paper • 2509.02522 • Published Sep 2, 2025 • 25 • 6
Reasoning Vectors: Transferring Chain-of-Thought Capabilities via Task Arithmetic Paper • 2509.01363 • Published Sep 1, 2025 • 61