Training a Student Expert via Semi-Supervised Foundation Model Distillation Paper • 2604.03841 • Published 10 days ago • 10
ViVa: A Video-Generative Value Model for Robot Reinforcement Learning Paper • 2604.08168 • Published 5 days ago • 16
SkillClaw: Let Skills Evolve Collectively with Agentic Evolver Paper • 2604.08377 • Published 5 days ago • 271
Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability Paper • 2604.06628 • Published 6 days ago • 306
CUE-R: Beyond the Final Answer in Retrieval-Augmented Generation Paper • 2604.05467 • Published 7 days ago • 7
Can Natural Image Autoencoders Compactly Tokenize fMRI Volumes for Long-Range Dynamics Modeling? Paper • 2604.03619 • Published 10 days ago • 7
Watch Before You Answer: Learning from Visually Grounded Post-Training Paper • 2604.05117 • Published 8 days ago • 35
Vanast: Virtual Try-On with Human Image Animation via Synthetic Triplet Supervision Paper • 2604.04934 • Published 8 days ago • 42
Beyond Accuracy: Unveiling Inefficiency Patterns in Tool-Integrated Reasoning Paper • 2604.05404 • Published 7 days ago • 41
ThinkTwice: Jointly Optimizing Large Language Models for Reasoning and Self-Refinement Paper • 2604.01591 • Published 12 days ago • 39
GBQA: A Game Benchmark for Evaluating LLMs as Quality Assurance Engineers Paper • 2604.02648 • Published 11 days ago • 45
ACES: Who Tests the Tests? Leave-One-Out AUC Consistency for Code Generation Paper • 2604.03922 • Published 9 days ago • 53
Claw-Eval: Toward Trustworthy Evaluation of Autonomous Agents Paper • 2604.06132 • Published 7 days ago • 113
Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding Paper • 2604.05015 • Published 8 days ago • 231
BidirLM: From Text to Omnimodal Bidirectional Encoders by Adapting and Composing Causal LLMs Paper • 2604.02045 • Published 12 days ago • 33
Synthetic Sandbox for Training Machine Learning Engineering Agents Paper • 2604.04872 • Published 8 days ago • 14
Scaling Teams or Scaling Time? Memory Enabled Lifelong Learning in LLM Multi-Agent Systems Paper • 2604.03295 • Published 18 days ago • 10
Type-Checked Compliance: Deterministic Guardrails for Agentic Financial Systems Using Lean 4 Theorem Proving Paper • 2604.01483 • Published 13 days ago • 7