Act Wisely: Cultivating Meta-Cognitive Tool Use in Agentic Multimodal Models Paper • 2604.08545 • Published 8 days ago • 41
Hybrid Reward Normalization for Process-supervised Non-verifiable Agentic Tasks Paper • 2509.25598 • Published Sep 29, 2025 • 2