Mind the Generation Process: Fine-Grained Confidence Estimation During LLM Generation • Paper • 2508.12040 • Published Aug 16, 2025 • 14
A Stitch in Time Saves Nine: Proactive Self-Refinement for Language Models • Paper • 2508.12903 • Published Aug 18, 2025 • 11
On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification • Paper • 2508.05629 • Published Aug 7, 2025 • 191