TAB-PO: Preference Optimization with a Token-Level Adaptive Barrier for Token-Critical Structured Generation
Paper • 2603.00025 • Published