Cornell-AGI's Collections REBEL: Reinforcement Learning via Regressing Relative Rewards
REBEL: Reinforcement Learning via Regressing Relative Rewards
Paper
• 2404.16767
• Published • 2
Cornell-AGI/REBEL-Llama-3-Armo-iter_1
8B • Updated • 2
• 1
Cornell-AGI/REBEL-Llama-3-Armo-iter_2
8B • Updated • 2
• 1
Cornell-AGI/REBEL-Llama-3-Armo-iter_3
8B • Updated • 3
• 2
Cornell-AGI/Ultrafeedback-Llama-3-Armo-iter_1
Viewer
• Updated • 56.1k • 7
Cornell-AGI/Ultrafeedback-Llama-3-Armo-iter_2
Viewer
• Updated • 55.1k • 7
Cornell-AGI/Ultrafeedback-Llama-3-Armo-iter_3
Viewer
• Updated • 44.6k • 7
• 1
Cornell-AGI/REBEL-Llama-3
Text Generation
• Updated • 6
• 1
Cornell-AGI/REBEL-Llama-3-epoch_2
Text Generation
• Updated • 12
• 3
Cornell-AGI/REBEL-OpenChat-3.5
Text Generation
• Updated • 14
• 1