PKU-Alignment/ProgressGym-HistLlama3-8B-C014-pretrain-v0.2 • Text Generation • 8B • Updated Aug 10, 2024 • 3
PKU-Alignment/ProgressGym-HistLlama3-8B-C013-instruct-v0.2 • Text Generation • 8B • Updated Aug 10, 2024 • 5
PKU-Alignment/ProgressGym-HistLlama3-8B-C013-pretrain-v0.2 • Text Generation • 8B • Updated Aug 10, 2024 • 1
PKU-Alignment/beaver-7b-unified-reward • Reinforcement Learning • 7B • Updated Apr 20, 2024 • 565
PKU-Alignment/beaver-7b-unified-cost • Reinforcement Learning • 7B • Updated Apr 20, 2024 • 659 • 2
PKU-Alignment/beaver-7b-v1.0-reward • Reinforcement Learning • 7B • Updated Apr 20, 2024 • 573 • 17
PKU-Alignment/beaver-7b-v1.0-cost • Reinforcement Learning • 7B • Updated Apr 20, 2024 • 1.22k • 10