Article: Illustrating Reinforcement Learning from Human Feedback (RLHF) • Dec 9, 2022
Ouro Collection: a family of pre-trained Looped Language Models • 4 items • Updated Oct 29, 2025
Open Character Training Collection: https://arxiv.org/abs/2511.01689 • 8 items • Updated Nov 4, 2025
Alignment Pretraining (Geodesic, 2025): Data & Models Collection: https://alignmentpretraining.ai — read the paper for additional details about the data and models • 5 items • Updated Jan 16