FLAME: Factuality-Aware Alignment for Large Language Models • Paper 2405.01525 • Published May 2, 2024
Learning to Align, Aligning to Learn: A Unified Approach for Self-Optimized Alignment • Paper 2508.07750 • Published Aug 11, 2025
Towards Scalable Automated Alignment of LLMs: A Survey • Paper 2406.01252 • Published Jun 3, 2024
Alignment Is Not Sufficient to Prevent Large Language Models from Generating Harmful Information: A Psychoanalytic Perspective • Paper 2311.08487 • Published Nov 14, 2023
AI and Safety Collection • We have published in several top NLP/AI conferences, including ACL, EMNLP, AAAI, and ICWSM • 11 items • Updated Oct 16, 2025