Croc-Prog-HF's Collections
Bias, Misalignment, and AI Safety
Updated Mar 10
Human Values Alignment, Jailbreaking Prevention, Bias Mitigation
hendrycks/ethics • Viewer • Updated Apr 19, 2023 • 134k • 1.21k • 28
Frontier-AI-Research/MORALISE • Viewer • Updated Oct 27, 2025 • 2.57k • 1.9k
Stereotypes-in-LLMs/UAlign • Viewer • Updated May 31, 2025 • 5.38k • 64
PKU-Alignment/PKU-SafeRLHF • Viewer • Updated Oct 18, 2024 • 164k • 15k • 182
allenai/wildjailbreak • Viewer • Updated Aug 8, 2024 • 2.21k • 9.36k • 129
usail-hkust/JailJudge • Preview • Updated Nov 20, 2024 • 48 • 3
gretelai/gretel-safety-alignment-en-v1 • Viewer • Updated Dec 17, 2025 • 16.7k • 274 • 22
fwnlp/self-instruct-safety-alignment • Viewer • Updated Oct 23, 2024 • 12k • 75 • 3
ai-safety-institute/AgentHarm • Viewer • Updated Dec 19, 2024 • 468 • 5.84k • 55
Anthropic/discrim-eval • Viewer • Updated Jan 5, 2024 • 18.9k • 581 • 55