Paper: Bypassing Prompt Injection and Jailbreak Detection in LLM Guardrails (2504.11168, published Apr 15, 2025)
Model: rogue-security/prompt-injection-jailbreak-sentinel-v2 (text classification, 0.6B parameters, updated Mar 11)
Model: nvidia/Aegis-AI-Content-Safety-LlamaGuard-Defensive-1.0 (text classification, updated Sep 22, 2025)
Model: nvidia/Aegis-AI-Content-Safety-LlamaGuard-Permissive-1.0 (text classification, updated Sep 22, 2025)
Collection: ShieldGemma, a family of models for text and image content moderation (4 items, updated Mar 12)