ASGuard: Activation-Scaling Guard to Mitigate Targeted Jailbreaking Attack
Paper • 2509.25843 • Published • 18
The Curious Case of Analogies: Investigating Analogical Reasoning in Large Language Models