Zhiyuan Xu

zhiyuan16bristol

AI & ML interests

None yet

Recent Activity

authored a paper about 8 hours ago

Steering in the Shadows: Causal Amplification for Activation Space Attacks in Large Language Models

authored a paper about 8 hours ago

RouteHijack: Routing-Aware Attack on Mixture-of-Experts LLMs

authored a paper about 8 hours ago

The dark deep side of DeepSeek: Fine-tuning attacks against the safety alignment of CoT-enabled models

Organizations

None yet

authored 3 papers about 8 hours ago

Steering in the Shadows: Causal Amplification for Activation Space Attacks in Large Language Models

Paper • 2511.17194 • Published Nov 21, 2025 • 1

RouteHijack: Routing-Aware Attack on Mixture-of-Experts LLMs

Paper • 2605.02946 • Published 7 days ago • 1

The dark deep side of DeepSeek: Fine-tuning attacks against the safety alignment of CoT-enabled models

Paper • 2502.01225 • Published Feb 3, 2025 • 1