Zhiyuan Xu

zhiyuan16bristol

AI & ML interests

None yet

Recent Activity

authored a paper about 9 hours ago

Steering in the Shadows: Causal Amplification for Activation Space Attacks in Large Language Models

authored a paper about 9 hours ago

RouteHijack: Routing-Aware Attack on Mixture-of-Experts LLMs

authored a paper about 9 hours ago

The dark deep side of DeepSeek: Fine-tuning attacks against the safety alignment of CoT-enabled models

Organizations

None yet

zhiyuan16bristol's datasets

None public yet