[Question] Best uncensoring approach for MoE models like Qwen3.5-35B-A3B?
#14
by Fenrenaf446
I'm exploring uncensored variants of the Qwen3.5-35B-A3B and noticed it uses a MoE (Mixture-of-Experts) architecture with sparse activation (~3B active params out of 35B total). Since traditional uncensoring techniques like abliteration are typically designed for dense models, I'm curious: what approach works best for MoE architectures? Does the sparse routing mechanism require specialized methods (e.g., router-aware fine-tuning) compared to standard dense models?
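For context on what abliteration does on a dense model: it projects a learned "refusal direction" r out of the output space of selected weight matrices, i.e. W' = (I - r rᵀ) W, so the layer can no longer write along r. A minimal pure-Python sketch with toy dimensions (real pipelines operate on per-layer torch tensors; the names here are illustrative, not from any actual library):

```python
# Toy sketch of abliteration on a single dense weight matrix:
# remove the component of W's output space along a "refusal direction" r,
# i.e. W' = (I - r r^T) W, so that r^T (W' x) = 0 for any input x.

def normalize(v):
    n = sum(x * x for x in v) ** 0.5
    return [x / n for x in v]

def ablate(W, r):
    """Project direction r out of the rows (output space) of W."""
    r = normalize(r)
    rows, cols = len(W), len(W[0])
    # r^T W: one projection coefficient per column of W
    coeffs = [sum(r[i] * W[i][j] for i in range(rows)) for j in range(cols)]
    return [[W[i][j] - r[i] * coeffs[j] for j in range(cols)]
            for i in range(rows)]

# Example: 3x2 weight matrix, hypothetical refusal direction (1, 1, 0)
W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
r = [1.0, 1.0, 0.0]
W_ablated = ablate(W, r)

# Sanity check: the ablated matrix has no output component along r
r_hat = normalize(r)
residual = [sum(r_hat[i] * W_ablated[i][j] for i in range(3)) for j in range(2)]
```

The MoE question then reduces to *which* matrices get this treatment: with sparse routing, each expert's `down_proj` is only active for some tokens, so the refusal direction may need to be estimated and removed per-expert (and possibly in the shared attention `o_proj`), rather than once per layer as in a dense model.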
Only the o_proj, out_proj, and down_proj weights were modified; you can diff them against the base model to verify.
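One way to check that claim is to diff the two checkpoints' state dicts and list which parameters actually changed. A small sketch with toy data standing in for the real `state_dict()` tensors (the parameter names below are assumptions modeled on typical Qwen-style naming, not read from the actual checkpoint):

```python
# Hypothetical checker: confirm only o_proj / out_proj / down_proj
# weights differ between an original and a modified checkpoint.
# Toy lists-of-lists stand in for real torch tensors from state_dict().

MODIFIED_SUFFIXES = ("o_proj", "out_proj", "down_proj")

def frob_diff(a, b):
    """Frobenius norm of the difference of two matrices (lists of lists)."""
    return sum((x - y) ** 2
               for ra, rb in zip(a, b)
               for x, y in zip(ra, rb)) ** 0.5

def changed_tensors(sd_orig, sd_modified, tol=1e-8):
    """Names of parameters whose weights actually differ."""
    return sorted(name for name in sd_orig
                  if frob_diff(sd_orig[name], sd_modified[name]) > tol)

# Toy "state dicts": only the down_proj weight differs
sd_a = {"layers.0.mlp.down_proj.weight": [[1.0, 2.0]],
        "layers.0.self_attn.q_proj.weight": [[3.0, 4.0]]}
sd_b = {"layers.0.mlp.down_proj.weight": [[1.5, 2.0]],
        "layers.0.self_attn.q_proj.weight": [[3.0, 4.0]]}

diff = changed_tensors(sd_a, sd_b)
only_expected = all(any(s in n for s in MODIFIED_SUFFIXES) for n in diff)
```

On real checkpoints you would load both with `safetensors` or `torch.load` and run the same comparison per tensor, ideally shard by shard to keep memory bounded.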
Go try llmfan46/Qwen3.5-35B-A3B-heretic-v1
brayniac/Qwen3.5-35B-A3B-heretic
This new technique, heretic, works much better.