Trained on a local 3090:
1: 4bit qlora to loosen both censorship/refusals, and thinking tags use. (1 epoch at 7e-5, [sharegpt chatml] using 4chan/reddit and toxic dpo data).
2: Abliterated using a custom fork of heretic.
Model tree for Nitral-Archive/Qwen3.5-27B_Homebrew