# Qwen3-4B-abliterated-f32-GGUFs
Qwen3-4B-abliterated is an experimental, uncensored version of the Qwen/Qwen3-4B language model that explores how refusals and latent fine-tuning work in large language models. It uses a novel "abliteration" technique: a refusal direction is computed by comparing residual-stream activations between harmful and harmless prompts, and that direction is then subtracted from the hidden states of modules such as o_proj, with weight factors distributed across layers. Iterative or accumulated orthogonalization methods are used for efficiency, with the goal of minimizing refusals without degrading output quality.
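The core idea can be sketched in a few lines of NumPy. This is an illustrative toy, not the actual Qwen3-4B pipeline: the hidden size, the random stand-in activations, and the single weight matrix are all made-up assumptions, but the projection step mirrors the description above.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # hypothetical hidden size (toy value, not Qwen3-4B's)

# Mean residual-stream activations for "harmful" vs "harmless" prompts
# (random stand-ins here; in practice these come from real model runs).
mean_harmful = rng.normal(size=d)
mean_harmless = rng.normal(size=d)

# Refusal direction: normalized difference of the two means.
r = mean_harmful - mean_harmless
r /= np.linalg.norm(r)

# Orthogonalize a hidden weight matrix (e.g. a stand-in for o_proj)
# against r by removing the component along r: W' = (I - r r^T) W.
W = rng.normal(size=(d, d))
W_abl = W - np.outer(r, r) @ W

# After orthogonalization, the weights can no longer write along r.
print(np.allclose(r @ W_abl, 0.0))  # → True
```

In the real procedure this projection is applied (iteratively or in accumulated form) across many layers rather than to a single matrix.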
## Model Files
| File name | Size | Quant Type |
|---|---|---|
| Qwen3-4B-abliterated.F32.gguf | 16.1 GB | F32 |
| Qwen3-4B-abliterated.BF16.gguf | 8.05 GB | BF16 |
| Qwen3-4B-abliterated.F16.gguf | 8.05 GB | F16 |
| Qwen3-4B-abliterated.Q8_0.gguf | 4.28 GB | Q8_0 |
| Qwen3-4B-abliterated.Q6_K.gguf | 3.31 GB | Q6_K |
| Qwen3-4B-abliterated.Q5_K_M.gguf | 2.89 GB | Q5_K_M |
| Qwen3-4B-abliterated.Q5_K_S.gguf | 2.82 GB | Q5_K_S |
| Qwen3-4B-abliterated.Q4_K_M.gguf | 2.5 GB | Q4_K_M |
| Qwen3-4B-abliterated.Q4_K_S.gguf | 2.38 GB | Q4_K_S |
| Qwen3-4B-abliterated.Q3_K_L.gguf | 2.24 GB | Q3_K_L |
| Qwen3-4B-abliterated.Q3_K_M.gguf | 2.08 GB | Q3_K_M |
| Qwen3-4B-abliterated.Q3_K_S.gguf | 1.89 GB | Q3_K_S |
| Qwen3-4B-abliterated.Q2_K.gguf | 1.67 GB | Q2_K |
## Quants Usage
(Sorted by size, which does not necessarily reflect quality. IQ-quants are often preferable to similarly sized non-IQ quants.)
Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):
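As a rough aid for choosing among the files in the table above, the sketch below picks the largest quant whose file size (with some headroom) fits a given memory budget. The sizes are copied from the table; the 1.25× headroom factor is a hypothetical rule of thumb, not an official recommendation.

```python
# (file suffix, size in GB), copied from the model-files table, largest first.
QUANTS = [
    ("F32", 16.1), ("BF16", 8.05), ("F16", 8.05), ("Q8_0", 4.28),
    ("Q6_K", 3.31), ("Q5_K_M", 2.89), ("Q5_K_S", 2.82), ("Q4_K_M", 2.5),
    ("Q4_K_S", 2.38), ("Q3_K_L", 2.24), ("Q3_K_M", 2.08),
    ("Q3_K_S", 1.89), ("Q2_K", 1.67),
]

def pick_quant(mem_gb: float, headroom: float = 1.25) -> str:
    """Return the largest quant whose size * headroom fits in mem_gb."""
    for name, size in QUANTS:
        if size * headroom <= mem_gb:
            return name
    return QUANTS[-1][0]  # fall back to the smallest quant

print(pick_quant(4.0))   # → Q5_K_M
print(pick_quant(16.0))  # → BF16
```

Actual memory use also depends on context length and KV-cache settings, so treat this only as a starting point.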
