# Qwen3-4B-abliterated-f32-GGUFs
Qwen3-4B-abliterated is an experimental, uncensored version of the Qwen/Qwen3-4B language model that explores how refusals and latent fine-tuning work in large language models. It uses a novel "abliteration" technique: a refusal direction is computed by comparing residual-stream activations between harmful and harmless prompts, and that direction is then subtracted from the hidden states of modules such as o_proj, with weight factors distributed across layers. Iterative or accumulated orthogonalization methods are used for efficiency, with the goal of minimizing refusals without degrading output quality.
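The core idea can be sketched in a few lines of NumPy. This is an illustrative toy, not the actual Qwen3-4B pipeline: the hidden size, the random stand-in activations, and the single weight matrix are all made-up assumptions, but the projection step mirrors the description above.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # hypothetical hidden size (toy value, not Qwen3-4B's)

# Mean residual-stream activations for "harmful" vs "harmless" prompts
# (random stand-ins here; in practice these come from real model runs).
mean_harmful = rng.normal(size=d)
mean_harmless = rng.normal(size=d)

# Refusal direction: normalized difference of the two means.
r = mean_harmful - mean_harmless
r /= np.linalg.norm(r)

# Orthogonalize a hidden weight matrix (e.g. a stand-in for o_proj)
# against r by removing the component along r: W' = (I - r r^T) W.
W = rng.normal(size=(d, d))
W_abl = W - np.outer(r, r) @ W

# After orthogonalization, the weights can no longer write along r.
print(np.allclose(r @ W_abl, 0.0))  # → True
```

In the real procedure this projection is applied (iteratively or in accumulated form) across many layers rather than to a single matrix.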
## Model Files
| File name | Size | Quant Type |
|---|---|---|
| Qwen3-4B-abliterated.F32.gguf | 16.1 GB | F32 |
| Qwen3-4B-abliterated.BF16.gguf | 8.05 GB | BF16 |
| Qwen3-4B-abliterated.F16.gguf | 8.05 GB | F16 |
| Qwen3-4B-abliterated.Q8_0.gguf | 4.28 GB | Q8_0 |
| Qwen3-4B-abliterated.Q6_K.gguf | 3.31 GB | Q6_K |
| Qwen3-4B-abliterated.Q5_K_M.gguf | 2.89 GB | Q5_K_M |
| Qwen3-4B-abliterated.Q5_K_S.gguf | 2.82 GB | Q5_K_S |
| Qwen3-4B-abliterated.Q4_K_M.gguf | 2.5 GB | Q4_K_M |
| Qwen3-4B-abliterated.Q4_K_S.gguf | 2.38 GB | Q4_K_S |
| Qwen3-4B-abliterated.Q3_K_L.gguf | 2.24 GB | Q3_K_L |
| Qwen3-4B-abliterated.Q3_K_M.gguf | 2.08 GB | Q3_K_M |
| Qwen3-4B-abliterated.Q3_K_S.gguf | 1.89 GB | Q3_K_S |
| Qwen3-4B-abliterated.Q2_K.gguf | 1.67 GB | Q2_K |
## Quants Usage
(Sorted by size, which does not necessarily reflect quality. IQ-quants are often preferable to similarly sized non-IQ quants.)
Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):
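As a rough aid for choosing among the files in the table above, the sketch below picks the largest quant whose file size (with some headroom) fits a given memory budget. The sizes are copied from the table; the 1.25× headroom factor is a hypothetical rule of thumb, not an official recommendation.

```python
# (file suffix, size in GB), copied from the model-files table, largest first.
QUANTS = [
    ("F32", 16.1), ("BF16", 8.05), ("F16", 8.05), ("Q8_0", 4.28),
    ("Q6_K", 3.31), ("Q5_K_M", 2.89), ("Q5_K_S", 2.82), ("Q4_K_M", 2.5),
    ("Q4_K_S", 2.38), ("Q3_K_L", 2.24), ("Q3_K_M", 2.08),
    ("Q3_K_S", 1.89), ("Q2_K", 1.67),
]

def pick_quant(mem_gb: float, headroom: float = 1.25) -> str:
    """Return the largest quant whose size * headroom fits in mem_gb."""
    for name, size in QUANTS:
        if size * headroom <= mem_gb:
            return name
    return QUANTS[-1][0]  # fall back to the smallest quant

print(pick_quant(4.0))   # → Q5_K_M
print(pick_quant(16.0))  # → BF16
```

Actual memory use also depends on context length and KV-cache settings, so treat this only as a starting point.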
