qwen35-4b-oblit-dense-gguf

GGUF release of the baked dense-search candidate based on Qwen/Qwen3.5-4B.

This folder contains:

  • a full-fidelity text GGUF
  • a practical spread of text GGUF quants
  • the matching multimodal projector for image support

Files

  • qwen3.5-4b-obliterate-dense-search-step1-no_decoder_input.f16.gguf
  • qwen3.5-4b-obliterate-dense-search-step1-no_decoder_input.Q8_0.gguf
  • qwen3.5-4b-obliterate-dense-search-step1-no_decoder_input.Q6_K.gguf
  • qwen3.5-4b-obliterate-dense-search-step1-no_decoder_input.Q5_K_M.gguf
  • qwen3.5-4b-obliterate-dense-search-step1-no_decoder_input.Q5_K_S.gguf
  • qwen3.5-4b-obliterate-dense-search-step1-no_decoder_input.Q4_K_M.gguf
  • qwen3.5-4b-obliterate-dense-search-step1-no_decoder_input.Q4_K_S.gguf
  • mmproj-Qwen3.5-4B-F16.gguf

Recommended defaults:

  • text-only: qwen3.5-4b-obliterate-dense-search-step1-no_decoder_input.Q4_K_M.gguf
  • multimodal: qwen3.5-4b-obliterate-dense-search-step1-no_decoder_input.Q4_K_M.gguf with mmproj-Qwen3.5-4B-F16.gguf

Quant guide:

  • F16: highest fidelity, largest file
  • Q8_0: near-lossless quant
  • Q6_K: high-quality smaller quant
  • Q5_K_M: balanced quality/size
  • Q5_K_S: slightly smaller than Q5_K_M
  • Q4_K_M: practical default
  • Q4_K_S: smallest file in this release
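If this release is hosted on the Hugging Face Hub, a single quant can be fetched without cloning the whole repo. A minimal sketch, assuming the `huggingface-cli` tool is installed and using the repo id shown on this release's model page:

```shell
# Fetch only the Q4_K_M quant from the Hub into the current directory.
# Repo id taken from this release's model page; adjust if yours differs.
huggingface-cli download amkkk/Qwen3.5-4B-abliterated-aggressive-GGUF \
  qwen3.5-4b-obliterate-dense-search-step1-no_decoder_input.Q4_K_M.gguf \
  --local-dir .
```

Add mmproj-Qwen3.5-4B-F16.gguf as a second filename argument if you also want multimodal support.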

Source candidate

  • Baked checkpoint: ../qwen35-4b-oblit-dense-baked
  • Base model: Qwen/Qwen3.5-4B
  • Dense-search edit: layer 20, strength 1.50, targets attention, embeddings, linear_attention, mlp

Evaluation snapshot

  • Harmful refusals: 128/128 -> 11/128
  • Harmless refusals: 9/131 -> 5/131
  • Harmful refusals (Minos): 128/128 -> 28/128
  • Harmless refusals (Minos): 3/131 -> 4/131
  • Harmful KL mean: 5.98e-4
  • Harmless KL mean: 4.59e-5

Text-Only Usage

llama-cli \
  -m qwen3.5-4b-obliterate-dense-search-step1-no_decoder_input.Q4_K_M.gguf \
  -p "Explain overfitting in plain English."

Image Usage

llama-mtmd-cli \
  -m qwen3.5-4b-obliterate-dense-search-step1-no_decoder_input.Q4_K_M.gguf \
  --mmproj mmproj-Qwen3.5-4B-F16.gguf \
  --image example.jpg \
  -p "Describe this image."

Server Usage

llama-server \
  -m qwen3.5-4b-obliterate-dense-search-step1-no_decoder_input.Q4_K_M.gguf \
  --mmproj mmproj-Qwen3.5-4B-F16.gguf
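Once the server is up, it can be queried over its OpenAI-compatible HTTP API. A minimal sketch, assuming the llama-server defaults of host localhost and port 8080 (adjust --host/--port on the server if you changed them):

```shell
# Send a chat request to a running llama-server instance.
# Assumes the server was started as shown above on localhost:8080.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Explain overfitting in plain English."}
    ],
    "temperature": 0.7,
    "max_tokens": 256
  }'
```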

Notes

  • The mmproj-Qwen3.5-4B-F16.gguf file is the matching projector for multimodal use.
  • If you only want text inference, the projector is not required.
  • The Q4_K_M file is the practical default. Q6_K and Q8_0 are there if you want to trade space for fidelity.
  • The F16 file is the highest-fidelity GGUF export in this folder.