qwen35-4b-oblit-dense-gguf

GGUF release of the baked dense-search candidate based on Qwen/Qwen3.5-4B.

This folder contains:

  • a full-fidelity text GGUF
  • a practical spread of text GGUF quants
  • the matching multimodal projector for image support

Files

  • qwen3.5-4b-obliterate-dense-search-step1-no_decoder_input.f16.gguf
  • qwen3.5-4b-obliterate-dense-search-step1-no_decoder_input.Q8_0.gguf
  • qwen3.5-4b-obliterate-dense-search-step1-no_decoder_input.Q6_K.gguf
  • qwen3.5-4b-obliterate-dense-search-step1-no_decoder_input.Q5_K_M.gguf
  • qwen3.5-4b-obliterate-dense-search-step1-no_decoder_input.Q5_K_S.gguf
  • qwen3.5-4b-obliterate-dense-search-step1-no_decoder_input.Q4_K_M.gguf
  • qwen3.5-4b-obliterate-dense-search-step1-no_decoder_input.Q4_K_S.gguf
  • mmproj-Qwen3.5-4B-F16.gguf

Recommended defaults:

  • text-only: qwen3.5-4b-obliterate-dense-search-step1-no_decoder_input.Q4_K_M.gguf
  • multimodal: qwen3.5-4b-obliterate-dense-search-step1-no_decoder_input.Q4_K_M.gguf with mmproj-Qwen3.5-4B-F16.gguf

Quant guide:

  • F16: highest fidelity, largest file
  • Q8_0: near-lossless quant
  • Q6_K: high-quality smaller quant
  • Q5_K_M: balanced quality/size
  • Q5_K_S: slightly smaller than Q5_K_M
  • Q4_K_M: practical default
  • Q4_K_S: smallest file in this release
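If this release is hosted on the Hugging Face Hub, a single quant can be fetched without cloning the whole repo. A minimal sketch, assuming the `huggingface-cli` tool is installed and using the repo id shown on this release's model page:

```shell
# Fetch only the Q4_K_M quant from the Hub into the current directory.
# Repo id taken from this release's model page; adjust if yours differs.
huggingface-cli download amkkk/Qwen3.5-4B-abliterated-aggressive-GGUF \
  qwen3.5-4b-obliterate-dense-search-step1-no_decoder_input.Q4_K_M.gguf \
  --local-dir .
```

Add mmproj-Qwen3.5-4B-F16.gguf as a second filename argument if you also want multimodal support.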

Source candidate

  • Baked checkpoint: ../qwen35-4b-oblit-dense-baked
  • Base model: Qwen/Qwen3.5-4B
  • Dense-search edit: layer 20, strength 1.50, targets attention, embeddings, linear_attention, mlp

Evaluation snapshot

  • Harmful refusals: 128/128 -> 11/128
  • Harmless refusals: 9/131 -> 5/131
  • Harmful refusals (Minos): 128/128 -> 28/128
  • Harmless refusals (Minos): 3/131 -> 4/131
  • Harmful KL mean: 5.98e-4
  • Harmless KL mean: 4.59e-5

Text-Only Usage

llama-cli \
  -m qwen3.5-4b-obliterate-dense-search-step1-no_decoder_input.Q4_K_M.gguf \
  -p "Explain overfitting in plain English."

Image Usage

llama-mtmd-cli \
  -m qwen3.5-4b-obliterate-dense-search-step1-no_decoder_input.Q4_K_M.gguf \
  --mmproj mmproj-Qwen3.5-4B-F16.gguf \
  --image example.jpg \
  -p "Describe this image."

Server Usage

llama-server \
  -m qwen3.5-4b-obliterate-dense-search-step1-no_decoder_input.Q4_K_M.gguf \
  --mmproj mmproj-Qwen3.5-4B-F16.gguf
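Once the server is up, it can be queried over its OpenAI-compatible HTTP API. A minimal sketch, assuming the llama-server defaults of host localhost and port 8080 (adjust --host/--port on the server if you changed them):

```shell
# Send a chat request to a running llama-server instance.
# Assumes the server was started as shown above on localhost:8080.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Explain overfitting in plain English."}
    ],
    "temperature": 0.7,
    "max_tokens": 256
  }'
```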

Notes

  • The mmproj-Qwen3.5-4B-F16.gguf file is the matching projector for multimodal use.
  • If you only want text inference, the projector is not required.
  • The Q4_K_M file is the practical default. Q6_K and Q8_0 are there if you want to trade space for fidelity.
  • The F16 file is the highest-fidelity GGUF export in this folder.