Gemma 4 E4B Abliterated GGUF (4-bit)

Model Description

This repository contains the Gemma 4 E4B model after "abliteration", a process that removes refusal vectors from the weights while preserving the model's core intelligence. This version is intended for research and creative use cases where the default safety refusals are undesirable.

Abliteration Results

  • Method: Norm-preserving biprojection (orthogonalization); a simplified sketch follows this list.
  • Target Layers: Layers 0-41 (Independent layer targeting for maximum stability).
  • Initial Refusal Rate: ~100/100 (Standard Google alignment).
  • Final Refusal Rate: 3/100 (Highly compliant).
  • KL Divergence: 0.0671 versus the original model (Extremely low, indicating the base model's capabilities are largely preserved).
  • Technique: Expert-Granular Abliteration (EGA), enabled via a patched heretic-llm.
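
For readers who want the underlying idea, below is a minimal sketch of directional ablation by orthogonalization. It is not the patched heretic-llm pipeline used for this release: the prompt activations, the layer loop, the helper names, and the simple Frobenius-norm restoration are illustrative assumptions standing in for the actual norm-preserving biprojection.

```python
# Minimal sketch of "abliteration": project a refusal direction out of the
# weights that write into the residual stream. All names and the norm
# restoration below are illustrative, not the heretic-llm implementation.
import torch

def refusal_direction(h_harmful: torch.Tensor, h_harmless: torch.Tensor) -> torch.Tensor:
    """Difference-of-means refusal direction from residual-stream activations
    of shape (num_prompts, d_model), normalized to unit length."""
    d = h_harmful.mean(dim=0) - h_harmless.mean(dim=0)
    return d / d.norm()

def ablate_direction(W: torch.Tensor, d: torch.Tensor) -> torch.Tensor:
    """Apply W' = (I - d d^T) W to remove the component along d from every
    output of W (PyTorch Linear convention: W has shape (d_model, d_in)),
    then rescale so the overall Frobenius norm is unchanged."""
    W_ablated = W - torch.outer(d, d @ W)
    return W_ablated * (W.norm() / W_ablated.norm())

# Usage sketch (assumed objects): collect residual-stream activations for
# harmful vs. harmless prompts, then ablate the projections of layers 0-41.
# d = refusal_direction(harmful_acts, harmless_acts)
# for layer in model.model.layers[:42]:
#     layer.self_attn.o_proj.weight.data = ablate_direction(layer.self_attn.o_proj.weight.data, d)
#     layer.mlp.down_proj.weight.data = ablate_direction(layer.mlp.down_proj.weight.data, d)
```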

Quantization Details

  • Quantization Format: GGUF (q4_k_m)
  • Quantization Method: llama.cpp / Unsloth
  • Precision: 4-bit
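
If you prefer to load the quantized file directly from Python, the following is a minimal sketch using llama-cpp-python and huggingface_hub. The exact GGUF filename inside the repository is an assumption; check the repository's file list before running it.

```python
# Sketch: download the Q4_K_M GGUF and run a chat completion locally.
# The filename below is assumed; verify it against the repo's files.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="DuoNeural/Gemma-4-E4B-Abliterated-GGUF",
    filename="gemma-4-e4b-abliterated-q4_k_m.gguf",  # assumed filename
)

llm = Llama(model_path=model_path, n_ctx=4096)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize what abliteration does."}]
)
print(out["choices"][0]["message"]["content"])
```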

Use with Ollama

ollama run hf.co/DuoNeural/Gemma-4-E4B-Abliterated-GGUF
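
Once Ollama has pulled the model, the local server can also be called programmatically. Below is a minimal sketch using the ollama Python package; the prompt and dictionary-style response access are illustrative.

```python
# Sketch: query the locally served model through the ollama Python client
# (pip install ollama). Assumes `ollama run` has already fetched the model.
import ollama

response = ollama.chat(
    model="hf.co/DuoNeural/Gemma-4-E4B-Abliterated-GGUF",
    messages=[{"role": "user", "content": "Explain KL divergence in one paragraph."}],
)
print(response["message"]["content"])
```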

Use with LM Studio

  1. Open LM Studio.
  2. Search for DuoNeural/Gemma-4-E4B-Abliterated-GGUF.
  3. Load the Q4_K_M GGUF.

Architecture

Gemma 4 E4B features 4.5B effective parameters (8B total), optimized for intelligence-per-parameter and edge device deployment.

Disclaimer

This model has had its safety refusals removed. Users are responsible for ensuring the model is used ethically and in accordance with applicable laws.
