Gemma 4 E4B Abliterated GGUF (4-bit)

Model Description

This repository contains the Gemma 4 E4B model after "abliteration", a process that removes refusal vectors from the weights while preserving the model's core intelligence. This version is intended for research and creative use cases where the default safety refusals are undesirable.

Abliteration Results

  • Method: Norm-preserving biprojection (orthogonalization); a simplified sketch follows this list.
  • Target Layers: Layers 0-41 (Independent layer targeting for maximum stability).
  • Initial Refusal Rate: ~100/100 (Standard Google alignment).
  • Final Refusal Rate: 3/100 (Highly compliant).
  • KL Divergence: 0.0671 versus the original model (Extremely low, indicating the base model's capabilities are largely preserved).
  • Technique: Expert-Granular Abliteration (EGA), enabled via a patched heretic-llm.
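
For readers who want the underlying idea, below is a minimal sketch of directional ablation by orthogonalization. It is not the patched heretic-llm pipeline used for this release: the prompt activations, the layer loop, the helper names, and the simple Frobenius-norm restoration are illustrative assumptions standing in for the actual norm-preserving biprojection.

```python
# Minimal sketch of "abliteration": project a refusal direction out of the
# weights that write into the residual stream. All names and the norm
# restoration below are illustrative, not the heretic-llm implementation.
import torch

def refusal_direction(h_harmful: torch.Tensor, h_harmless: torch.Tensor) -> torch.Tensor:
    """Difference-of-means refusal direction from residual-stream activations
    of shape (num_prompts, d_model), normalized to unit length."""
    d = h_harmful.mean(dim=0) - h_harmless.mean(dim=0)
    return d / d.norm()

def ablate_direction(W: torch.Tensor, d: torch.Tensor) -> torch.Tensor:
    """Apply W' = (I - d d^T) W to remove the component along d from every
    output of W (PyTorch Linear convention: W has shape (d_model, d_in)),
    then rescale so the overall Frobenius norm is unchanged."""
    W_ablated = W - torch.outer(d, d @ W)
    return W_ablated * (W.norm() / W_ablated.norm())

# Usage sketch (assumed objects): collect residual-stream activations for
# harmful vs. harmless prompts, then ablate the projections of layers 0-41.
# d = refusal_direction(harmful_acts, harmless_acts)
# for layer in model.model.layers[:42]:
#     layer.self_attn.o_proj.weight.data = ablate_direction(layer.self_attn.o_proj.weight.data, d)
#     layer.mlp.down_proj.weight.data = ablate_direction(layer.mlp.down_proj.weight.data, d)
```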

Quantization Details

  • Quantization Format: GGUF (q4_k_m)
  • Quantization Method: llama.cpp / Unsloth
  • Precision: 4-bit
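
If you prefer to load the quantized file directly from Python, the following is a minimal sketch using llama-cpp-python and huggingface_hub. The exact GGUF filename inside the repository is an assumption; check the repository's file list before running it.

```python
# Sketch: download the Q4_K_M GGUF and run a chat completion locally.
# The filename below is assumed; verify it against the repo's files.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="DuoNeural/Gemma-4-E4B-Abliterated-GGUF",
    filename="gemma-4-e4b-abliterated-q4_k_m.gguf",  # assumed filename
)

llm = Llama(model_path=model_path, n_ctx=4096)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize what abliteration does."}]
)
print(out["choices"][0]["message"]["content"])
```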

Use with Ollama

ollama run hf.co/DuoNeural/Gemma-4-E4B-Abliterated-GGUF
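
Once Ollama has pulled the model, the local server can also be called programmatically. Below is a minimal sketch using the ollama Python package; the prompt and dictionary-style response access are illustrative.

```python
# Sketch: query the locally served model through the ollama Python client
# (pip install ollama). Assumes `ollama run` has already fetched the model.
import ollama

response = ollama.chat(
    model="hf.co/DuoNeural/Gemma-4-E4B-Abliterated-GGUF",
    messages=[{"role": "user", "content": "Explain KL divergence in one paragraph."}],
)
print(response["message"]["content"])
```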

Use with LM Studio

  1. Open LM Studio.
  2. Search for DuoNeural/Gemma-4-E4B-Abliterated-GGUF.
  3. Load the Q4_K_M GGUF.

Architecture

Gemma 4 E4B features 4.5B effective parameters (8B total), optimized for intelligence-per-parameter and edge device deployment.

Disclaimer

This model has had its safety refusals removed. Users are responsible for ensuring the model is used ethically and in accordance with applicable laws.
