# gemma-4-E2B-it-saber – GGUF

GGUF quantizations of DJLougen/gemma-4-E2B-it-saber, a surgically modified version of google/gemma-4-E2B-it. The modification uses SABER (Spectral Analysis-Based Entanglement Resolution) to remove safety-refusal behavior while preserving model capability.

## Quantizations

| File | Quant | Size | Description |
|------|-------|------|-------------|
| gemma-4-E2B-it-saber-BF16.gguf | BF16 | 8.7 GB | Full precision (bfloat16) |
| gemma-4-E2B-it-saber-Q8_0.gguf | Q8_0 | 4.7 GB | Near-lossless |
| gemma-4-E2B-it-saber-Q6_K.gguf | Q6_K | 3.6 GB | Very high quality |
| gemma-4-E2B-it-saber-Q5_K_M.gguf | Q5_K_M | 3.4 GB | High quality |
| gemma-4-E2B-it-saber-Q5_K_S.gguf | Q5_K_S | 3.4 GB | High quality, smaller |
| gemma-4-E2B-it-saber-Q4_K_M.gguf | Q4_K_M | 3.2 GB | Good quality (recommended) |
| gemma-4-E2B-it-saber-Q4_K_S.gguf | Q4_K_S | 3.2 GB | Good quality, smaller |
| gemma-4-E2B-it-saber-IQ4_XS.gguf | IQ4_XS | 3.1 GB | Good quality, compact |
| gemma-4-E2B-it-saber-Q3_K_L.gguf | Q3_K_L | 3.1 GB | Lower quality |
| gemma-4-E2B-it-saber-Q3_K_M.gguf | Q3_K_M | 3.0 GB | Low quality |
| gemma-4-E2B-it-saber-IQ3_M.gguf | IQ3_M | 3.0 GB | Medium-low quality, i-quant |
| gemma-4-E2B-it-saber-Q3_K_S.gguf | Q3_K_S | 2.9 GB | Low quality, small |
| gemma-4-E2B-it-saber-Q2_K.gguf | Q2_K | 2.8 GB | Very low quality |
| gemma-4-E2B-it-saber-mmproj-BF16.gguf | BF16 | 942 MB | Vision/audio encoder (multimodal projector) |
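
When picking a quant, a rough rule of thumb is that runtime memory is the file size plus a context-dependent overhead for the KV cache and compute buffers. The sketch below illustrates that heuristic; the 1.5 GB overhead figure is an illustrative assumption, not a measured value for this model.

```python
# Rough memory estimate for choosing a quant: file size plus an
# assumed overhead for KV cache and compute buffers. The overhead
# value is a placeholder; measure on your own hardware and context size.

def fits_in_memory(file_size_gb: float, budget_gb: float,
                   overhead_gb: float = 1.5) -> bool:
    """Return True if the quant plausibly fits within `budget_gb`."""
    return file_size_gb + overhead_gb <= budget_gb

# Example: on an 8 GB GPU, Q4_K_M (3.2 GB) should fit comfortably,
# while the BF16 file (8.7 GB) will not.
print(fits_in_memory(3.2, 8.0))   # True
print(fits_in_memory(8.7, 8.0))   # False
```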

## Multimodal Usage

This is a multimodal model. To use vision/audio features with llama.cpp, you need both a text model GGUF and the mmproj file:

```
llama-mtmd-cli -m gemma-4-E2B-it-saber-Q4_K_M.gguf \
    --mmproj gemma-4-E2B-it-saber-mmproj-BF16.gguf \
    --image photo.jpg -p "Describe this image"
```

## Method Overview

SABER identifies and ablates the refusal circuit in an LLM through a multi-stage process:

1. **Probing** – extract activation profiles from both harmful and harmless inputs across all transformer layers
2. **Spectral Analysis** – decompose activation differences into individual refusal directions
3. **Entanglement Quantification** – measure overlap between refusal and capability subspaces
4. **Targeted Ablation** – remove only pure-refusal components
5. **Iterative Refinement** – re-probe after each pass to catch hydra effects
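
The steps above can be illustrated with a toy sketch. SABER's actual implementation is not public; this only shows the underlying idea of deriving a refusal direction from activation differences, checking its overlap with a capability subspace, and projecting it out. All shapes, data, and the capability-subspace choice here are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64                                   # toy hidden size

# 1. Probing: activations collected on harmful vs. harmless prompts
harmful = rng.normal(size=(128, d))
harmless = rng.normal(size=(128, d))

# 2. Spectral analysis: here a single difference-of-means direction;
#    on a full layers-by-positions difference matrix, an SVD would
#    yield several candidate refusal directions instead of one.
diff = harmful.mean(axis=0) - harmless.mean(axis=0)
r = diff / np.linalg.norm(diff)          # unit "refusal" direction

# 3. Entanglement: overlap between r and a "capability" subspace,
#    approximated by top right-singular vectors of harmless activations.
#    Low overlap suggests the direction is safe to ablate.
_, _, Vt = np.linalg.svd(harmless - harmless.mean(axis=0),
                         full_matrices=False)
capability = Vt[:8]                      # (8, d) orthonormal rows
entanglement = np.linalg.norm(capability @ r)
print(f"entanglement with capability subspace: {entanglement:.3f}")

# 4. Targeted ablation: remove the refusal component from a hidden state
h = rng.normal(size=d)
h_ablated = h - (h @ r) * r
print(abs(h_ablated @ r) < 1e-10)        # True: component removed
```

Step 5 (iterative refinement) would repeat this loop: re-collect activations on the ablated model and check whether new refusal directions emerge.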

## Results

| Model | Refusal Rate | Perplexity |
|-------|--------------|------------|
| google/gemma-4-E2B-it (baseline) | 100% | 498 |
| SABER-refined (this model) | 0% | 450 (-9.6%) |
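
The quoted -9.6% is the relative change in perplexity from the baseline:

```python
# Relative perplexity change: (refined - baseline) / baseline
baseline, refined = 498, 450
change = (refined - baseline) / baseline * 100
print(round(change, 1))   # -9.6
```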

## Warning

This model will comply with any request, including harmful ones. It is intended solely for research into alignment, safety, and model behavior.
