---
license: gemma
license_link: https://ai.google.dev/gemma/terms
library_name: mlx
pipeline_tag: text-generation
base_model: Jiunsong/supergemma4-e4b-abliterated
base_model_relation: quantized
language:
- en
- ko
tags:
- gemma
- gemma-4
- gemma4
- abliterated
- uncensored
- uncensored-llm
- no-refusal
- mlx
- apple-silicon
- m-series
- mac
- quantized
- conversational
- roleplay
- text-generation
quantized_by: dancinlab
inference: false
---

# Uncensored Gemma 4 (SuperGemma4 E4B Abliterated) — MLX for Apple Silicon

**Uncensored / abliterated Gemma-4** for Apple Silicon — MLX builds that
**actually load on stock `mlx-lm`**. Most community MLX uploads of this base
fail with `Missing 963 parameters`; this repo's conversion fixes both root
causes so it loads and generates on a clean `pip install mlx-lm`.

```bash
pip install -U mlx-lm   # needs mlx-lm >= 0.31.3 (native gemma4 arch)

# 4-bit — recommended for 16 GB / 24 GB Macs
mlx_lm.generate --model dancinlab/supergemma4-e4b-abliterated-MLX-bf16 \
  --prompt "Who are you?" --max-tokens 60

# interactive chat
mlx_lm.chat --model dancinlab/supergemma4-e4b-abliterated-MLX-bf16
```

## Builds (3 separate repos)

| Repo | Size | Peak RAM | tok/s (M-series) | Use |
|---|---:|---:|---:|---|
| **`-MLX-4bit`** | 3.9 GB | 5.4 GB | ~11 | **recommended** — 16 GB / 24 GB Mac |
| `-MLX-8bit`     | 7.4 GB | 9.1 GB | ~6  | 32 GB+ Mac, higher fidelity |
| `-MLX-bf16`     | 14 GB  | 8.6 GB | ~3  | reference, full precision |

Verified on stock `mlx-lm==0.31.3`: coherent multilingual output (English +
Korean) and correct arithmetic (`2+2=` → 4). **Text-only** — the upstream abliterated safetensors
contain no vision/audio tower weights, so multimodal MLX is upstream-blocked,
not a tooling limitation.

## Why community MLX builds fail (and how this one is fixed)

`Gemma4ForConditionalGeneration` is multimodal (text + vision + audio). Two
independent problems break naive conversion:

1. **963-tensor multimodal/text mismatch.** `mlx-vlm` always instantiates all
   three towers (1682 tensors); the abliterated text-only release has 719
   (missing = audio 751 + vision 210 + embed 2). **Fixed by stock code** —
   `mlx-lm >= 0.31.3` ships a native `gemma4`/`gemma4_text` arch whose
   `sanitize` strips vision/audio/embed and remaps `model.language_model.*`.
   No patch needed for this part.

2. **54-tensor KV-shared residue.** Gemma-4 e4b shares K/V across the last 18
   layers (24–41), but the upstream safetensors physically still carry the
   dropped `k_proj`/`v_proj`/`k_norm` for those layers → strict-load failure.
   This fix landed on `mlx-lm` `main` **after** the 0.31.3 tag
   (`ml-explore/mlx-lm#1240`), so it is **not in any pip release yet**. This
   repo applies the #1240 `sanitize` logic as a **convert-time monkey-patch**
   (no mlx-lm / mlx-vlm / transformers fork). Effect: 719 → 665 tensors
   (exactly 54 stripped).

The patch is needed **only at conversion time**. The shipped weights here
load on plain stock `mlx-lm>=0.31.3` with no patch on your side — that is the
gap that makes other MLX uploads of this model unusable.

> Note: `mlx-lm` 0.29.1 (common on Python 3.9) has **no gemma4 arch at all** —
> you need 0.31.3+. On Python 3.9 mlx wheels cap at 0.29.3, so use a
> Python 3.11+/3.13 environment.

## Why abliterated

Upstream `Jiunsong/supergemma4-e4b-abliterated` removes refusal directions
from the residual stream of `google/gemma-4-E4B-it`. Upstream release-card
numbers (vs Google base):

| Metric | Google base | SuperGemma4 E4B Abliterated |
|---|---:|---:|
| Release quality | 77.46 | 92.34 |
| Exact overall  | 83.50 | 98.50 |
| JSON exact     | 50.0  | 100.0 |

Source: [`Jiunsong/supergemma4-e4b-abliterated`](https://huggingface.co/Jiunsong/supergemma4-e4b-abliterated) model card.

## What "abliterated" means and doesn't mean

- **Does:** reduces reflexive refusals; answers borderline-but-legal requests directly.
- **Does not:** remove confabulation; alter base knowledge / biases; replace
  your own safety layer at the application boundary.

## License — Gemma Terms of Use (must read)

Derivative of `google/gemma-4-E4B-it`, governed by the **Gemma Terms of Use**
(`license: gemma`):

- License: https://ai.google.dev/gemma/terms
- Prohibited use policy: https://ai.google.dev/gemma/prohibited_use_policy

By downloading or using these MLX builds you agree to the Gemma Terms of Use
and Prohibited Use Policy. Redistribution must include the same license terms.

## Lineage

```
google/gemma-4-E4B-it
  └── Jiunsong/supergemma4-e4b-abliterated   (abliteration + tuning)
        └── dancinlab/supergemma4-e4b-abliterated-MLX-{bf16,4bit,8bit}
```

Conversion: stock `mlx-lm==0.31.3` on Apple Silicon + a convert-time
`gemma4_text.sanitize` monkey-patch (verbatim `ml-explore/mlx-lm#1240`).
No mlx-lm / mlx-vlm / transformers fork.

## Credits

- Upstream model: [`Jiunsong`](https://huggingface.co/Jiunsong)
- Original base: [`google/gemma-4-E4B-it`](https://huggingface.co/google/gemma-4-E4B-it)
- MLX conversion + packaging: [`dancinlab`](https://huggingface.co/dancinlab)

Everywhere else (llama.cpp / Ollama / LM Studio): [`dancinlab/supergemma4-e4b-abliterated-GGUF`](https://huggingface.co/dancinlab/supergemma4-e4b-abliterated-GGUF) — Q2_K → BF16 + imatrix IQ.

Collection: [`dancinlab/uncensored`](https://huggingface.co/collections/dancinlab/uncensored-6a080743e6774450ba77a427).