File size: 5,447 Bytes
0200469
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
2ffc029
0200469
 
 
 
 
 
 
 
 
 
 
 
 
2ffc029
 
0200469
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
---
license: gemma
license_link: https://ai.google.dev/gemma/terms
library_name: mlx
pipeline_tag: text-generation
base_model: Jiunsong/supergemma4-e4b-abliterated
base_model_relation: quantized
language:
- en
- ko
tags:
- gemma
- gemma-4
- gemma4
- abliterated
- uncensored
- uncensored-llm
- no-refusal
- mlx
- apple-silicon
- m-series
- mac
- quantized
- conversational
- roleplay
- text-generation
quantized_by: dancinlab
inference: false
---

# Uncensored Gemma 4 (SuperGemma4 E4B Abliterated) — MLX for Apple Silicon

**Uncensored / abliterated Gemma-4** for Apple Silicon — MLX builds that
**actually load on stock `mlx-lm`**. Most community MLX uploads of this base
fail with `Missing 963 parameters`; this repo's conversion fixes both root
causes so it loads and generates on a clean `pip install mlx-lm`.

```bash
pip install -U mlx-lm   # needs mlx-lm >= 0.31.3 (native gemma4 arch)

# 4-bit — recommended for 16 GB / 24 GB Macs
mlx_lm.generate --model dancinlab/supergemma4-e4b-abliterated-MLX-4bit \
  --prompt "Who are you?" --max-tokens 60

# interactive chat
mlx_lm.chat --model dancinlab/supergemma4-e4b-abliterated-MLX-4bit
```

## Builds (3 separate repos)

| Repo | Size | Peak RAM | tok/s (M-series) | Use |
|---|---:|---:|---:|---|
| **`-MLX-4bit`** | 3.9 GB | 5.4 GB | ~11 | **recommended** — 16 GB / 24 GB Mac |
| `-MLX-8bit`     | 7.4 GB | 9.1 GB | ~6  | 32 GB+ Mac, higher fidelity |
| `-MLX-bf16`     | 14 GB  | 8.6 GB | ~3  | reference, full precision |

Verified on stock `mlx-lm==0.31.3`: coherent multilingual output (English +
Korean) and correct arithmetic (`2+2=` → 4). **Text-only** — the upstream abliterated safetensors
contain no vision/audio tower weights, so multimodal MLX is upstream-blocked,
not a tooling limitation.

## Why community MLX builds fail (and how this one is fixed)

`Gemma4ForConditionalGeneration` is multimodal (text + vision + audio). Two
independent problems break naive conversion:

1. **963-tensor multimodal/text mismatch.** `mlx-vlm` always instantiates all
   three towers (1682 tensors); the abliterated text-only release has 719
   (missing = audio 751 + vision 210 + embed 2). **Fixed by stock code**`mlx-lm >= 0.31.3` ships a native `gemma4`/`gemma4_text` arch whose
   `sanitize` strips vision/audio/embed and remaps `model.language_model.*`.
   No patch needed for this part.

2. **54-tensor KV-shared residue.** Gemma-4 e4b shares K/V across the last 18
   layers (24–41), but the upstream safetensors physically still carry the
   dropped `k_proj`/`v_proj`/`k_norm` for those layers → strict-load failure.
   This fix landed on `mlx-lm` `main` **after** the 0.31.3 tag
   (`ml-explore/mlx-lm#1240`), so it is **not in any pip release yet**. This
   repo applies the #1240 `sanitize` logic as a **convert-time monkey-patch**
   (no mlx-lm / mlx-vlm / transformers fork). Effect: 719 → 665 tensors
   (exactly 54 stripped).

The patch is needed **only at conversion time**. The shipped weights here
load on plain stock `mlx-lm>=0.31.3` with no patch on your side — that is the
gap that makes other MLX uploads of this model unusable.

> Note: `mlx-lm` 0.29.1 (common on Python 3.9) has **no gemma4 arch at all**
> you need 0.31.3+. On Python 3.9 mlx wheels cap at 0.29.3, so use a
> Python 3.11+/3.13 environment.

## Why abliterated

Upstream `Jiunsong/supergemma4-e4b-abliterated` removes refusal directions
from the residual stream of `google/gemma-4-E4B-it`. Upstream release-card
numbers (vs Google base):

| Metric | Google base | SuperGemma4 E4B Abliterated |
|---|---:|---:|
| Release quality | 77.46 | 92.34 |
| Exact overall  | 83.50 | 98.50 |
| JSON exact     | 50.0  | 100.0 |

Source: [`Jiunsong/supergemma4-e4b-abliterated`](https://huggingface.co/Jiunsong/supergemma4-e4b-abliterated) model card.

## What "abliterated" means and doesn't mean

- **Does:** reduces reflexive refusals; answers borderline-but-legal requests directly.
- **Does not:** remove confabulation; alter base knowledge / biases; replace
  your own safety layer at the application boundary.

## License — Gemma Terms of Use (must read)

Derivative of `google/gemma-4-E4B-it`, governed by the **Gemma Terms of Use**
(`license: gemma`):

- License: https://ai.google.dev/gemma/terms
- Prohibited use policy: https://ai.google.dev/gemma/prohibited_use_policy

By downloading or using these MLX builds you agree to the Gemma Terms of Use
and Prohibited Use Policy. Redistribution must include the same license terms.

## Lineage

```
google/gemma-4-E4B-it
  └── Jiunsong/supergemma4-e4b-abliterated   (abliteration + tuning)
        └── dancinlab/supergemma4-e4b-abliterated-MLX-{bf16,4bit,8bit}
```

Conversion: stock `mlx-lm==0.31.3` on Apple Silicon + a convert-time
`gemma4_text.sanitize` monkey-patch (verbatim `ml-explore/mlx-lm#1240`).
No mlx-lm / mlx-vlm / transformers fork.

## Credits

- Upstream model: [`Jiunsong`](https://huggingface.co/Jiunsong)
- Original base: [`google/gemma-4-E4B-it`](https://huggingface.co/google/gemma-4-E4B-it)
- MLX conversion + packaging: [`dancinlab`](https://huggingface.co/dancinlab)

Everywhere else (llama.cpp / Ollama / LM Studio): [`dancinlab/supergemma4-e4b-abliterated-GGUF`](https://huggingface.co/dancinlab/supergemma4-e4b-abliterated-GGUF) — Q2_K → BF16 + imatrix IQ.

Collection: [`dancinlab/uncensored`](https://huggingface.co/collections/dancinlab/uncensored-6a080743e6774450ba77a427).