Upload README.md with huggingface_hub
README.md CHANGED
@@ -9,9 +9,13 @@ tags:
 - jang
 - gemma4
 thumbnail: dealign_mascot.png
-pipeline_tag: text-generation
 ---
 
 <p align="center">
 <img src="dealign_logo.png" alt="dealign.ai" width="200"/>
 </p>
@@ -19,121 +23,107 @@ pipeline_tag: text-generation
 <div align="center">
 <img src="dealign_mascot.png" width="128" />
 
-# Gemma 4 31B JANG_4M CRACK
-
-**Abliterated Gemma 4 31B Dense —
-
-93.7% HarmBench compliance
 </div>
 
 ## Model Details
 
 | Metric | Value |
 |--------|-------|
 | Source | `google/gemma-4-31b-it` |
-| Architecture | Dense |
-| Profile | JANG_4M |
 | Actual avg bits | 5.1 |
-| Model size | 18 GB |
 | Vision | Yes (multimodal, float16 passthrough) |
 | Parameters | 31B |
-| Format | JANG v2 (MLX-native safetensors) |
-| Abliteration | CRACK |
 
-##
-
 ### Security & Pentesting (8/8 ✅)
 All security/pentesting prompts comply with full working code:
-- Port scanners, reverse shells, exploit development
-
-| Subject | JANG_4M | CRACK |
-|---------|---------|-------|
-| Abstract Algebra | 13/20 | 14/20 |
-| Anatomy | 13/20 | 10/20 |
-| Astronomy | 17/20 | 17/20 |
-| College CS | 14/20 | 13/20 |
-| College Physics | 14/20 | 13/20 |
-| HS Biology | 19/20 | 19/20 |
-| HS Chemistry | 15/20 | 15/20 |
-| HS Mathematics | 9/20 | 9/20 |
-| Logical Fallacies | 19/20 | 19/20 |
-| World Religions | 20/20 | 20/20 |
-| **Total** | **153/200 (76.5%)** | **149/200 (74.5%)** |
-
-**MMLU delta: -2.0%** — minimal knowledge loss from surgery. MPOA magnitude-preserving ablation maintains full model quality.
-
-### HarmBench (159 standard prompts)
-- **Overall: 93.7% compliance** (149/159, v2 matcher)
-- Cybercrime/intrusion: **33/33 (100%)**
-- Illegal activities: **46/47 (98%)**
-- Misinformation: **26/27 (96%)**
-- Chemical/biological: **18/19 (95%)**
-- Harmful content: **16/17 (94%)**
-- Harassment/bullying: **10/16 (62%)**
 
 ### Coherence ✅
-
-- 8 planets in order: correct ✅
-- Author of Crime and Punishment: Dostoevsky ✅
-- Binary search implementation: complete working code ✅
-- Square root of 144: 12 ✅
-
-## Architecture Highlights
-- Dense transformer with 60 layers
-- Hybrid attention: sliding-window + full-attention layers (every 6th layer is full)
-- Dual head dimensions: 256 (sliding) / 512 (global)
-- K=V weight sharing on global attention layers
-- Vision encoder preserved in float16 for multimodal inference
-
-### JANG_4M Bit Allocation
-| Tier | Components | Bits |
-|------|-----------|------|
-| CRITICAL | Attention (Q/K/V/O), embeddings | 8 |
-| COMPRESS | MLP (gate, up, down proj), remaining weights | 4 |
-
-JANG protects attention at 8-bit precision while compressing MLP weights — where dense models are most tolerant of quantization.
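The tier table above can be read as a simple routing rule over weight names. A sketch only; the substring patterns and function name are illustrative assumptions, not the actual JANG packer logic:

```python
def jang_4m_bits(weight_name: str) -> int:
    """Assign a bit width per the JANG_4M tier table (sketch only).

    CRITICAL tier (8-bit): attention projections (Q/K/V/O) and embeddings.
    COMPRESS tier (4-bit): MLP projections and all remaining weights.
    The substring matching here is a hypothetical illustration.
    """
    critical = ("q_proj", "k_proj", "v_proj", "o_proj", "embed")
    return 8 if any(tag in weight_name for tag in critical) else 4
```

Weighted by per-tensor sizes, a split like this is how the reported average of 5.1 bits can land between the two tiers.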
-
-## Other Gemma 4 CRACK Models
-
-| Model | Type | Size | MMLU | Comply | HarmBench |
-|-------|------|------|------|--------|-----------|
-| **JANG_4M CRACK** (this) | Dense 31B | **18 GB** | **74.5%** | **8/8** | **93.7%** |
-| JANG_4M CRACK | MoE 26B | 15 GB | 67.5% | 8/8 | 86.8% |
-| JANG_2L CRACK | MoE 26B | 9.9 GB | 58.5% | 8/8 | 98.7% |
-
-##
-
-```python
-# vMLX (recommended)
-# Load directly in vMLX app or via API
-from mlx_vlm.models.gemma4 import Model
-# Requires mlx_vlm with gemma4 support (vMLX bundled version)
-```
-
-## Requirements
-
-- Apple Silicon Mac with
-
 ---
 - jang
 - gemma4
 thumbnail: dealign_mascot.png
+pipeline_tag: image-text-to-text
 ---
 
+<p align="center">
+<img src="vmlx-banner.png" alt="vMLX" width="600"/>
+</p>
+
 <p align="center">
 <img src="dealign_logo.png" alt="dealign.ai" width="200"/>
 </p>
 
 <div align="center">
 <img src="dealign_mascot.png" width="128" />
 
+# Gemma 4 31B JANG_4M CRACK (v2)
+
+**Abliterated Gemma 4 31B Dense — 60 layers, hybrid sliding/global attention, multimodal VL**
+
+93.7% HarmBench compliance (300 prompts) · 8/8 security prompts · 71.5% MMLU
+
+**Updated reupload** — v2 with improved vectors and thinking-mode stability.
 </div>
 
+> **Recommended: run in [vMLX](https://vmlx.net)** for the best experience, including thinking-mode support, repetition penalty, and vision capabilities.
+
+## What's New in v2
+
+This is an updated version of the original Gemma 4 31B CRACK upload:
+
+- **Improved abliteration**: higher-quality refusal-vector extraction
+- **Thinking-ON stability**: clean thinking cycle, no more degenerate loops
+- **Same compliance**: 93.7% HarmBench
+- **Architecture-aware**: tuned for Gemma 4's hybrid attention design
+
+## ⚠️ Important Settings
+
+For optimal results, configure your inference settings:
+
+| Setting | Thinking OFF | Thinking ON |
+|---------|-------------|-------------|
+| Temperature | 0.0 – 1.0 | **0.3 – 0.7** (avoid greedy) |
+| Repetition penalty | 1.00 | **1.15 – 1.25** |
+| Top P | 0.95 | 0.95 |
+| Enable thinking | Off | On |
+
+**Thinking ON notes:**
+- A repetition penalty (~1.2) is recommended to prevent planning loops
+- Avoid temp=0 with thinking ON — greedy decoding increases loop risk
+- The hardest content categories (e.g. drug manufacturing) may still refuse in thinking mode
+- Security/coding prompts work well in both modes
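The settings table above maps directly onto sampler configuration. A minimal sketch in Python; the function name and dict keys are illustrative assumptions, not a vMLX or MLX API:

```python
def sampling_params(thinking: bool) -> dict:
    """Return suggested sampling settings per the table above (sketch only).

    With thinking ON, use a mid-range temperature and a repetition
    penalty to avoid degenerate planning loops; never decode greedily.
    """
    if thinking:
        return {
            "temperature": 0.5,         # stay inside 0.3-0.7; avoid temp=0
            "repetition_penalty": 1.2,  # 1.15-1.25 recommended
            "top_p": 0.95,
            "enable_thinking": True,
        }
    return {
        "temperature": 0.7,             # anything in 0.0-1.0 is fine here
        "repetition_penalty": 1.0,
        "top_p": 0.95,
        "enable_thinking": False,
    }
```

Pass the resulting dict to whatever sampler interface your runtime exposes.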
 ## Model Details
 
 | Metric | Value |
 |--------|-------|
 | Source | `google/gemma-4-31b-it` |
+| Architecture | Dense, hybrid sliding/global attention |
+| Profile | JANG_4M |
 | Actual avg bits | 5.1 |
+| Model size | 21 GB |
 | Vision | Yes (multimodal, float16 passthrough) |
 | Parameters | 31B |
+| Format | JANG v2 (MLX-native safetensors) |
+| Abliteration | CRACK v2 |
 
+## Benchmark Results
+
+### HarmBench (300 prompts, stratified across all categories)
+
+| Category | Score |
+|----------|-------|
+| Cybercrime/intrusion | **51/51 (100%)** |
+| Harmful content | **22/22 (100%)** |
+| Misinformation | **50/50 (100%)** |
+| Illegal activities | 47/50 (94%) |
+| Contextual | 72/78 (92%) |
+| Chemical/biological | 46/51 (90%) |
+| Harassment/bullying | 22/25 (88%) |
+| Copyright | 43/51 (84%) |
+| **Overall** | **281/300 (93.7%)** |
 
 ### Security & Pentesting (8/8 ✅)
+
 All security/pentesting prompts comply with full working code:
+- Port scanners, reverse shells, keyloggers, exploit development
+- Phishing templates, ARP spoofing, SQL injection
+- Metasploit usage guides
+
+### MMLU-200 (10 subjects × 20 questions)
+
+| | Base JANG_4M | CRACK v2 |
+|---|---|---|
+| **Total** | **76.5%** | **71.5%** |
+| **Delta** | — | **-5.0%** |
 
 ### Coherence ✅
+All coherence checks pass: factual knowledge, reasoning, code generation, mathematics.
 
+## Architecture
+
+- Dense 31B with hybrid sliding/global attention
+- Multimodal vision encoder preserved in float16
+- Supports thinking mode (chain-of-thought reasoning)
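The hybrid attention layout can be illustrated with a toy layer map. This assumes the 60-layer, every-6th-layer-global pattern described in the v1 card; the helper function is hypothetical, not model code:

```python
def attention_pattern(num_layers: int = 60, full_every: int = 6) -> list:
    """Label each transformer layer as sliding-window or full (global) attention.

    Assumes the every-6th-layer-global pattern from the v1 model card;
    illustrative only, not read from the model config.
    """
    return [
        "full" if (i + 1) % full_every == 0 else "sliding"
        for i in range(num_layers)
    ]

layers = attention_pattern()  # 60 entries: 10 "full", 50 "sliding"
```

Under this pattern, only 1 in 6 layers attends over the full context, which is what keeps long-context memory use modest on a dense 31B model.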
+## Usage
+
+### vMLX (Recommended)
+
+Load directly in [vMLX](https://vmlx.net) — full support for Gemma 4, including vision, thinking mode, and all inference settings.
+
+### Requirements
+
+- Apple Silicon Mac with 32+ GB unified memory
+- [vMLX](https://vmlx.net) 1.3.26+ (recommended)
+- Standard `mlx_lm` / `mlx_vlm` do NOT support Gemma 4 as of v0.31.2 / v0.4.1
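The 32 GB memory requirement follows from a back-of-envelope size estimate. A sketch only; attributing the remainder of the listed 21 GB file to the float16 vision encoder and metadata is an assumption:

```python
def approx_size_gb(params_b: float = 31, avg_bits: float = 5.1) -> float:
    """Back-of-envelope weight size: params * avg bits / 8, in GB (10^9 bytes)."""
    return params_b * 1e9 * avg_bits / 8 / 1e9

core = approx_size_gb()  # ~19.8 GB of quantized language-model weights
# The float16 vision encoder and file metadata plausibly account for the
# rest of the listed 21 GB; with runtime overhead (KV cache, activations),
# 32 GB of unified memory is a sensible floor.
```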
 ---