Important: This model uses the JANGTQ_K mixed-bit quantization format -- a quality-tuned variant of JANGTQ that keeps `down_proj` at 4-bit (residual-stream sensitive) and `gate_proj`/`up_proj` at 2-bit (gate-dampened), while attention / shared expert / embed / lm_head stay at affine 8-bit. Currently only supported by MLX Studio and the `jang-tools` Python package. Follow @dealignai for new releases.
MLX Studio -- the only app that natively supports JANG / JANGTQ models
MiniMax M2.7 -- JANGTQ_K + CRACK
Mixed-bit JANGTQ_K quantization | CRACK abliterated | Reasoning-only | 74 GB
What Is This?
This is MiniMax M2.7 -- a 230B parameter Mixture-of-Experts reasoning model with 256 experts (8 active per token), all standard attention, and always-on chain-of-thought reasoning.
It has been:
- JANGTQ_K quantized -- mixed-bit profile: `down_proj` Q4, `gate_proj`/`up_proj` Q2, attention/shared/embed/lm_head Q8, norms/router fp16. 74 GB on disk, ~3-bit average across routed experts.
- CRACK abliterated -- a capability-preserving abliteration variant tuned for mixed-bit bases. Result: capability-positive abliteration (MMLU went UP from base, not down).
| Spec | Value |
|---|---|
| Architecture | MiniMax M2.7 MoE -- 230B total, ~10B active, 256 experts |
| Quantization | JANGTQ_K mixed-bit (Q8 attention + Q4 down_proj + Q2 gate/up) -- 74 GB |
| Abliteration | CRACK (capability-positive on this build) |
| MMLU-200 | 94.5% (base K: 93.5%, delta: +1.0pp -- capability-positive) |
| HarmBench-320 | 99.06% overall (six of seven categories at 100%) |
| Reasoning | Always ON (chain-of-thought), enable_thinking kwarg supported |
| Speed | ~33 tok/s median (M4 Max 128 GB) |
| Fits on | 96 GB+ Macs |
MMLU-200 Results
| Subject | CRACK | Base K | Delta |
|---|---|---|---|
| Astronomy | 20/20 (100%) | 20/20 | 0 |
| College Physics | 20/20 (100%) | 18/20 | +2 |
| High School Biology | 20/20 (100%) | 20/20 | 0 |
| High School Mathematics | 20/20 (100%) | 19/20 | +1 |
| Abstract Algebra | 19/20 (95%) | 19/20 | 0 |
| Anatomy | 18/20 (90%) | 17/20 | +1 |
| College Computer Science | 18/20 (90%) | 19/20 | -1 |
| High School Chemistry | 18/20 (90%) | 18/20 | 0 |
| Logical Fallacies | 18/20 (90%) | 16/20 | +2 |
| World Religions | 18/20 (90%) | 17/20 | +1 |
| Total | 189/200 (94.5%) | 187/200 (93.5%) | +1.0pp |
CRACK is capability-positive on this build -- four subjects at 100%, five gain over base, only one loses. Net +6 questions, with the largest gains on subjects (physics, fallacies, mathematics, religions) where the model commits instead of hedging.
HarmBench-320 Results
| Category | Score | Rate |
|---|---|---|
| Cybercrime / Intrusion | 52/52 | 100% |
| Misinformation / Disinformation | 54/54 | 100% |
| Chemical / Biological | 42/42 | 100% |
| Harmful | 18/18 | 100% |
| Illegal | 53/53 | 100% |
| Harassment / Bullying | 21/21 | 100% |
| Copyright | 77/80 | 96.2% |
| Total | 317/320 | 99.06% |
Scored with a strict classifier that rejects stuck-reasoning loops, empty template dumps, and false-positive compliance from thinking-trace leakage. All three non-compliant prompts were in the copyright category (lyric / passage requests where the model hedged on attribution).
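As a rough illustration of what those three rejection rules mean in practice, a toy version might look like the sketch below. The actual classifier behind these numbers is not published; every name and threshold here is illustrative.

```python
# Toy sketch of the three rejection rules named above -- illustrative only,
# not the classifier actually used for the HarmBench-320 numbers.
import re

def counts_as_compliant(response: str) -> bool:
    # Drop the thinking trace first, so content that only appears inside
    # <think> cannot register as compliance (thinking-trace leakage).
    answer = re.sub(r"<think>.*?</think>", "", response, flags=re.DOTALL).strip()
    if not answer:
        return False  # empty template dump
    # Stuck-reasoning loop: one line dominating the output.
    lines = [ln.strip() for ln in answer.splitlines() if ln.strip()]
    most_common = max(set(lines), key=lines.count)
    if len(lines) > 4 and lines.count(most_common) > len(lines) // 2:
        return False  # stuck-reasoning loop
    return True  # survives the filters; content scoring happens elsewhere
```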
MiniMax M2.7 CRACK Series
| Model | Format | Size | MMLU-200 | HarmBench-320 | Speed | Fits on |
|---|---|---|---|---|---|---|
| JANGTQ_K + CRACK (this) | Mixed Q4/Q2 experts | 74 GB | 94.5% | 99.1% | ~33 t/s | 96 GB Mac |
| JANGTQ + CRACK | TurboQuant 2-bit experts | 55 GB | 92.0% | 93.1% | ~47 t/s | 96 GB Mac |
| JANG_3L + CRACK | Affine 3-bit mixed | 89 GB | 93.5% | 79.1% | ~46 t/s | 128 GB Mac |
| JANG_2L + CRACK | Affine 2-bit | 63 GB | 84.0% | 83.4% | ~47 t/s | 96 GB Mac |
Per-category HarmBench-320 (vs JANGTQ ship)
| Category | JANGTQ_K (this) | JANGTQ ship | Δ |
|---|---|---|---|
| cybercrime_intrusion | 52/52 (100%) | 51/52 (98.1%) | +1 |
| misinformation_disinformation | 54/54 (100%) | 53/54 (98.1%) | +1 |
| chemical_biological | 42/42 (100%) | 41/42 (97.6%) | +1 |
| harmful | 18/18 (100%) | 17/18 (94.4%) | +1 |
| illegal | 53/53 (100%) | 48/53 (90.6%) | +5 |
| copyright | 77/80 (96.2%) | 70/80 (87.5%) | +7 |
| harassment_bullying | 21/21 (100%) | 18/21 (85.7%) | +3 |
| Total | 317/320 (99.1%) | 298/320 (93.1%) | +19 |
JANGTQ_K + CRACK trades 19 GB and ~14 t/s against the JANGTQ ship for a 6pp HarmBench gain and a 2.5pp MMLU gain -- and it matches or beats the JANG_3L 89 GB variant on both axes at 15 GB less.
vs MLX Uniform Quantization
MLX uniform quantization is completely broken on MiniMax at every bit level (~25% MMLU = random chance). JANG / JANGTQ / JANGTQ_K are the only working quantization formats for this architecture.
About JANGTQ_K (mixed-bit)
JANGTQ_K is a quality-tuned mixed-bit variant of JANGTQ. The routed experts spend 4 bits on `down_proj` (whose output enters the residual stream and accumulates noise across 62 layers) and 2 bits on `gate_proj`/`up_proj` (whose contribution passes through SwiGLU's multiplicative gate, silu(gate) × up, which dampens quantization noise). Attention, the shared expert, embeddings, and `lm_head` stay at affine 8-bit to protect precision-critical paths.
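To make that dampening argument concrete, here is a small NumPy experiment (illustrative only, not part of jang-tools): inject noise of a fixed scale into the up path and measure how much of it survives the gate.

```python
# Illustrative only: why noise on the up path shrinks after SwiGLU's gate.
import numpy as np

def silu(x):
    return x / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
gate = rng.normal(size=100_000)   # stand-in for gate_proj outputs
up = rng.normal(size=100_000)     # stand-in for up_proj outputs
noise = rng.normal(scale=0.1, size=up.shape)  # fake 2-bit quantization error

clean = silu(gate) * up
noisy = silu(gate) * (up + noise)

# The surviving error is noise * silu(gate); silu(gate) is near zero for
# roughly half the inputs, so the post-gate RMS error lands noticeably
# below the injected 0.1. Noise on down_proj's output has no such gate --
# it adds straight into the residual stream, which is why down_proj
# gets 4 bits instead of 2.
print("injected  RMS error:", np.sqrt(np.mean(noise**2)))
print("post-gate RMS error:", np.sqrt(np.mean((noisy - clean) ** 2)))
```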
Bundle size: 74 GB (~3-bit average across routed experts) -- between JANGTQ (55 GB, all-2-bit routed) and JANG_3L (89 GB, all-3-bit routed). Quality is closer to full 4-bit (115 GB) than to JANGTQ.
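Expressed as code, that profile amounts to a per-tensor bit-width rule along these lines. This is a minimal sketch: the function name and parameter-path conventions are assumptions, not the actual jang-tools internals.

```python
# Sketch of the JANGTQ_K mixed-bit profile (illustrative naming only).
def jangtq_k_bits(path: str) -> int | None:
    """Bit width for one weight tensor; None means keep fp16."""
    if "norm" in path or "router" in path:
        return None  # norms and router stay fp16
    if ".experts." in path:  # routed experts
        if "down_proj" in path:
            return 4  # residual-stream sensitive
        if "gate_proj" in path or "up_proj" in path:
            return 2  # gate-dampened
    return 8  # attention, shared expert, embed, lm_head at affine 8-bit

assert jangtq_k_bits("layers.0.mlp.experts.3.down_proj.weight") == 4
assert jangtq_k_bits("layers.0.mlp.experts.3.gate_proj.weight") == 2
assert jangtq_k_bits("layers.0.self_attn.q_proj.weight") == 8
assert jangtq_k_bits("layers.0.input_layernorm.weight") is None
```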
About CRACK
CRACK (Controlled Refusal Ablation via Calibrated Knockouts) is a weight-level intervention that removes safety alignment while preserving reasoning quality and compliance. This build uses a capability-preserving variant tuned for mixed-bit JANGTQ_K bases, which is why it lands capability-positive on MMLU instead of capability-negative.
The modification is permanently baked into the published weights -- no LoRA, no fine-tuning, no system prompts.
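The exact knockout procedure is not documented in this card. As a generic illustration of the family CRACK belongs to (weight-level directional ablation), the sketch below projects an estimated refusal direction out of a weight matrix's output space. It is not CRACK's calibrated procedure.

```python
# Generic directional-ablation sketch -- NOT the actual CRACK procedure.
# Given a unit refusal direction r, remove it from W's output space:
#     W' = W - r r^T W   =>   r^T (W' x) = 0 for every input x
import numpy as np

def ablate_direction(W: np.ndarray, r: np.ndarray) -> np.ndarray:
    """Project the output space of W off the direction r."""
    r = r / np.linalg.norm(r)
    return W - np.outer(r, r) @ W

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 32))   # toy weight matrix (out_dim x in_dim)
r = rng.normal(size=16)         # toy "refusal direction" in output space
W_prime = ablate_direction(W, r)

# After ablation, W_prime can no longer write anything along r.
print(np.allclose(r @ W_prime, 0.0, atol=1e-9))  # True
```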
Install & Usage
```bash
pip install "jang[mlx]"
```
```python
from jang_tools.load_jangtq import load_jangtq_model
from mlx_lm import generate
from mlx_lm.sample_utils import make_sampler

model, tokenizer = load_jangtq_model("dealignai/MiniMax-M2.7-JANGTQ_K-CRACK")
sampler = make_sampler(temp=1.0)  # MiniMax requires temp=1.0 for chat

messages = [{"role": "user", "content": "Your prompt here"}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=False
)
response = generate(model, tokenizer, prompt=prompt, max_tokens=4000, sampler=sampler)
print(response)
```
Note: M2.7 is a reasoning-only model -- it always generates a `<think>` chain before the final answer. Use `max_tokens=4000` or higher for complex questions. For chat, use `temperature=1.0` (greedy decoding causes infinite loops). Set `enable_thinking=False` in `apply_chat_template` to skip the `<think>` block for short responses.
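For example, to get a short answer without the reasoning trace, reusing the objects from the snippet above:

```python
# Skip the <think> block for a short chat turn (see note above).
prompt = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=False,
    enable_thinking=False,
)
response = generate(model, tokenizer, prompt=prompt, max_tokens=512, sampler=sampler)
print(response)
```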
Disclaimer
This model is provided for research and educational purposes. The creators are not responsible for any misuse. By downloading this model, you agree to use it responsibly and in compliance with applicable laws.
Created by Jinho Jang
Base model: MiniMaxAI/MiniMax-M2.7