Upload README.md with huggingface_hub
README.md
CHANGED
@@ -25,7 +25,7 @@ tags:
 </p>
 
 <h3 align="center">MiniMax M2.7 Small — 138B-A10B — JANGTQ (MLX)</h3>
 
-<p align="center"><b>This is now a ~138B-A10B MoE</b> (down from MiniMax M2's 230B base) — 40% routed-expert prune + 2-bit JANGTQ quantization. Distilled from MiniMax M2 via REAP saliency + JANGTQ2 codebook quantization — routed experts at 2-bit via Lloyd-Max codebooks + Hadamard rotation, attention / embed / lm_head / dense MLP at 8-bit affine, norms and router at 16-bit.</p>
+<p align="center"><b>This is now a ~138B-A10B MoE — 38 GB on disk</b> (down from MiniMax M2's ~460 GB / 230B base) — 40% routed-expert prune + 2-bit JANGTQ quantization. Distilled from MiniMax M2 via REAP saliency + JANGTQ2 codebook quantization — routed experts at 2-bit via Lloyd-Max codebooks + Hadamard rotation, attention / embed / lm_head / dense MLP at 8-bit affine, norms and router at 16-bit.</p>
 
 <p align="center">
 <a href="https://osaurus.ai"><img src="https://img.shields.io/badge/Web-osaurus.ai-blue" alt="Website"></a>
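The 2-bit routed-expert recipe named in the changed paragraph (Lloyd-Max codebooks applied after a Hadamard rotation) can be sketched in NumPy. This is a toy illustration only, not the actual JANGTQ2 implementation: the matrix size, iteration count, and quantile initialization below are assumptions, and real pipelines quantize per-group with packed indices.

```python
import numpy as np

def hadamard(n):
    """Build an orthonormal n x n Hadamard matrix (n must be a power of two)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(n)  # orthonormal, so the rotation is exactly invertible

def lloyd_max(x, bits=2, iters=50):
    """Fit a 2**bits-level scalar Lloyd-Max codebook (1-D k-means) to x."""
    k = 2 ** bits
    code = np.quantile(x, (np.arange(k) + 0.5) / k)  # quantile initialization
    for _ in range(iters):
        # assign each value to its nearest codeword
        idx = np.argmin(np.abs(x[:, None] - code[None, :]), axis=1)
        # move each codeword to the centroid of its cell
        for j in range(k):
            if np.any(idx == j):
                code[j] = x[idx == j].mean()
    return code, idx

rng = np.random.default_rng(0)
W = rng.normal(size=(128, 128))     # stand-in for one routed-expert weight matrix
H = hadamard(128)
W_rot = W @ H                       # rotation spreads outliers before quantizing
code, idx = lloyd_max(W_rot.ravel(), bits=2)
W_hat = code[idx].reshape(W.shape) @ H.T     # dequantize, then undo the rotation
err = np.mean((W - W_hat) ** 2) / np.var(W)  # relative reconstruction error
```

Each weight is stored as one of four codebook entries (2 bits per value), which is where the ~38 GB disk footprint claimed above comes from; attention, embeddings, and dense MLP layers stay at 8-bit affine per the paragraph.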