Text Generation
MLX
Safetensors
mimo_v2_flash
jang
jang-quantized
JANG_2S
mixed-precision
apple-silicon
conversational
custom_code
Instructions to use bearzi/MiMo-V2-Flash-JANG_2S with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- MLX
How to use bearzi/MiMo-V2-Flash-JANG_2S with MLX:
```python
# Make sure mlx-lm is installed
# pip install --upgrade mlx-lm

# Generate text with mlx-lm
from mlx_lm import load, generate

model, tokenizer = load("bearzi/MiMo-V2-Flash-JANG_2S")

prompt = "Write a story about Einstein"
messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True
)

text = generate(model, tokenizer, prompt=prompt, verbose=True)
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- LM Studio
- Pi
How to use bearzi/MiMo-V2-Flash-JANG_2S with Pi:
Start the MLX server
```bash
# Install MLX LM:
uv tool install mlx-lm

# Start a local OpenAI-compatible server:
mlx_lm.server --model "bearzi/MiMo-V2-Flash-JANG_2S"
```
Configure the model in Pi
```bash
# Install Pi:
npm install -g @mariozechner/pi-coding-agent
```

Add to `~/.pi/agent/models.json`:

```json
{
  "providers": {
    "mlx-lm": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        { "id": "bearzi/MiMo-V2-Flash-JANG_2S" }
      ]
    }
  }
}
```

Run Pi
```bash
# Start Pi in your project directory:
pi
```
- Hermes Agent
How to use bearzi/MiMo-V2-Flash-JANG_2S with Hermes Agent:
Start the MLX server
```bash
# Install MLX LM:
uv tool install mlx-lm

# Start a local OpenAI-compatible server:
mlx_lm.server --model "bearzi/MiMo-V2-Flash-JANG_2S"
```
Configure Hermes
```bash
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup

# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default bearzi/MiMo-V2-Flash-JANG_2S
```
Run Hermes
hermes
- MLX LM
How to use bearzi/MiMo-V2-Flash-JANG_2S with MLX LM:
Generate or start a chat session
```bash
# Install MLX LM
uv tool install mlx-lm

# Interactive chat REPL
mlx_lm.chat --model "bearzi/MiMo-V2-Flash-JANG_2S"
```
Run an OpenAI-compatible server
```bash
# Install MLX LM
uv tool install mlx-lm

# Start the server
mlx_lm.server --model "bearzi/MiMo-V2-Flash-JANG_2S"

# Call the OpenAI-compatible server with curl (default port 8080,
# matching the Pi and Hermes configurations above)
curl -X POST "http://localhost:8080/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "bearzi/MiMo-V2-Flash-JANG_2S",
    "messages": [
      {"role": "user", "content": "Hello"}
    ]
  }'
```
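The same server can also be called from Python with the official `openai` client; a minimal sketch, assuming `pip install openai` and the server listening on port 8080 as above:

```python
# Minimal sketch: call the local mlx_lm OpenAI-compatible server.
# Assumes `pip install openai` and mlx_lm.server running on port 8080.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")

response = client.chat.completions.create(
    model="bearzi/MiMo-V2-Flash-JANG_2S",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
```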
Upload MiMo-V2-Flash-JANG_2S
This view is limited to 50 files because it contains too many changes.
- .gitattributes +1 -0
- README.md +51 -0
- added_tokens.json +28 -0
- config.json +156 -0
- configuration_mimo_v2_flash.py +109 -0
- jang_config.json +49 -0
- merges.txt +0 -0
- model-00001-of-00144.safetensors +3 -0
- model-00002-of-00144.safetensors +3 -0
- model-00003-of-00144.safetensors +3 -0
- model-00004-of-00144.safetensors +3 -0
- model-00005-of-00144.safetensors +3 -0
- model-00006-of-00144.safetensors +3 -0
- model-00007-of-00144.safetensors +3 -0
- model-00008-of-00144.safetensors +3 -0
- model-00009-of-00144.safetensors +3 -0
- model-00010-of-00144.safetensors +3 -0
- model-00011-of-00144.safetensors +3 -0
- model-00012-of-00144.safetensors +3 -0
- model-00013-of-00144.safetensors +3 -0
- model-00014-of-00144.safetensors +3 -0
- model-00015-of-00144.safetensors +3 -0
- model-00016-of-00144.safetensors +3 -0
- model-00017-of-00144.safetensors +3 -0
- model-00018-of-00144.safetensors +3 -0
- model-00019-of-00144.safetensors +3 -0
- model-00020-of-00144.safetensors +3 -0
- model-00021-of-00144.safetensors +3 -0
- model-00022-of-00144.safetensors +3 -0
- model-00023-of-00144.safetensors +3 -0
- model-00024-of-00144.safetensors +3 -0
- model-00025-of-00144.safetensors +3 -0
- model-00026-of-00144.safetensors +3 -0
- model-00027-of-00144.safetensors +3 -0
- model-00028-of-00144.safetensors +3 -0
- model-00029-of-00144.safetensors +3 -0
- model-00030-of-00144.safetensors +3 -0
- model-00031-of-00144.safetensors +3 -0
- model-00032-of-00144.safetensors +3 -0
- model-00033-of-00144.safetensors +3 -0
- model-00034-of-00144.safetensors +3 -0
- model-00035-of-00144.safetensors +3 -0
- model-00036-of-00144.safetensors +3 -0
- model-00037-of-00144.safetensors +3 -0
- model-00038-of-00144.safetensors +3 -0
- model-00039-of-00144.safetensors +3 -0
- model-00040-of-00144.safetensors +3 -0
- model-00041-of-00144.safetensors +3 -0
- model-00042-of-00144.safetensors +3 -0
- model-00043-of-00144.safetensors +3 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+tokenizer.json filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,51 @@
---
base_model: XiaomiMiMo/MiMo-V2-Flash
library_name: mlx
pipeline_tag: text-generation
license: apache-2.0
tags:
- mlx
- jang
- jang-quantized
- JANG_2S
- mixed-precision
- apple-silicon
---

# MiMo-V2-Flash-JANG_2S

JANG adaptive mixed-precision MLX quantization produced via [vmlx / jang-tools](https://github.com/jjang-ai/jangq).

- **Quantization:** 3.06b avg, profile JANG_2S, method mse, calibration weights
- **Profile:** JANG_2S
- **Format:** JANG v2 MLX safetensors
- **Compatible with:** vmlx, MLX Studio, oMLX (with JANG patch)

## Usage

### vmlx (recommended)

```bash
pip install 'vmlx[jang]'
vmlx serve bearzi/MiMo-V2-Flash-JANG_2S
```

### Python

```python
from jang_tools.loader import load_jang_model
from mlx_lm import generate

model, tokenizer = load_jang_model("bearzi/MiMo-V2-Flash-JANG_2S")
messages = [{"role": "user", "content": "Hello"}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
print(generate(model, tokenizer, prompt=prompt, max_tokens=512, verbose=True))
```

## About JANG

JANG (Jang Adaptive N-bit Grading) assigns different bit widths to different layer types — attention layers get more bits, MLP/expert layers compress harder. This preserves model coherence at aggressive compression levels where uniform quantization breaks down.

See [JANG documentation](https://github.com/jjang-ai/jangq) and scores at [jangq.ai](https://jangq.ai).

Comparative benchmarks and feedback welcome — please open a discussion.
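As an aside on the "About JANG" section of the README above, here is a minimal illustrative sketch of what a per-layer-type bit assignment can look like; the layer names, thresholds, and bit choices are hypothetical and are not the actual jang-tools implementation:

```python
# Hypothetical illustration of JANG-style adaptive bit assignment:
# attention projections keep more bits, MoE expert weights compress hardest.
# Not the jang-tools code, just a sketch of the idea described in the README.
def assign_bits(layer_name: str) -> int:
    if "self_attn" in layer_name:   # attention layers: higher precision
        return 6
    if "experts" in layer_name:     # MoE expert weights: most aggressive
        return 2
    if "mlp" in layer_name:         # dense MLP layers: middle ground
        return 3
    return 4                        # embeddings, router, everything else

for name in [
    "layers.0.self_attn.q_proj",
    "layers.1.mlp.experts.17.down_proj",
    "layers.1.mlp.gate_proj",
    "embed_tokens",
]:
    print(f"{name}: {assign_bits(name)}-bit")
```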
added_tokens.json ADDED
@@ -0,0 +1,28 @@
{
  "</think>": 151668,
  "</tool_call>": 151658,
  "</tool_response>": 151666,
  "<think>": 151667,
  "<tool_call>": 151657,
  "<tool_response>": 151665,
  "<|box_end|>": 151649,
  "<|box_start|>": 151648,
  "<|endoftext|>": 151643,
  "<|file_sep|>": 151664,
  "<|fim_middle|>": 151660,
  "<|fim_pad|>": 151662,
  "<|fim_prefix|>": 151659,
  "<|fim_suffix|>": 151661,
  "<|im_end|>": 151645,
  "<|im_start|>": 151644,
  "<|image_pad|>": 151655,
  "<|object_ref_end|>": 151647,
  "<|object_ref_start|>": 151646,
  "<|quad_end|>": 151651,
  "<|quad_start|>": 151650,
  "<|repo_name|>": 151663,
  "<|video_pad|>": 151656,
  "<|vision_end|>": 151653,
  "<|vision_pad|>": 151654,
  "<|vision_start|>": 151652
}
config.json ADDED
@@ -0,0 +1,156 @@
{
  "architectures": [
    "MiMoV2FlashForCausalLM"
  ],
  "auto_map": {
    "AutoConfig": "configuration_mimo_v2_flash.MiMoV2FlashConfig",
    "AutoModel": "modeling_mimo_v2_flash.MiMoV2FlashModel",
    "AutoModelForCausalLM": "modeling_mimo_v2_flash.MiMoV2FlashForCausalLM"
  },
  "attention_dropout": 0.0,
  "attention_value_scale": 0.707,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 16384,
  "max_position_embeddings": 262144,
  "model_type": "mimo_v2_flash",
  "num_attention_heads": 64,
  "head_dim": 192,
  "num_hidden_layers": 48,
  "num_key_value_heads": 4,
  "layernorm_epsilon": 1e-05,
  "rope_theta": 5000000,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.40.1",
  "use_cache": true,
  "vocab_size": 152576,
  "partial_rotary_factor": 0.334,
  "sliding_window": 128,
  "swa_rope_theta": 10000,
  "attention_bias": false,
  "v_head_dim": 128,
  "hybrid_layer_pattern": [0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0],
  "add_swa_attention_sink_bias": true,
  "add_full_attention_sink_bias": false,
  "sliding_window_size": 128,
  "attention_chunk_size": 128,
  "moe_layer_freq": [0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
  "moe_intermediate_size": 2048,
  "n_routed_experts": 256,
  "n_shared_experts": null,
  "num_experts_per_tok": 8,
  "norm_topk_prob": true,
  "scoring_func": "sigmoid",
  "n_group": 1,
  "topk_group": 1,
  "topk_method": "noaux_tc",
  "routed_scaling_factor": null,
  "swa_num_attention_heads": 64,
  "swa_num_key_value_heads": 8,
  "swa_head_dim": 192,
  "swa_v_head_dim": 128,
  "quantization": {
    "group_size": 128,
    "bits": 2
  }
}
configuration_mimo_v2_flash.py ADDED
@@ -0,0 +1,109 @@
# coding=utf-8
#
# Copyright 2025 Xiaomi Corporation.
# Copyright 2025 The HuggingFace Inc. team.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from transformers.configuration_utils import PretrainedConfig
from transformers.modeling_rope_utils import rope_config_validation
from transformers.utils import logging


logger = logging.get_logger(__name__)


class MiMoV2FlashConfig(PretrainedConfig):

    model_type = ""
    keys_to_ignore_at_inference = ["past_key_values"]

    # Default tensor parallel plan for base model `Hybrid`
    base_model_tp_plan = {
        "layers.*.self_attn.q_proj": "colwise",
        "layers.*.self_attn.k_proj": "colwise",
        "layers.*.self_attn.v_proj": "colwise",
        "layers.*.self_attn.o_proj": "rowwise",
        "layers.*.mlp.gate_proj": "colwise",
        "layers.*.mlp.up_proj": "colwise",
        "layers.*.mlp.down_proj": "rowwise",
    }
    base_model_pp_plan = {
        "embed_tokens": (["input_ids"], ["inputs_embeds"]),
        "layers": (["hidden_states", "attention_mask"], ["hidden_states"]),
        "norm": (["hidden_states"], ["hidden_states"]),
    }

    attribute_map = {
        "num_local_experts": "n_routed_experts",
    }

    def __init__(
        self,
        vocab_size=151936,
        hidden_size=4096,
        intermediate_size=22016,
        num_hidden_layers=32,
        num_attention_heads=32,
        num_key_value_heads=32,
        hidden_act="silu",
        max_position_embeddings=32768,
        initializer_range=0.02,
        layernorm_epsilon=1e-6,
        use_cache=True,
        tie_word_embeddings=False,
        rope_theta=10000.0,
        rope_scaling=None,
        attention_dropout=0.0,
        hybrid_block_size=None,
        hybrid_layer_pattern=None,
        partial_rotary_factor=1.0,
        **kwargs,
    ):
        self.vocab_size = vocab_size
        self.max_position_embeddings = max_position_embeddings
        self.hidden_size = hidden_size
        self.intermediate_size = intermediate_size
        self.num_hidden_layers = num_hidden_layers
        self.num_attention_heads = num_attention_heads

        # for backward compatibility
        if num_key_value_heads is None:
            num_key_value_heads = num_attention_heads

        self.num_key_value_heads = num_key_value_heads
        self.hidden_act = hidden_act
        self.initializer_range = initializer_range
        self.layernorm_epsilon = layernorm_epsilon
        self.use_cache = use_cache
        self.rope_theta = rope_theta
        self.rope_scaling = rope_scaling
        self.attention_dropout = attention_dropout

        if hybrid_block_size is not None and hybrid_layer_pattern is None:
            hybrid_layer_pattern = [0 if ((i + 1) % hybrid_block_size == 0) else 1 for i in range(num_hidden_layers)]
        self.hybrid_block_size = hybrid_block_size
        self.hybrid_layer_pattern = hybrid_layer_pattern

        self.partial_rotary_factor = partial_rotary_factor

        # Validate the correctness of rotary position embeddings parameters
        # BC: if there is a 'type' field, move it to 'rope_type'.
        if self.rope_scaling is not None and "type" in self.rope_scaling:
            self.rope_scaling["rope_type"] = self.rope_scaling["type"]
        rope_config_validation(self)

        super().__init__(
            tie_word_embeddings=tie_word_embeddings,
            **kwargs,
        )
jang_config.json ADDED
@@ -0,0 +1,49 @@
{
  "quantization": {
    "method": "jang-importance",
    "profile": "JANG_2S",
    "target_bits": 2,
    "actual_bits": 3.06,
    "block_size": 128,
    "calibration_method": "weights",
    "quantization_method": "mse",
    "scoring_method": "weight-magnitude",
    "bit_widths_used": [2, 3, 4, 6],
    "quantization_scheme": "asymmetric",
    "quantization_backend": "mx.quantize",
    "hadamard_rotation": false
  },
  "source_model": {
    "name": "MiMo-V2-Flash",
    "dtype": "bfloat16",
    "parameters": "13.5B"
  },
  "architecture": {
    "type": "moe",
    "attention": "gqa",
    "has_vision": false,
    "has_ssm": false,
    "has_moe": true
  },
  "runtime": {
    "total_weight_bytes": 171442176,
    "total_weight_gb": 0.16
  },
  "capabilities": {
    "reasoning_parser": "deepseek_r1",
    "tool_parser": "qwen",
    "think_in_template": true,
    "supports_tools": true,
    "supports_thinking": true,
    "family": "mimo_v2_flash",
    "modality": "text",
    "cache_type": "kv"
  },
  "format": "jang",
  "format_version": "2.0"
}
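The jang_config.json above carries the quantization metadata for the repo; a small sketch for inspecting it programmatically before loading, assuming `huggingface_hub` is installed:

```python
# Sketch: download and inspect the JANG quantization metadata.
# Assumes `pip install huggingface_hub`.
import json
from huggingface_hub import hf_hub_download

path = hf_hub_download("bearzi/MiMo-V2-Flash-JANG_2S", "jang_config.json")
with open(path) as f:
    cfg = json.load(f)

q = cfg["quantization"]
print(f"profile={q['profile']}, avg bits={q['actual_bits']}, widths={q['bit_widths_used']}")
```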
merges.txt ADDED
The diff for this file is too large to render.
Each model shard below was added as a three-line Git LFS pointer file (`version https://git-lfs.github.com/spec/v1`, `oid sha256:...`, `size ...`). The 43 shards visible in this truncated view:

| File (ADDED) | oid sha256 | size (bytes) |
|---|---|---|
| model-00001-of-00144.safetensors | acd355415b679211d7fc9973cc5e32e46704d2a57f17eae6190cc150cbbc21cd | 1515230624 |
| model-00002-of-00144.safetensors | c246d29f8d987b278ff5fc11fe94962cf901534970404623008902e66464a9ed | 603980184 |
| model-00003-of-00144.safetensors | 2b22a4c5ecb5b760d4b77056574f78bc16985fee930d2b2fef3bf8230bdc7268 | 872415648 |
| model-00004-of-00144.safetensors | e82d3deab97b0617cdbc678553eab49e5938fa023594f10ac16d73e13c72f590 | 1210484488 |
| model-00005-of-00144.safetensors | 0fd63445d471ad163773712b4321c07f6410e82514f9061843768aadfde94016 | 603980184 |
| model-00006-of-00144.safetensors | ac31479ea1232d8f3f417a998873d7d6547fdc6aa3f500974971daf26344bad5 | 872415648 |
| model-00007-of-00144.safetensors | cc8f5b54da981ec34d5281323b41c34e677cb1dfedd92989c579732600ba6643 | 1214580496 |
| model-00008-of-00144.safetensors | 744e9e4dd318f15390995525b1a4595ae643bf0658a7e22e9e47e5a0a3182a11 | 603980184 |
| model-00009-of-00144.safetensors | c7360f4629948aa08a3f67d9a9ec79321f464f8710e4e43f9dd21c5a5873aa0b | 872415648 |
| model-00010-of-00144.safetensors | 6586cc0aa868620cc83b7a9990dce2c516496abd2c2c191149e0713266c7b89a | 1214580496 |
| model-00011-of-00144.safetensors | 1a004cc631fdfe4d46bb8472cf7d447965c13c1188afd89758ae3a2e8329dca4 | 603980184 |
| model-00012-of-00144.safetensors | f41572d2d067113c55e54bb210fdbb30936c0a818497ca671137360131cb878c | 872415648 |
| model-00013-of-00144.safetensors | 2230834e95bca8b491a721929a8fe63e6aca9121e82ff30d7419f0100d380fd1 | 1214580496 |
| model-00014-of-00144.safetensors | 657b69928802ef0009ce42b05de37ac771327196a4745cd73b37407c89c409b4 | 603980184 |
| model-00015-of-00144.safetensors | 402de73971cdd75864f7a437eb0c2d2d201f05480b9d176eb1dd221e69eaf505 | 872415648 |
| model-00016-of-00144.safetensors | 38c73412bb20c23d69b35583530c841dad2243332d4f6e8f67952e21ee4c035f | 1214580496 |
| model-00017-of-00144.safetensors | d8e5c8dd528533ea5ab8cfdef0163dc8b8b3837a05c436e01d73633e2c89e3c1 | 603980184 |
| model-00018-of-00144.safetensors | d33e1b744f721deb3265253cbe4914f9b50452ce003623ce175b9f01f38206eb | 872415648 |
| model-00019-of-00144.safetensors | a5c86496ddd4dfb9fcd164c432d87ae1d5130e5e2eac09042bb4acffacc45cda | 1214580496 |
| model-00020-of-00144.safetensors | 95050450049087c76b2cafe976cd4be7616505902d5da51d8529759b5e4aa6cc | 603980184 |
| model-00021-of-00144.safetensors | 6a092306c8a35cb49cdde6dda2e4403d112e8503a8e62474e25002f91dc28f2d | 872415648 |
| model-00022-of-00144.safetensors | 56a34f2be771da6ccfdbc82719d330ee1db803872dcbf7a9bb65e0f690f4ca92 | 1210484488 |
| model-00023-of-00144.safetensors | a49859979bc08667f40e94f97f784cd9da66c8b0eb1e8944195df292a01185c5 | 603980184 |
| model-00024-of-00144.safetensors | d34ed72cabcedd1d773139d978e38dbe5642f264f8ae93ad5238406a6b447da5 | 872415648 |
| model-00025-of-00144.safetensors | 67d39bf9600a2b47a8d2fabdc33533a600ac8020136cbbc1523a74b8ec25ff4c | 1214580496 |
| model-00026-of-00144.safetensors | 730bd03f3dcab4c40af8b96ea0870c7b02412ad3dd2069ec99e99b0be8b63793 | 603980184 |
| model-00027-of-00144.safetensors | 80a30bffdfab290cd6ae850cc0fec8c133a151585f29a482f5b66bcf5fc9ed42 | 872415648 |
| model-00028-of-00144.safetensors | 1f2e358ea960413f35df5c0799c9ab67d79ae163567939e706fe742ab1975f47 | 1214580496 |
| model-00029-of-00144.safetensors | 29d041834844f29a8814881a3e3ceb06459bb32a3c83b01612fdf6c6f9178700 | 603980184 |
| model-00030-of-00144.safetensors | e1f0ca4653fdf1c203b76b0bccc70d7c4be7dfee7e5359790ff5ad5e8f722649 | 872415648 |
| model-00031-of-00144.safetensors | 84fcf9518d9b5693113ec2e41163a670b42efbeb80d419a7e381ff24a6c98ecf | 1140851104 |
| model-00032-of-00144.safetensors | 0b577eb313f87b4ad94e0a1f37f1640552b0847028e66a47b51a35da8f360e3d | 603980176 |
| model-00033-of-00144.safetensors | c7b70e297f03c70ef17972ed846a1a399f526a196a6cfa48f235052567f59fde | 872415640 |
| model-00034-of-00144.safetensors | b8aaf4636df491a06aade56ef9bf57793a86f35916f96fa1fe228646f00a1089 | 1288309848 |
| model-00035-of-00144.safetensors | 7b5ce48614a24ff59f0a1e4042cb63a52232a53ff6c083c8dbb0a2f22259b46f | 603980184 |
| model-00036-of-00144.safetensors | e854c3ff5c66933f5e33a6505aa283eef9dd8421f4a29375b9baedbe245f687d | 872415648 |
| model-00037-of-00144.safetensors | c6ad2708b6ed8c26fe93290ddb1b613226e104ee11db9ad6aa6f44780499d9b4 | 1214580496 |
| model-00038-of-00144.safetensors | e4415e4eb76238e0cbd5bc5b7388efd984a48ab1913e4697b81d0293b85dfadd | 603980184 |
| model-00039-of-00144.safetensors | 7161a73cb820ede067060cffe802bcf5f802be32c7b4727be02a38e9d2c87546 | 872415648 |
| model-00040-of-00144.safetensors | 518b79a2b069a6efd84addc8aa270970ce019fe358f4bceff8a45126f93b67e3 | 1214580496 |
| model-00041-of-00144.safetensors | 6da20e5df989480168abb2daad573cd00d723c13523bcc35ad7740ff335e7c08 | 603980184 |
| model-00042-of-00144.safetensors | ae44c68f49dcaa8e60c1865f8f55cad1e33f234e502afbbb8474067b6a08ed59 | 872415648 |
| model-00043-of-00144.safetensors | 368739da9d55fbb7be048a8891b08aec6a6f0639acf3ec890868e76643646bbe | 1210484488 |