---
license: apache-2.0
library_name: transformers
base_model:
- mistralai/Mistral-Small-3.2-24B-Instruct-2506
- mistralai/Magistral-Small-2509
tags:
- merge
- mergekit
- slerp
- mistral
- reasoning
- 24b
language:
- en
- fr
- de
- es
- it
- pt
- zh
- ja
- ko
- ar
---

# Taipei 2

A 50/50 SLERP merge of `Mistral-Small-3.2-24B-Instruct-2506` and `Magistral-Small-2509`, two 24B Mistral-3-architecture models that share the same base. This merge produced our best model to date, Taipei 3.1.


The goal: combine the conversational polish, tool-calling reliability, and low-latency response style of Mistral Small 3.2 with the explicit reasoning capability (SFT + RL on Magistral Medium traces) of Magistral Small 1.2. The merged model retains Magistral's `[THINK]`/`[/THINK]` reasoning tokens via `tokenizer_source: union`, so it can operate in either fast-response or deep-reasoning mode depending on the system prompt.
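For example, the two modes can be selected purely through the system message. The prompt texts below are illustrative placeholders, not the official Mistral or Magistral wording; substitute the templates from the parent model cards:

```python
# Sketch: toggling the merged model between its two modes via the system
# prompt. Both prompt strings here are ASSUMPTIONS for illustration, not the
# official templates shipped with either parent model.

REASONING_SYSTEM = (
    "First draft your reasoning inside [THINK] ... [/THINK] tags, "
    "then give the final answer."
)
FAST_SYSTEM = "You are a helpful assistant. Answer concisely."

def build_messages(user_prompt: str, reasoning: bool) -> list[dict]:
    """Assemble a chat message list for either fast or deep-reasoning mode."""
    system = REASONING_SYSTEM if reasoning else FAST_SYSTEM
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_prompt},
    ]

deep = build_messages("Prove that sqrt(2) is irrational.", reasoning=True)
fast = build_messages("What is the capital of France?", reasoning=False)
```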

## Use

Works with vLLM, transformers, and llama.cpp (after GGUF conversion). Use Magistral's system prompt format to enable reasoning traces; use a standard Mistral system prompt for fast chat.
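When serving through vLLM's OpenAI-compatible endpoint, a request body might be assembled like this. The served-model name and the sampling values are assumptions: 0.15 is the temperature Mistral recommends for Small-3.2-style fast chat, 0.7 the value recommended for Magistral-style reasoning.

```python
import json

# Sketch: building a request body for a vLLM OpenAI-compatible server
# (POST /v1/chat/completions). "taipei-2" is a hypothetical name you would
# pass to `vllm serve`; adjust sampling parameters to your needs.

def chat_request(messages: list[dict], reasoning: bool) -> dict:
    """Build the JSON body for a chat completion call in either mode."""
    return {
        "model": "taipei-2",  # hypothetical served-model name
        "messages": messages,
        "temperature": 0.7 if reasoning else 0.15,
        "max_tokens": 4096,
    }

body = chat_request(
    [{"role": "user", "content": "Summarize SLERP in one sentence."}],
    reasoning=False,
)
payload = json.dumps(body)  # ready to send with any HTTP client
```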

## Tokenizer

This repo ships Mistral's canonical `tekken.json` rather than a serialized HF `tokenizer.json`; transformers' `AutoTokenizer.from_pretrained` auto-converts it on load. For best fidelity in production, use [`mistral-common`](https://github.com/mistralai/mistral-common) or vLLM, which read tekken directly. The `[THINK]` / `[/THINK]` reasoning tokens are preserved (ranks 34 / 35).
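Since the reasoning tags survive the merge, downstream code often wants to separate the trace from the final answer. A minimal post-processing sketch, assuming the tags appear at most once and in order (as Magistral-style outputs normally emit them):

```python
# Sketch: split a [THINK] reasoning trace from the final answer in raw
# model output. Assumption: at most one [THINK]...[/THINK] span per output.

def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning_trace, final_answer); trace is '' if tags absent."""
    open_tag, close_tag = "[THINK]", "[/THINK]"
    start = text.find(open_tag)
    end = text.find(close_tag)
    if start == -1 or end == -1 or end < start:
        return "", text.strip()
    trace = text[start + len(open_tag):end].strip()
    answer = (text[:start] + text[end + len(close_tag):]).strip()
    return trace, answer

trace, answer = split_reasoning("[THINK]2+2=4[/THINK]The answer is 4.")
```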

## Merge config

```yaml
merge_method: slerp
base_model: mistralai/Mistral-Small-3.2-24B-Instruct-2506
slices:
  - sources:
      - model: mistralai/Mistral-Small-3.2-24B-Instruct-2506
        layer_range: [0, 40]
      - model: mistralai/Magistral-Small-2509
        layer_range: [0, 40]
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
embed_slerp: true
dtype: bfloat16
tokenizer_source: union
```
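SLERP interpolates each tensor along the great-circle arc between the two parents rather than averaging linearly, which tends to preserve weight geometry better. A minimal sketch of the operation on plain Python lists (mergekit's actual implementation operates on full tensors and handles further edge cases):

```python
import math

# Sketch of spherical linear interpolation (SLERP) between two weight
# vectors a and b at interpolation fraction t. t=0 returns a (the base
# model), t=1 returns b, t=0.5 the midpoint on the arc (the default above).

def slerp(a: list[float], b: list[float], t: float) -> list[float]:
    """Interpolate along the arc between vectors a and b at fraction t."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    cos_theta = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    theta = math.acos(cos_theta)
    if theta < 1e-6:  # nearly parallel vectors: fall back to LERP
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    s = math.sin(theta)
    wa = math.sin((1 - t) * theta) / s
    wb = math.sin(t * theta) / s
    return [wa * x + wb * y for x, y in zip(a, b)]

mid = slerp([1.0, 0.0], [0.0, 1.0], 0.5)  # midpoint on the unit circle
```

The per-filter `t` gradients in the config above vary this fraction across layer groups, so attention and MLP blocks lean toward different parents at different depths.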

Part of the Triplet Taipei model series.