metadata
license: apache-2.0
library_name: transformers
base_model:
  - mistralai/Mistral-Small-3.2-24B-Instruct-2506
  - mistralai/Magistral-Small-2509
tags:
  - merge
  - mergekit
  - slerp
  - mistral
  - reasoning
  - 24b
language:
  - en
  - fr
  - de
  - es
  - it
  - pt
  - zh
  - ja
  - ko
  - ar

Taipei 3.1

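A SLERP merge of Mistral-Small-3.2-24B-Instruct-2506 and Magistral-Small-2509, both 24B Mistral-3 architecture models sharing the same base. The default interpolation factor is t = 0.5 (50/50), with layer-wise gradients applied to the attention and MLP weights (see Merge config). The result is our best model so far, Taipei 3.1.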

The goal: combine the conversational polish, tool-calling reliability, and low-latency response style of Mistral Small 3.2 with the explicit reasoning capability (SFT + RL on Magistral Medium traces) of Magistral Small 1.2. The merged model retains the [THINK]/[/THINK] reasoning tokens from Magistral via tokenizer_source: union, so it can operate in either fast-response or deep-reasoning mode depending on the system prompt.

Use

Works with vLLM, transformers, and llama.cpp (after GGUF conversion). Use Magistral's system prompt format to enable reasoning traces; use a standard Mistral system prompt for fast chat.
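
When reasoning mode is enabled, the output contains a trace wrapped in [THINK] / [/THINK]. A minimal sketch of separating that trace from the final answer follows. It assumes the output has already been decoded to a string with the markers kept as literal text; the helper name split_reasoning is hypothetical and not part of any library.

```python
import re

# Assumption: the decoded output keeps the markers as literal "[THINK]" / "[/THINK]" text.
THINK_BLOCK = re.compile(r"\[THINK\](.*?)\[/THINK\]", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer) from a decoded model output string.

    If no complete [THINK]...[/THINK] block is found, the reasoning part is
    empty and the whole text is treated as the answer.
    """
    match = THINK_BLOCK.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Everything outside the first reasoning block is treated as the answer.
    answer = (text[: match.start()] + text[match.end():]).strip()
    return reasoning, answer


if __name__ == "__main__":
    reasoning, answer = split_reasoning("[THINK]2 + 2 is 4.[/THINK]The answer is 4.")
    print("reasoning:", reasoning)
    print("answer:", answer)
```

Fast-chat outputs without a reasoning block pass through unchanged, so the same helper can sit behind both modes.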

Tokenizer

This repo ships Mistral's canonical tekken.json rather than a serialized HF tokenizer.json. transformers' AutoTokenizer.from_pretrained auto-converts it on load. For best fidelity in production, use mistral-common or vLLM, which read tekken directly. The [THINK] / [/THINK] reasoning tokens are preserved (ranks 34 / 35).
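
To check the reasoning-token ranks locally, the sketch below reads tekken.json with the standard library only. It assumes the file has a top-level "special_tokens" list whose entries carry "token_str" and "rank" keys; that layout is an assumption about the tekken format and should be verified against the file shipped in this repo. The function names are hypothetical.

```python
import json
from pathlib import Path


def load_tekken(path: str) -> dict:
    """Parse a tekken.json file into a plain dict."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def special_token_ranks(tekken: dict, wanted: list[str]) -> dict[str, int]:
    """Map each wanted special token string to its rank, skipping missing ones.

    Assumption: tekken["special_tokens"] is a list of objects with
    "token_str" and "rank" fields.
    """
    ranks = {
        entry["token_str"]: entry["rank"]
        for entry in tekken.get("special_tokens", [])
    }
    return {token: ranks[token] for token in wanted if token in ranks}


if __name__ == "__main__":
    tekken = load_tekken("tekken.json")  # path relative to the downloaded repo
    print(special_token_ranks(tekken, ["[THINK]", "[/THINK]"]))
```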

Merge config

merge_method: slerp
base_model: mistralai/Mistral-Small-3.2-24B-Instruct-2506
slices:
  - sources:
      - model: mistralai/Mistral-Small-3.2-24B-Instruct-2506
        layer_range: [0, 40]
      - model: mistralai/Magistral-Small-2509
        layer_range: [0, 40]
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
embed_slerp: true
dtype: bfloat16
tokenizer_source: union
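
For intuition about what this config computes, here is a minimal pure-Python sketch of two pieces: expanding a gradient list such as [0, 0.5, 0.3, 0.7, 1] into a per-layer t, and spherical interpolation between two weight vectors. This is not mergekit's code. The linear interpolation between gradient anchors and the layer-position formula reflect our understanding of how mergekit treats list values and should be read as assumptions; gradient_value and slerp are hypothetical names.

```python
import math


def gradient_value(anchors: list[float], layer_idx: int, num_layers: int) -> float:
    """Linearly interpolate a list of anchor values across the layer stack."""
    # Relative depth of the layer, scaled to the anchor index range.
    pos = layer_idx / (num_layers - 1) * (len(anchors) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(anchors) - 1)
    frac = pos - lo
    return (1 - frac) * anchors[lo] + frac * anchors[hi]


def slerp(t: float, v0: list[float], v1: list[float], eps: float = 1e-8) -> list[float]:
    """Spherical linear interpolation: t=0 returns v0, t=1 returns v1."""
    n0 = math.sqrt(sum(x * x for x in v0))
    n1 = math.sqrt(sum(x * x for x in v1))
    if n0 < eps or n1 < eps:
        # Degenerate input: fall back to plain linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (n0 * n1)
    theta = math.acos(max(-1.0, min(1.0, cos_theta)))
    sin_theta = math.sin(theta)
    if abs(sin_theta) < eps:
        # Nearly parallel vectors: linear interpolation is numerically safer.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * theta) / sin_theta
    s1 = math.sin(t * theta) / sin_theta
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]


if __name__ == "__main__":
    attn_anchors = [0, 0.5, 0.3, 0.7, 1]  # self_attn gradient from the config
    for layer in (0, 10, 20, 30, 39):
        print(layer, round(gradient_value(attn_anchors, layer, 40), 3))
    print(slerp(0.5, [1.0, 0.0], [0.0, 1.0]))
```

Under these assumptions, t = 0 keeps the base model (Mistral Small 3.2) and t = 1 takes Magistral, so the attention weights lean toward the base model in early layers and toward Magistral in late layers, while the MLP weights follow the opposite schedule. All other tensors use the flat 0.5.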

Part of the Tripplet Taipei model series.