merged_model

This is a merge of pre-trained language models created using mergekit.

Merge Details

Merge Method

This model was merged with the Model Breadcrumbs with TIES method (merge_method: breadcrumbs_ties), using Qwen/Qwen3-0.6B as the base model.
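
To make the idea behind this method concrete, here is a minimal pure-Python sketch of the two steps it combines: Breadcrumbs-style sparsification of each task vector (fine-tuned weights minus base weights), followed by TIES-style sign election. It is an illustration on toy lists, not mergekit's implementation. The function names and the drop_small / drop_large parameters are hypothetical and are not claimed to correspond to any key in the configuration on this card.

```python
def breadcrumbs_sparsify(delta, drop_small, drop_large):
    """Zero the smallest- and largest-magnitude entries of one task vector.

    drop_small / drop_large are hypothetical fractions in [0, 1].
    """
    n = len(delta)
    order = sorted(range(n), key=lambda i: abs(delta[i]))  # ascending magnitude
    lo = round(n * drop_small)
    hi = n - round(n * drop_large)
    keep = set(order[lo:hi])
    return [v if i in keep else 0.0 for i, v in enumerate(delta)]


def ties_merge(base, deltas, weights, normalize=True):
    """Sign-elect and combine weighted task vectors, then add them to the base."""
    merged = []
    for j, b in enumerate(base):
        contribs = [w * d[j] for w, d in zip(weights, deltas)]
        total = sum(contribs)
        # Keep only non-zero contributions whose sign matches the elected sign.
        agree = [total != 0 and c != 0 and (c > 0) == (total > 0) for c in contribs]
        value = sum(c for c, a in zip(contribs, agree) if a)
        if normalize:
            # Rescale by the weights that actually contributed at this position.
            divisor = sum(w for w, a in zip(weights, agree) if a) or 1.0
            value /= divisor
        merged.append(b + value)
    return merged


# Toy task vector: one negligible entry (0.01) and one outlier (5.0).
delta = [0.01, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 5.0]
sparse = breadcrumbs_sparsify(delta, drop_small=0.1, drop_large=0.1)

# Two toy models that disagree in sign at position 1.
out = ties_merge([0.0, 0.0, 0.0], [[1.0, -1.0, 0.0], [1.0, 2.0, 0.0]], [0.5, 0.5])
```

In the toy run, the sparsification step discards both the negligible entry and the outlier, and at the position where the two models disagree in sign, only the model matching the elected sign contributes to the result.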

Models Merged

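The following models were included in the merge:

- suayptalha/Qwen3-0.6B-Code-Expert
- suayptalha/Qwen3-0.6B-Math-Expert
- Redhanuman/Shadow-0.7B
- yarin-shaked/Qwen3-Codeforces-GRPO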

Configuration

The following YAML configuration was used to produce this model:

models:
  - model: Qwen/Qwen3-0.6B  # Base
  - model: suayptalha/Qwen3-0.6B-Code-Expert
    parameters:
      weight: 0.3  # General coding
  - model: suayptalha/Qwen3-0.6B-Math-Expert
    parameters:
      weight: 0.3  # Math specialist
  - model: Redhanuman/Shadow-0.7B
    parameters:
      weight: 0.35  # Raised: reasoning/CoT specialist
  - model: yarin-shaked/Qwen3-Codeforces-GRPO
    parameters:
      weight: 0.25  # Competitive programming

merge_method: breadcrumbs_ties
base_model: Qwen/Qwen3-0.6B

parameters:
  density: 0.6          # TIES trim
  beta: 0.1             # Breadcrumbs outliers  
  alpha: 0.1            # Breadcrumbs negligible
  t_ie: false           # Skip TIES norm (specialists)
  normalize: true       # Normalize contributing weights
  int8_mask: true

dtype: bfloat16
random_seed: 0

tokenizer:
  source: union         # Safe vocab merge
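
The four per-model weights above sum to 1.2 rather than 1. The short sketch below works through the arithmetic, assuming that normalize: true rescales the combined delta by the sum of the contributing weights, as the mergekit documentation describes. The helper name effective_shares is hypothetical. The shares it prints apply only at parameters where all four models contribute after trimming and sign election; elsewhere the divisor is smaller.

```python
# Weights copied from the configuration above.
weights = {
    "suayptalha/Qwen3-0.6B-Code-Expert": 0.3,
    "suayptalha/Qwen3-0.6B-Math-Expert": 0.3,
    "Redhanuman/Shadow-0.7B": 0.35,
    "yarin-shaked/Qwen3-Codeforces-GRPO": 0.25,
}


def effective_shares(weights):
    """Divide each weight by the total so the shares sum to 1."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


total = sum(weights.values())
shares = effective_shares(weights)

print(f"sum of weights: {total:.2f}")
for name, share in shares.items():
    print(f"{name}: {share:.3f}")
```

Under that assumption, the relative ordering of the weights is what matters, not their absolute scale: Shadow-0.7B gets the largest share and Qwen3-Codeforces-GRPO the smallest.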