Darwin-27B-KR — Korean Hybrid Vigor through Evolutionary FFN Breeding

Qwen3.5-27B Dense | 27B Params | Thinking Mode | 262K Context | 201 Languages | BF16 | Apache 2.0
The child outperforms both parents on Korean cultural intelligence — Hybrid Vigor confirmed at 27B scale

What Is This?

Darwin-27B-KR is a second-generation Darwin model bred from two complementary parents:

Father (Darwin-27B-Opus): Qwen3.5-27B evolved with Claude 4.6 Opus reasoning FFN — strong in logical reasoning and deep inference
Mother (Qwen3.5-27B-KoSFT): Qwen3.5-27B fine-tuned with 230K+ Korean language samples — strong in Korean cultural knowledge and linguistic understanding (private, purpose-bred for Korean knowledge reinforcement)

The Darwin V6 engine automatically discovered that 93.3% of FFN layers should come from the Mother, while preserving 93.2% of the Father's Attention layers — confirming the core Darwin principle: FFN carries knowledge, Attention carries reasoning.

The result: the child outperforms both parents on every Korean benchmark category, a phenomenon known as Hybrid Vigor (잡종강세).

Hybrid Vigor: 4-Generation CLIcK Comparison

CLIcK (Cultural and Linguistic Intelligence in Korean) — 200 questions, 0-shot, loglikelihood evaluation.

Generation	Model	CLIcK (Overall)	Culture	Language
Gen 0 (Ancestor)	Qwen3.5-27B	69.52%	71.84%	64.66%
Gen 1 (Father)	Darwin-27B-Opus	70.19%	72.91%	64.47%
— (Mother)	Qwen3.5-27B-KoSFT	74.74%	76.95%	70.11%
Gen 2 (Child)	Darwin-27B-KR	75.59% ★	77.85% ★	70.86% ★

The child surpasses both parents. Two generations of zero-training evolution achieved +6.07%p over the original Qwen3.5-27B.

Detailed Category Breakdown

Category	Ancestor	Father	Mother	Child	Best
Economy	93.22%	93.22%	94.92%	94.92%	Mother=Child
Geography	70.23%	70.23%	75.57%	75.57%	Mother=Child
History	47.00%	47.00%	50.50%	53.50%	Child ★
K-pop	92.68%	97.56%	90.24%	92.68%	Father
Law	59.50%	60.00%	67.50%	69.50%	Child ★
Politics	80.95%	82.14%	86.90%	85.71%	Mother
Society	87.00%	89.00%	90.50%	90.00%	Mother
Tradition	81.50%	82.50%	88.00%	88.50%	Child ★
Functional	68.18%	67.42%	71.21%	75.00%	Child ★
Grammar	44.50%	44.50%	55.00%	53.00%	Mother
Text	82.50%	82.50%	84.50%	86.00%	Child ★

Child wins 7 out of 11 categories. The largest gains are in Law (+9.5%p over Father), Functional Language (+7.6%p), and History (+6.5%p).

Why This Matters

1. Hybrid Vigor at 27B Scale

Previously demonstrated at 4B (Darwin-4B-Genesis, CLIcK 92%). Now confirmed at 27B: the child exceeds both parents on Korean cultural and linguistic intelligence with zero additional training.

2. CMA-ES Discovered the Optimal Breeding Strategy

The evolutionary optimizer automatically determined:

FFN ratio: 93.3% → Almost entirely Mother's Korean knowledge
Attention ratio: 6.8% → Almost entirely Father's reasoning chains
This independently confirms our finding: "FFN = knowledge (safe to swap), Attention = reasoning (must preserve)"

3. Ancestral Knowledge Tracking

By evaluating all four generations (Ancestor → Father → Mother → Child), we can trace how knowledge flows through evolutionary breeding:

Father inherits Claude's reasoning but loses some Korean knowledge
Mother gains Korean knowledge through SFT
Child combines both — inheriting the best of each lineage

4. Zero Training Cost

	This Model	Typical Fine-Tuning
GPU	H100 × 1	8-64 GPUs
Time	~2.5 hours	Days to weeks
Training data	0 tokens	Millions of tokens
Training compute	Fitness evaluation only	Full gradient updates

How It Works: Evolutionary FFN Breeding

Father: Darwin-27B-Opus (Claude reasoning FFN)
Mother: Qwen3.5-27B-KoSFT (Korean knowledge FFN)
Both:   hidden_size=4096, intermediate=17408, 64 layers
        = 100% structurally compatible

Method: CMA-ES optimizes per-block breeding ratios
        across 14 genome dimensions
Fitness: kmmlu_lite (Korean knowledge benchmark)
Result: Child inherits Mother's Korean FFN knowledge
        while preserving Father's reasoning Attention

Optimal Genome (Discovered by CMA-ES)

global_ratio:    0.4812    Overall 48:52 Father:Mother balance
attn_ratio:      0.0681    Attention 93.2% from Father (reasoning preserved!)
ffn_ratio:       0.9334    FFN 93.3% from Mother (Korean knowledge absorbed!)
embed_ratio:     0.3678    Embedding 63:37 Father:Mother
density_a:       0.9699    Father density (DARE sparsity)
density_b:       0.9767    Mother density (DARE sparsity)
mri_trust:       0.5333    MRI guidance weight

Block-Level Ratios

Block 0 (L0-10):   0.6041    Mother-leaning (early layers)
Block 1 (L11-21):  0.4107    Balanced
Block 2 (L22-32):  0.3975    Father-leaning (core reasoning)
Block 3 (L33-43):  0.6078    Mother-leaning (knowledge layers)
Block 4 (L44-54):  0.7820    Strong Mother (Korean knowledge peak)
Block 5 (L55-64):  0.3960    Father-leaning (output reasoning)

Key insight: CMA-ES applied the strongest Mother influence to Block 4 (L44-54), which corresponds to deep knowledge layers, while preserving Father's reasoning in Blocks 2 and 5.

Evolution Parameters

Setting	Value
Engine	Darwin V6 (Diagnostic-Guided Evolutionary Merge)
Merge method	DARE-TIES (direct PyTorch, no mergekit dependency)
Population size	16
Phase 1 (proxy search)	150 steps
Phase 2 (real merge)	25 steps, top 5 elite
Fitness function	kmmlu_lite (Korean knowledge)
Best fitness	0.8274 (82.74%)
MRI guidance	Enabled (static + probe analysis)
Total time	~2.5 hours (H100 ×1)

Family Tree

Qwen/Qwen3.5-27B (Ancestor, CLIcK 69.52%)
├── × Jackrong/Claude-4.6-Opus-Reasoning-Distilled
│   └── Darwin-27B-Opus (Father, Gen 1, CLIcK 70.19%)
│       │   + Claude reasoning FFN
│       │   + GPQA Diamond 74.7% greedy
│       │
│       └── × Qwen3.5-27B-KoSFT (Mother, CLIcK 74.74%)
│           │   + 230K Korean SFT samples
│           │   + K-AI Leaderboard caliber
│           │
│           └── ★ Darwin-27B-KR (Child, Gen 2, CLIcK 75.59%)
│                 Hybrid Vigor: surpasses BOTH parents!
│                 FFN 93.3% Mother + Attention 93.2% Father

DNA Composition

Qwen3.5-27B (foundation)              ~40%
Claude 4.6 Opus (reasoning patterns)  ~5% (via Father's Attention)
Korean SFT (cultural knowledge)       ~55% (via Mother's FFN)

Model Specifications


Architecture	Qwen3.5 Dense (GatedDeltaNet)
Parameters	27B
Hidden Size	4096
Intermediate Size	17408
Layers	64
Context Length	262,144 (extensible to 1M via YaRN)
Precision	BF16
Languages	201
Thinking	Enabled (chain-of-thought reasoning)
License	Apache 2.0

Usage

Transformers

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

tokenizer = AutoTokenizer.from_pretrained(
    "FINAL-Bench/Darwin-27B-KR", trust_remote_code=True
)
model = AutoModelForCausalLM.from_pretrained(
    "FINAL-Bench/Darwin-27B-KR",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "한국의 전통 혼례 절차에 대해 설명해주세요."}]
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=4096, do_sample=False)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))

VRAM Requirements

Setup	VRAM	Status
BF16 Full Precision	~55 GB	H100 single GPU
NVIDIA H100 80GB	80 GB	Very comfortable
2× RTX 4090 48GB	48 GB	Tensor parallel
4-bit Quantized	~16 GB	RTX 4090 single GPU

Darwin 27B Family

Model	Gen	Role	CLIcK	GPQA	Specialty
Qwen3.5-27B	Gen 0	Ancestor	69.52%	85.5%	Foundation
Darwin-27B-Opus	Gen 1	Father	70.19%	74.7%*	Claude reasoning
Qwen3.5-27B-KoSFT	—	Mother	74.74%	—	Korean knowledge
Darwin-27B-KR	Gen 2	Child	75.59% ★	—	Hybrid: Reasoning + Korean

*GPQA evaluated with greedy decoding; maj@8 retry in progress (estimated 88.9%)

Key Findings

FFN = Knowledge, Attention = Reasoning — CMA-ES independently discovered this by assigning 93.3% FFN from Mother (Korean) and 93.2% Attention from Father (reasoning)
Hybrid Vigor scales with model size — Confirmed at 4B (Genesis, CLIcK 92%) and now at 27B (KR, CLIcK 75.59%)
Zero-training evolution works recursively — Gen 0 → Gen 1 → Gen 2, each generation improving, with zero gradient updates
Ancestral knowledge is preserved — Despite two generations of breeding, core Qwen3.5-27B capabilities remain intact
Korean knowledge transfers through FFN — The Mother's 230K Korean SFT knowledge was successfully transplanted into the child via FFN breeding

Roadmap

Full GPQA Diamond evaluation (greedy + selective maj@8 retry)
K-AI Leaderboard official submission (KMMLU-Pro, CLIcK, HLE, MuSR, Com2)
MMLU-Pro evaluation and HF leaderboard registration
Cross-architecture breeding at 27B scale (Transformer × Mamba FFN)
Third-generation breeding with domain-specific mothers

References

DARE-TIES: Yadav et al., 2023 (https://arxiv.org/abs/2311.03099) — re-implemented, not library-dependent
CLIcK: Kim et al., 2024 (https://arxiv.org/abs/2403.06412) — Cultural and Linguistic Intelligence in Korean
Darwin V6 Engine: https://huggingface.co/spaces/ginigen-ai/DARWIN-V5-BACKUP
FINAL Bench: https://huggingface.co/spaces/FINAL-Bench/Leaderboard
Darwin Family Collection: https://huggingface.co/collections/FINAL-Bench/darwin-family

Built By


Developer	VIDRAFT
Engine	Darwin V6 (Diagnostic-Guided Evolutionary Merge)
Generation	Generation 2 — Korean Hybrid Vigor
Architecture	Qwen3.5-27B Dense
License	Apache 2.0

Citation

@misc{vidraft_darwin_27b_kr_2026,
  title        = {Darwin-27B-KR: Korean Hybrid Vigor through Evolutionary FFN Breeding},
  subtitle     = {Child Surpasses Both Parents on Korean Cultural Intelligence with Zero Training},
  author       = {VIDRAFT},
  year         = {2026},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/FINAL-Bench/Darwin-27B-KR}}
}

Downloads last month: -

Model tree for FINAL-Bench/Darwin-27B-KR

Base model

Qwen/Qwen3.5-27B

Finetuned

Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled

Finetuned

FINAL-Bench/Darwin-27B-Opus

Finetuned

(2)

this model

Papers for FINAL-Bench/Darwin-27B-KR

CLIcK: A Benchmark Dataset of Cultural and Linguistic Intelligence in Korean

Paper • 2403.06412 • Published Mar 11, 2024 • 3

Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch

Paper • 2311.03099 • Published Nov 6, 2023 • 31