TokForge Mobile Draft Models (collection)

Small MNN draft models and speculative-decoding bundles for TokForge on Android. Includes practical Qwen3 0.6B drafts plus experimental variants.
Experimental Qwen3-0.6B draft model for TokForge + MNN, trained as a draft paired more explicitly with Qwen3-14B-style targets.
Most mobile draft work ends up optimized around 8B targets. This repo exists for the opposite question: what happens if a very small draft is trained more explicitly toward a 14B target lane?
Final logged training acceptance (alpha): 0.7236

This bundle is meant for TokForge / MNN, not standard HF Inference.
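As a rough sense of what that acceptance rate buys you: under the standard speculative-decoding analysis (Leviathan et al.), with a per-token acceptance rate alpha and draft length gamma, the expected number of tokens produced per target-model verification pass is (1 - alpha^(gamma+1)) / (1 - alpha). A minimal sketch, assuming the logged alpha holds at inference time and acceptances are i.i.d.:

```python
def expected_tokens_per_cycle(alpha: float, gamma: int) -> float:
    """Expected tokens emitted per target forward pass in speculative
    decoding, assuming an i.i.d. per-token acceptance rate `alpha`
    and `gamma` drafted tokens per cycle."""
    return (1 - alpha ** (gamma + 1)) / (1 - alpha)

# With the logged alpha and the d=3 draft length from the recipe below:
print(expected_tokens_per_cycle(0.7236, 3))  # ~2.63 tokens per target pass
```

Real on-device acceptance will differ from the training-time figure, so treat this as an upper-bound estimate, not a benchmark.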
Typical TokForge recipe:

```json
{
  "backend_type": "cpu",
  "thread_num": 4,
  "precision": "low",
  "memory": "low",
  "sampler_type": "greedy",
  "speculative_type": "draftmodel",
  "draft_predict_length": 3,
  "draft_config_path": "/path/to/config_cpu.json"
}
```
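If you edit the recipe by hand, a quick sanity check can catch typos before TokForge loads it. This is a hypothetical helper, not part of TokForge; the field names simply mirror the example above:

```python
import json

def check_config(path: str) -> dict:
    """Load a speculative-decoding recipe and sanity-check the fields
    used in the example above. Raises AssertionError on a bad config."""
    with open(path) as f:
        cfg = json.load(f)
    assert cfg.get("speculative_type") == "draftmodel", "draft model mode expected"
    # Very long draft lengths rarely help small drafts; 1..8 is a loose bound.
    assert 1 <= cfg.get("draft_predict_length", 0) <= 8
    assert cfg.get("draft_config_path", "").endswith(".json")
    return cfg
```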
This is a research / targeted pairing artifact:
- Use it for 14B-leaning experiments and 14B-leaning mobile tests, with Qwen3-14B as the target in TokForge.
- If you want a general-purpose mobile draft, start with the 20K baseline draft (the 8B draft lane) first.
- Suggested setup: CPU backend, d=3 (draft_predict_length = 3).

Bundle files: llm.mnn, llm.mnn.weight, llm_config.json, config.json, config_cpu.json

If you benchmark this on your own device, feel free to share results in Discord.
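Before pointing TokForge at a downloaded bundle, it can save a debugging round to confirm all five files made it onto the device. A small hypothetical helper (the file list comes from this bundle; the directory path is yours):

```python
from pathlib import Path

# The five files shipped in this bundle.
BUNDLE_FILES = [
    "llm.mnn",
    "llm.mnn.weight",
    "llm_config.json",
    "config.json",
    "config_cpu.json",
]

def missing_files(bundle_dir: str) -> list[str]:
    """Return the names of any expected bundle files absent from bundle_dir."""
    root = Path(bundle_dir)
    return [name for name in BUNDLE_FILES if not (root / name).exists()]
```

An empty return value means the bundle is complete; anything else names what to re-download.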