Replace placeholder index.html with TIDE org card content
index.html  (CHANGED, +145 -17)

@@ -1,19 +1,147 @@
 <!doctype html>
-<html>
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+<html lang="en">
+<head>
+  <meta charset="utf-8" />
+  <meta name="viewport" content="width=device-width,initial-scale=1" />
+  <title>TIDE-dllm – Turning the TIDE</title>
+  <style>
+    :root {
+      --tide-navy: #003D5B;
+      --tide-cyan: #00B4D8;
+      --bg: #ffffff;
+      --fg: #1f2328;
+      --muted: #6e7781;
+      --border: #d0d7de;
+      --row-stripe: #f6f8fa;
+      --code-bg: #f6f8fa;
+    }
+    @media (prefers-color-scheme: dark) {
+      :root {
+        --bg: #0d1117;
+        --fg: #e6edf3;
+        --muted: #8b949e;
+        --border: #30363d;
+        --row-stripe: #161b22;
+        --code-bg: #161b22;
+      }
+    }
+    html, body { margin: 0; padding: 0; background: var(--bg); color: var(--fg); }
+    body {
+      font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", "Helvetica Neue", Arial, "PingFang SC", sans-serif;
+      line-height: 1.55;
+      max-width: 880px;
+      margin: 0 auto;
+      padding: 2rem 1.25rem 3rem;
+    }
+    .logo { text-align: center; margin: 0.5rem 0 1rem; }
+    .logo img { max-width: 320px; width: 60%; height: auto; }
+    h1 { text-align: center; font-size: 1.6rem; line-height: 1.3; color: var(--tide-navy); margin: 0.25rem 0 0.5rem; }
+    @media (prefers-color-scheme: dark) { h1 { color: var(--tide-cyan); } }
+    .tagline { text-align: center; color: var(--muted); margin: 0 0 1rem; font-size: 0.95rem; }
+    .badges { text-align: center; margin: 0.5rem 0 1.5rem; }
+    .badges a {
+      display: inline-block;
+      padding: 0.35em 0.9em;
+      margin: 0.2em;
+      border-radius: 999px;
+      background: var(--tide-navy);
+      color: #fff;
+      text-decoration: none;
+      font-size: 0.88rem;
+      transition: background 0.15s;
+    }
+    .badges a:hover { background: var(--tide-cyan); }
+    h2 { color: var(--tide-navy); border-bottom: 1px solid var(--border); padding-bottom: 0.3em; margin-top: 2rem; font-size: 1.25rem; }
+    @media (prefers-color-scheme: dark) { h2 { color: var(--tide-cyan); } }
+    a { color: var(--tide-navy); }
+    @media (prefers-color-scheme: dark) { a { color: var(--tide-cyan); } }
+    table { border-collapse: collapse; width: 100%; font-size: 0.92rem; margin: 0.5rem 0; }
+    th, td { padding: 0.45em 0.6em; text-align: left; border-bottom: 1px solid var(--border); }
+    th { background: var(--row-stripe); font-weight: 600; }
+    tr:nth-child(even) td { background: var(--row-stripe); }
+    code, pre { font-family: ui-monospace, SFMono-Regular, "SF Mono", Menlo, Consolas, monospace; }
+    code { background: var(--code-bg); padding: 0.1em 0.35em; border-radius: 4px; font-size: 0.88em; }
+    pre { background: var(--code-bg); padding: 0.85em 1em; border-radius: 6px; overflow-x: auto; font-size: 0.85rem; }
+    pre code { background: none; padding: 0; }
+    ul.highlights { margin: 0.5em 0; padding-left: 1.25rem; }
+    ul.highlights li { margin-bottom: 0.4em; }
+    hr { border: none; border-top: 1px solid var(--border); margin: 1.75rem 0; }
+  </style>
+</head>
+<body>
+
+  <div class="logo"><img src="logo.gif" alt="TIDE logo" /></div>
+
+  <h1>Turning the TIDE: Cross-Architecture Distillation for Diffusion Large Language Models</h1>
+
+  <p class="tagline">The first cross-architecture distillation framework for diffusion LLMs – distilling 8B dense and 16B MoE teachers into a 0.6B student</p>
+
+  <div class="badges">
+    <a href="https://arxiv.org/abs/2604.26951" target="_blank">arXiv 2604.26951</a>
+    <a href="https://github.com/PKU-YuanGroup/TIDE" target="_blank">Code</a>
+    <a href="https://pku-yuangroup.github.io/TIDE-Page/" target="_blank">Project page</a>
+  </div>
+
+  <p>This organization hosts the <strong>distilled student checkpoints</strong> and <strong>pre-tokenized SFT datasets</strong> released with TIDE. The framework consists of three modular components – <strong>TIDAL</strong> (dual-axis interpolation), <strong>CompDemo</strong> (complementary mask-split teacher inference), and <strong>Reverse CALM</strong> (cross-tokenizer chunk-level matching) – and is evaluated across two heterogeneous distillation pipelines.</p>
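+
+  <p>As a rough illustration of the CompDemo idea, the sketch below partitions the masked positions into two complementary subsets, so that two teacher passes jointly cover every masked token. This is a minimal reconstruction from the component's name, not the released implementation; all function and variable names are hypothetical, so see the code repository for the actual logic.</p>
+  <pre><code>import torch
+
+# Hypothetical sketch: split the masked positions into two complementary
+# subsets; running the teacher once per subset yields a prediction for
+# every masked token. Illustrative only, not the released CompDemo API.
+def complementary_mask_split(masked_positions, generator=None):
+    coin = torch.bernoulli(torch.full(masked_positions.shape, 0.5), generator=generator).bool()
+    split_a = torch.logical_and(masked_positions, coin)                # teacher pass 1
+    split_b = torch.logical_and(masked_positions, coin.logical_not())  # teacher pass 2
+    assert torch.equal(torch.logical_or(split_a, split_b), masked_positions)
+    return split_a, split_b
+</code></pre>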
+
+  <h2>Highlights</h2>
+  <ul class="highlights">
+    <li><strong>+1.53 average gain</strong> over the non-distilled BD3LM baseline across 8 benchmarks (34.20 vs. 32.67).</li>
+    <li><strong>+16.48 on HumanEval</strong> over the equivalent-size AR baseline (48.78 vs. 32.30); distilled dLLMs especially excel at code generation.</li>
+    <li><strong>22× peak-memory reduction</strong> vs. the 16B MoE LLaDA2 teacher (1.4 GB vs. 31.3 GB) and <strong>5.2× faster inference</strong> (6.25 s vs. 32.55 s for 256 tokens on an H100).</li>
+  </ul>
+
+  <h2>Released models</h2>
+  <p>Six 0.6B distilled student checkpoints (3 per pipeline). Each is initialized from <a href="https://huggingface.co/dllm-hub/Qwen3-0.6B-diffusion-bd3lm-v0.1"><code>dllm-hub/Qwen3-0.6B-diffusion-bd3lm-v0.1</code></a> and distilled from a larger dLLM teacher.</p>
+
+  <table>
+    <thead><tr><th>Pipeline</th><th>Variant</th><th>Repo</th></tr></thead>
+    <tbody>
+      <tr><td>A – Cross-Tokenizer (LLaDA2 teacher)</td><td><strong>TIDE-Cross</strong> (native, paper-best)</td><td><a href="https://huggingface.co/TIDE-dllm/distill-LLaDA2-TIDE_Cross">distill-LLaDA2-TIDE_Cross</a></td></tr>
+      <tr><td>A – Cross-Tokenizer (LLaDA2 teacher)</td><td>TIDE-Shared variant</td><td><a href="https://huggingface.co/TIDE-dllm/distill-LLaDA2-TIDE_Shared">distill-LLaDA2-TIDE_Shared</a></td></tr>
+      <tr><td>A – Cross-Tokenizer (LLaDA2 teacher)</td><td>CALM baseline</td><td><a href="https://huggingface.co/TIDE-dllm/distill-LLaDA2-CALM">distill-LLaDA2-CALM</a></td></tr>
+      <tr><td>B – Shared-Tokenizer (WeDLM teacher)</td><td><strong>TIDE-Shared</strong> (native, paper-best)</td><td><a href="https://huggingface.co/TIDE-dllm/distill-WeDLM-TIDE_Shared">distill-WeDLM-TIDE_Shared</a></td></tr>
+      <tr><td>B – Shared-Tokenizer (WeDLM teacher)</td><td>TIDE-Cross variant</td><td><a href="https://huggingface.co/TIDE-dllm/distill-WeDLM-TIDE_Cross">distill-WeDLM-TIDE_Cross</a></td></tr>
+      <tr><td>B – Shared-Tokenizer (WeDLM teacher)</td><td>KL baseline</td><td><a href="https://huggingface.co/TIDE-dllm/distill-WeDLM-KL">distill-WeDLM-KL</a></td></tr>
+    </tbody>
+  </table>
+
+  <h2>Released datasets</h2>
+  <p>Pre-tokenized SFT mixtures (<code>tulu-3-sft-mixture</code> + <code>smoltalk</code> + <code>opc-sft-stage1</code> + <code>opc-sft-stage2</code>) prepared for each teacher, so distillation jobs never re-tokenize at startup.</p>
+
+  <table>
+    <thead><tr><th>Pipeline</th><th>Repo</th></tr></thead>
+    <tbody>
+      <tr><td>A – for the LLaDA2 teacher</td><td><a href="https://huggingface.co/datasets/TIDE-dllm/distill_llada2_sft">distill_llada2_sft</a></td></tr>
+      <tr><td>B – for the WeDLM teacher</td><td><a href="https://huggingface.co/datasets/TIDE-dllm/distill_wedlm_sft">distill_wedlm_sft</a></td></tr>
+    </tbody>
+  </table>
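+
+  <p>Either mixture loads directly with the <code>datasets</code> library. A minimal sketch; the split name and column layout below are assumptions, so inspect the repo before wiring it into a training loop:</p>
+  <pre><code>from datasets import load_dataset
+
+# Pipeline-A mixture, already tokenized for the LLaDA2 teacher.
+ds = load_dataset("TIDE-dllm/distill_llada2_sft", split="train")
+print(ds)  # check the actual features/columns before training
+</code></pre>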
+
+  <h2>Quick start</h2>
+  <pre><code>import torch
+from transformers import AutoModelForMaskedLM, AutoTokenizer
+
+repo = "TIDE-dllm/distill-LLaDA2-TIDE_Cross"  # paper-best Pipeline-A checkpoint
+device = "cuda" if torch.cuda.is_available() else "cpu"
+
+model = AutoModelForMaskedLM.from_pretrained(
+    repo, dtype=torch.bfloat16, trust_remote_code=True,
+).to(device).eval()
+tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
+</code></pre>
+
+  <p>The same <code>generate()</code> routine published with <a href="https://huggingface.co/dllm-hub/Qwen3-0.6B-diffusion-bd3lm-v0.1"><code>dllm-hub/Qwen3-0.6B-diffusion-bd3lm-v0.1</code></a> works on every TIDE checkpoint – just swap the model name.</p>
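+
+  <p>In outline, continuing from the snippet above (a sketch that assumes the remote code exposes an HF-style <code>generate()</code>; the exact sampling arguments are defined by the base checkpoint's model card):</p>
+  <pre><code>prompt = "Write a Python function that reverses a string."
+inputs = tokenizer(prompt, return_tensors="pt").to(device)
+
+with torch.no_grad():
+    out = model.generate(**inputs, max_new_tokens=128)  # args may differ per release
+print(tokenizer.decode(out[0], skip_special_tokens=True))
+</code></pre>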
+
+  <h2>Citation</h2>
+  <pre><code>@misc{zhang2026turningtidecrossarchitecturedistillation,
+  title={Turning the TIDE: Cross-Architecture Distillation for Diffusion Large Language Models},
+  author={Gongbo Zhang and Wen Wang and Ye Tian and Li Yuan},
+  year={2026},
+  eprint={2604.26951},
+  archivePrefix={arXiv},
+  primaryClass={cs.CL},
+  url={https://arxiv.org/abs/2604.26951},
+}</code></pre>
+
+</body>
 </html>