N2048M committed on
Commit 7ce3008 · verified · 1 Parent(s): 906ebd4

Replace placeholder index.html with TIDE org card content

Files changed (1)
  1. index.html +145 -17
index.html CHANGED
@@ -1,19 +1,147 @@
  <!doctype html>
- <html>
- <head>
- <meta charset="utf-8" />
- <meta name="viewport" content="width=device-width" />
- <title>My static Space</title>
- <link rel="stylesheet" href="style.css" />
- </head>
- <body>
- <div class="card">
- <h1>Welcome to your static Space!</h1>
- <p>You can modify this app directly by editing <i>index.html</i> in the Files and versions tab.</p>
- <p>
- Also don't forget to check the
- <a href="https://huggingface.co/docs/hub/spaces" target="_blank">Spaces documentation</a>.
- </p>
- </div>
- </body>
+ <html lang="en">
+ <head>
+ <meta charset="utf-8" />
+ <meta name="viewport" content="width=device-width,initial-scale=1" />
+ <title>TIDE-dllm — Turning the TIDE</title>
+ <style>
+ :root {
+   --tide-navy: #003D5B;
+   --tide-cyan: #00B4D8;
+   --bg: #ffffff;
+   --fg: #1f2328;
+   --muted: #6e7781;
+   --border: #d0d7de;
+   --row-stripe: #f6f8fa;
+   --code-bg: #f6f8fa;
+ }
+ @media (prefers-color-scheme: dark) {
+   :root {
+     --bg: #0d1117;
+     --fg: #e6edf3;
+     --muted: #8b949e;
+     --border: #30363d;
+     --row-stripe: #161b22;
+     --code-bg: #161b22;
+   }
+ }
+ html, body { margin: 0; padding: 0; background: var(--bg); color: var(--fg); }
+ body {
+   font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", "Helvetica Neue", Arial, "PingFang SC", sans-serif;
+   line-height: 1.55;
+   max-width: 880px;
+   margin: 0 auto;
+   padding: 2rem 1.25rem 3rem;
+ }
+ .logo { text-align: center; margin: 0.5rem 0 1rem; }
+ .logo img { max-width: 320px; width: 60%; height: auto; }
+ h1 { text-align: center; font-size: 1.6rem; line-height: 1.3; color: var(--tide-navy); margin: 0.25rem 0 0.5rem; }
+ @media (prefers-color-scheme: dark) { h1 { color: var(--tide-cyan); } }
+ .tagline { text-align: center; color: var(--muted); margin: 0 0 1rem; font-size: 0.95rem; }
+ .badges { text-align: center; margin: 0.5rem 0 1.5rem; }
+ .badges a {
+   display: inline-block;
+   padding: 0.35em 0.9em;
+   margin: 0.2em;
+   border-radius: 999px;
+   background: var(--tide-navy);
+   color: #fff;
+   text-decoration: none;
+   font-size: 0.88rem;
+   transition: background 0.15s;
+ }
+ .badges a:hover { background: var(--tide-cyan); }
+ h2 { color: var(--tide-navy); border-bottom: 1px solid var(--border); padding-bottom: 0.3em; margin-top: 2rem; font-size: 1.25rem; }
+ @media (prefers-color-scheme: dark) { h2 { color: var(--tide-cyan); } }
+ a { color: var(--tide-navy); }
+ @media (prefers-color-scheme: dark) { a { color: var(--tide-cyan); } }
+ table { border-collapse: collapse; width: 100%; font-size: 0.92rem; margin: 0.5rem 0; }
+ th, td { padding: 0.45em 0.6em; text-align: left; border-bottom: 1px solid var(--border); }
+ th { background: var(--row-stripe); font-weight: 600; }
+ tr:nth-child(even) td { background: var(--row-stripe); }
+ code, pre { font-family: ui-monospace, SFMono-Regular, "SF Mono", Menlo, Consolas, monospace; }
+ code { background: var(--code-bg); padding: 0.1em 0.35em; border-radius: 4px; font-size: 0.88em; }
+ pre { background: var(--code-bg); padding: 0.85em 1em; border-radius: 6px; overflow-x: auto; font-size: 0.85rem; }
+ pre code { background: none; padding: 0; }
+ ul.highlights { margin: 0.5em 0; padding-left: 1.25rem; }
+ ul.highlights li { margin-bottom: 0.4em; }
+ hr { border: none; border-top: 1px solid var(--border); margin: 1.75rem 0; }
+ </style>
+ </head>
+ <body>
+
+ <div class="logo"><img src="logo.gif" alt="TIDE logo" /></div>
+
+ <h1>Turning the TIDE: Cross-Architecture Distillation for Diffusion Large Language Models</h1>
+
+ <p class="tagline">🌊 The first cross-architecture distillation framework for diffusion LLMs — distilling 8B dense and 16B MoE teachers into a 0.6B student 🌊</p>
+
+ <div class="badges">
+ <a href="https://arxiv.org/abs/2604.26951" target="_blank">📄 arXiv 2604.26951</a>
+ <a href="https://github.com/PKU-YuanGroup/TIDE" target="_blank">💻 Code</a>
+ <a href="https://pku-yuangroup.github.io/TIDE-Page/" target="_blank">🌐 Project page</a>
+ </div>
+
+ <p>This organization hosts the <strong>distilled student checkpoints</strong> and <strong>pre-tokenized SFT datasets</strong> released with TIDE. The framework consists of three modular components — <strong>TIDAL</strong> (dual-axis interpolation), <strong>CompDemo</strong> (complementary mask-split teacher inference), and <strong>Reverse CALM</strong> (cross-tokenizer chunk-level matching) — and is evaluated across two heterogeneous distillation pipelines.</p>
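+
+ <p>For intuition only, here is a minimal sketch of the complementary mask-split idea behind CompDemo: the masked positions of a training sequence are divided into two disjoint halves so that two teacher passes jointly cover every position. All names below are hypothetical; the actual implementation lives in the GitHub repo.</p>
+ <pre><code>import torch
+
+ def complementary_split(mask_positions, generator=None):
+     # mask_positions: 1-D LongTensor of masked token indices.
+     # Shuffle, then split into two disjoint halves; each teacher pass
+     # sees one half revealed, so the two passes cover every position.
+     perm = torch.randperm(mask_positions.numel(), generator=generator)
+     half = mask_positions.numel() // 2
+     return mask_positions[perm[:half]], mask_positions[perm[half:]]
+ </code></pre>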
+
+ <h2>✨ Highlights</h2>
+ <ul class="highlights">
+ <li><strong>+1.53 average gain</strong> over the non-distilled BD3LM baseline across 8 benchmarks (34.20 vs. 32.67).</li>
+ <li><strong>+16.48 on HumanEval</strong> over the equivalent-size AR baseline (48.78 vs. 32.30) — distilled dLLMs especially excel at code generation.</li>
+ <li><strong>22× peak-memory reduction</strong> vs. the 16B MoE LLaDA2 teacher (1.4 GB vs. 31.3 GB) and <strong>5.2× faster inference</strong> (6.25 s vs. 32.55 s for 256 tokens on H100). A measurement sketch follows this list.</li>
+ </ul>
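+
+ <p>The numbers above are taken from the paper. As a rough way to check peak memory and latency on your own hardware (an illustrative harness, not the paper's benchmark script), you can wrap any generation call like this:</p>
+ <pre><code>import time, torch
+
+ torch.cuda.reset_peak_memory_stats()
+ start = time.perf_counter()
+ # ... run a 256-token generation here ...
+ torch.cuda.synchronize()
+ print(f"latency: {time.perf_counter() - start:.2f} s")
+ print(f"peak memory: {torch.cuda.max_memory_allocated() / 2**30:.2f} GiB")
+ </code></pre>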
+
+ <h2>🤖 Released models</h2>
+ <p>Six 0.6B distilled student checkpoints (3 per pipeline). Each is initialized from <a href="https://huggingface.co/dllm-hub/Qwen3-0.6B-diffusion-bd3lm-v0.1"><code>dllm-hub/Qwen3-0.6B-diffusion-bd3lm-v0.1</code></a> and distilled from a larger dLLM teacher. A programmatic download sketch follows the table.</p>
+
+ <table>
+ <thead><tr><th>Pipeline</th><th>Variant</th><th>Repo</th></tr></thead>
+ <tbody>
+ <tr><td>A — Cross-Tokenizer (LLaDA2 teacher)</td><td><strong>TIDE-Cross</strong> (native, paper-best)</td><td><a href="https://huggingface.co/TIDE-dllm/distill-LLaDA2-TIDE_Cross">distill-LLaDA2-TIDE_Cross</a></td></tr>
+ <tr><td>A — Cross-Tokenizer (LLaDA2 teacher)</td><td>TIDE-Shared variant</td><td><a href="https://huggingface.co/TIDE-dllm/distill-LLaDA2-TIDE_Shared">distill-LLaDA2-TIDE_Shared</a></td></tr>
+ <tr><td>A — Cross-Tokenizer (LLaDA2 teacher)</td><td>CALM baseline</td><td><a href="https://huggingface.co/TIDE-dllm/distill-LLaDA2-CALM">distill-LLaDA2-CALM</a></td></tr>
+ <tr><td>B — Shared-Tokenizer (WeDLM teacher)</td><td><strong>TIDE-Shared</strong> (native, paper-best)</td><td><a href="https://huggingface.co/TIDE-dllm/distill-WeDLM-TIDE_Shared">distill-WeDLM-TIDE_Shared</a></td></tr>
+ <tr><td>B — Shared-Tokenizer (WeDLM teacher)</td><td>TIDE-Cross variant</td><td><a href="https://huggingface.co/TIDE-dllm/distill-WeDLM-TIDE_Cross">distill-WeDLM-TIDE_Cross</a></td></tr>
+ <tr><td>B — Shared-Tokenizer (WeDLM teacher)</td><td>KL baseline</td><td><a href="https://huggingface.co/TIDE-dllm/distill-WeDLM-KL">distill-WeDLM-KL</a></td></tr>
+ </tbody>
+ </table>
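+
+ <p>To fetch checkpoints programmatically, a small optional helper using <code>huggingface_hub</code> (the repo ids come from the table above; the remaining four follow the same pattern):</p>
+ <pre><code>from huggingface_hub import snapshot_download
+
+ repos = [
+     "TIDE-dllm/distill-LLaDA2-TIDE_Cross",  # Pipeline A, paper-best
+     "TIDE-dllm/distill-WeDLM-TIDE_Shared",  # Pipeline B, paper-best
+ ]
+ local_paths = {r: snapshot_download(repo_id=r) for r in repos}
+ </code></pre>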
+
+ <h2>📚 Released datasets</h2>
+ <p>Pre-tokenized SFT mixtures (<code>tulu-3-sft-mixture</code> + <code>smoltalk</code> + <code>opc-sft-stage1</code> + <code>opc-sft-stage2</code>) prepared for each teacher, so distillation jobs never re-tokenize at startup. A loading sketch follows the table.</p>
+
+ <table>
+ <thead><tr><th>Pipeline</th><th>Repo</th></tr></thead>
+ <tbody>
+ <tr><td>A — for the LLaDA2 teacher</td><td><a href="https://huggingface.co/datasets/TIDE-dllm/distill_llada2_sft">distill_llada2_sft</a></td></tr>
+ <tr><td>B — for the WeDLM teacher</td><td><a href="https://huggingface.co/datasets/TIDE-dllm/distill_wedlm_sft">distill_wedlm_sft</a></td></tr>
+ </tbody>
+ </table>
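+
+ <p>Both datasets load with the standard <code>datasets</code> API. The column schema is not restated here, so inspect it after loading; treat any guess about field names (e.g. <code>input_ids</code>) as an assumption and check the dataset card:</p>
+ <pre><code>from datasets import load_dataset
+
+ ds = load_dataset("TIDE-dllm/distill_llada2_sft")
+ print(ds)  # shows the available splits, row counts, and column names
+ </code></pre>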
+
+ <h2>🚀 Quick start</h2>
+ <pre><code>import torch
+ from transformers import AutoModelForMaskedLM, AutoTokenizer
+
+ repo = "TIDE-dllm/distill-LLaDA2-TIDE_Cross"  # paper-best Pipeline-A checkpoint
+ device = "cuda" if torch.cuda.is_available() else "cpu"
+
+ model = AutoModelForMaskedLM.from_pretrained(
+     repo, dtype=torch.bfloat16, trust_remote_code=True,
+ ).to(device).eval()
+ tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
+ </code></pre>
+
+ <p>The same <code>generate()</code> routine published with <a href="https://huggingface.co/dllm-hub/Qwen3-0.6B-diffusion-bd3lm-v0.1"><code>dllm-hub/Qwen3-0.6B-diffusion-bd3lm-v0.1</code></a> works on every TIDE checkpoint — just swap the model name.</p>
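+
+ <p>That routine is the authoritative way to sample. Purely as a schematic of what masked-diffusion decoding does (hypothetical mask id and reveal schedule, not the released code), continuing from the quick-start snippet:</p>
+ <pre><code># Schematic only: the real generate() adds block scheduling, caching, etc.
+ mask_id = tokenizer.mask_token_id  # assumes a dedicated mask token id
+ seq = torch.full((1, 256), mask_id, dtype=torch.long, device=device)
+
+ with torch.no_grad():
+     while seq.eq(mask_id).any():
+         logits = model(input_ids=seq).logits
+         conf, pred = logits.softmax(-1).max(-1)  # per-position confidence
+         idx = torch.where(seq[0].eq(mask_id))[0]
+         k = max(idx.numel() // 8, 1)             # reveal ~1/8 per step
+         top = idx[conf[0, idx].topk(k).indices]  # most confident positions
+         seq[0, top] = pred[0, top]
+
+ print(tokenizer.decode(seq[0]))
+ </code></pre>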
+
+ <h2>📝 Citation</h2>
+ <pre><code>@misc{zhang2026turningtidecrossarchitecturedistillation,
+     title={Turning the TIDE: Cross-Architecture Distillation for Diffusion Large Language Models},
+     author={Gongbo Zhang and Wen Wang and Ye Tian and Li Yuan},
+     year={2026},
+     eprint={2604.26951},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL},
+     url={https://arxiv.org/abs/2604.26951},
+ }</code></pre>
+
+ </body>
  </html>