karlexmarin Claude Opus 4.7 (1M context) commited on
Commit
f09cd1d
·
1 Parent(s): fbec820

v0.7.1: Chat-template Sniffer (anti-bullshit #2) + 9th mode

Browse files

Ships anti-bullshit pack #2: detect when an evaluation framework will silently halve accuracy because the chat template wasn't applied. lm-eval-harness issue #1841: running against a vLLM-served API auto-applies the chat_template, but local hf/vllm mode does not — multi-turn evals drop ~50% with no warning.

NEW
- 📜 Chat-template mode: paste an HF model id (or raw tokenizer_config.json) → 1-second classification into a known family (Llama-3 / ChatML / Mistral / Gemma / Phi-3 / Alpaca / DeepSeek / custom / none) + exact CLI flags for lm-eval / vLLM / transformers.
- js/chat_template_sniffer.js: pure logic module (codes + params, no human strings). Fetches /raw/main/tokenizer_config.json from HF Hub, parses chat_template field, matches distinctive markers per family.
- Verdicts: ok (known family) · custom (template present, unrecognized) · missing (no chat_template — base model) · base_model · unknown.
- Per-framework command output: lm_eval --apply_chat_template, vllm serve --chat-template <name>, tokenizer.apply_chat_template().

VIRTUAL SIMULATION
- 7 HF fixtures classify correctly: Llama-3, Qwen/ChatML, Mistral, Gemma, Phi-3, DeepSeek (full-width unicode), Alpaca.
- Edge cases: base model → missing; custom unknown format → custom.
- Live HF fetch tested against 3 real models (Mistral-7B-Instruct-v0.3, Qwen2.5-7B-Instruct, Phi-3-mini-4k-instruct) — all classify correctly.

i18n
- 33 new template.* keys × 4 langs (modes.template, mode_desc.template, all warnings/verdicts/labels/status messages).
- modes.tip updated 8 → 9 modes in 4 langs.
- 456 keys × 4 langs, 0 missing / 0 extra (parity verified).

42/42 smoke tests passed locally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Files changed (5) hide show
  1. index.html +29 -0
  2. js/chat_template_sniffer.js +124 -0
  3. js/i18n.js +144 -4
  4. js/main.js +159 -1
  5. style.css +27 -0
index.html CHANGED
@@ -335,6 +335,7 @@
335
  <button class="mode-btn" data-mode="diagnose" role="tab" aria-selected="false" data-i18n="modes.diagnose">🩺 Diagnose CLI</button>
336
  <button class="mode-btn" data-mode="phase" role="tab" aria-selected="false" data-i18n="modes.phase">📊 Phase diagram</button>
337
  <button class="mode-btn" data-mode="unmask" role="tab" aria-selected="false" data-i18n="modes.unmask">🪟 Unmask</button>
 
338
  </div>
339
  <p id="mode-desc" class="recipe-desc" data-i18n="modes.desc">
340
  <strong>Quickest start</strong>: paste any HuggingFace model id (e.g. <code>meta-llama/Meta-Llama-3-8B</code>),
@@ -652,6 +653,34 @@
652
  <div id="unmask-output" style="margin-top: 1em;"></div>
653
  </section>
654
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
655
  <!-- Recipe selector (mode=recipe) -->
656
  <section id="recipe-section" style="display:none;">
657
  <h2 data-i18n="recipe.title">📋 Recipe</h2>
 
335
  <button class="mode-btn" data-mode="diagnose" role="tab" aria-selected="false" data-i18n="modes.diagnose">🩺 Diagnose CLI</button>
336
  <button class="mode-btn" data-mode="phase" role="tab" aria-selected="false" data-i18n="modes.phase">📊 Phase diagram</button>
337
  <button class="mode-btn" data-mode="unmask" role="tab" aria-selected="false" data-i18n="modes.unmask">🪟 Unmask</button>
338
+ <button class="mode-btn" data-mode="template" role="tab" aria-selected="false" data-i18n="modes.template">📜 Chat-template</button>
339
  </div>
340
  <p id="mode-desc" class="recipe-desc" data-i18n="modes.desc">
341
  <strong>Quickest start</strong>: paste any HuggingFace model id (e.g. <code>meta-llama/Meta-Llama-3-8B</code>),
 
653
  <div id="unmask-output" style="margin-top: 1em;"></div>
654
  </section>
655
 
656
<!-- Chat-template sniffer mode (v0.7.1 anti-bullshit pack #2) -->
<section id="template-section" style="display:none;">
  <h2>
    <span data-i18n="template.title">📜 Chat-template Sniffer</span>
    <span class="info"><span class="tooltip" data-i18n="template.tip">
      Paste an HF model id (or raw tokenizer_config.json). Detects the
      chat-template family (Llama-3, ChatML, Mistral, Gemma, Phi-3,
      Alpaca, DeepSeek, custom) and gives you the exact framework command
      to use it correctly. lm-eval-harness silently halves accuracy if you
      forget to apply it (issue #1841).
    </span></span>
  </h2>
  <p class="recipe-desc" data-i18n="template.desc">
    <strong>Did you forget <code>--apply_chat_template</code>?</strong> Most multi-turn evals fail by ~50% because the chat template wasn't applied. Paste a model id, get the exact CLI flag for your stack.
  </p>
  <div class="form-row">
    <label for="template-id" data-i18n="template.id_label">HF model id:</label>
    <input type="text" id="template-id" placeholder="e.g. mistralai/Mistral-7B-Instruct-v0.3" />
    <button type="button" id="template-fetch-btn" data-i18n="template.fetch_btn">📜 Sniff</button>
  </div>
  <p id="template-status" class="recipe-desc" style="font-size:0.92em;"></p>
  <details style="margin: 0.6em 0;">
    <summary style="cursor:pointer; font-size:0.92em;" data-i18n="template.paste_summary">Or paste raw tokenizer_config.json (private models)</summary>
    <textarea id="template-paste" rows="6" style="width:100%; font-family:monospace; font-size:0.85em; margin-top:0.4em;" placeholder='{"chat_template": "...", ...}'></textarea>
    <button type="button" id="template-paste-btn" data-i18n="template.paste_btn" style="margin-top:0.4em;">📜 Sniff pasted config</button>
  </details>
  <div id="template-output" style="margin-top: 1em;"></div>
</section>
683
+
684
  <!-- Recipe selector (mode=recipe) -->
685
  <section id="recipe-section" style="display:none;">
686
  <h2 data-i18n="recipe.title">📋 Recipe</h2>
js/chat_template_sniffer.js ADDED
@@ -0,0 +1,124 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
// Chat-template sniffer (v0.7.1 anti-bullshit pack #2)
// Parses tokenizer_config.json and detects which chat-template family the
// model uses. Pure logic — no human-readable strings. main.js renders via i18n.
//
// Why this matters: lm-eval-harness applied via vLLM-served API auto-applies
// the chat_template; local `hf`/`vllm` mode does NOT. This silently halves
// accuracy on multi-turn evals. Issue #1841 in lm-evaluation-harness.

// Distinctive markers per family. Order matters: more specific first.
// Each entry in `markers` is either a string or an array of equivalent
// variants — any single variant found in the template counts as a hit.
const FAMILIES = [
  {
    id: "llama-3",
    label: "Llama-3 instruct",
    // begin_of_text uses bos_token variable in real templates, not literal —
    // these two are the reliable signature.
    markers: ["<|start_header_id|>", "<|eot_id|>"],
    chatTemplateName: "llama-3",
    vllmTemplate: "examples/template_llama_3.jinja",
  },
  {
    id: "chatml",
    label: "ChatML (Qwen, OpenAI-style)",
    markers: ["<|im_start|>", "<|im_end|>"],
    chatTemplateName: "chatml",
    vllmTemplate: "examples/template_chatml.jinja",
  },
  {
    id: "mistral",
    label: "Mistral instruct",
    markers: ["[INST]", "[/INST]"],
    chatTemplateName: "mistral",
    vllmTemplate: "examples/template_mistral.jinja",
  },
  {
    id: "gemma",
    label: "Gemma",
    markers: ["<start_of_turn>", "<end_of_turn>"],
    chatTemplateName: "gemma",
    vllmTemplate: "examples/template_gemma.jinja",
  },
  {
    id: "phi-3",
    label: "Phi-3",
    markers: ["<|user|>", "<|assistant|>", "<|end|>"],
    chatTemplateName: "phi-3",
    vllmTemplate: "examples/template_phi3.jinja",
  },
  {
    id: "deepseek",
    label: "DeepSeek",
    // DeepSeek templates use FULL-WIDTH vertical bars (U+FF5C), e.g.
    // <｜User｜>. Written as \uFF5C escapes so this file stays ASCII-safe.
    // Previous version checked ASCII "|User|" only, which never matches real
    // DeepSeek templates; ASCII variants are kept as a fallback for
    // hand-retyped configs.
    markers: [
      ["\uFF5CUser\uFF5C", "|User|"],
      ["\uFF5CAssistant\uFF5C", "|Assistant|"],
    ],
    chatTemplateName: "deepseek",
    vllmTemplate: null,
  },
  {
    id: "alpaca",
    label: "Alpaca",
    markers: ["### Instruction:", "### Response:"],
    chatTemplateName: "alpaca",
    vllmTemplate: null,
  },
];

/**
 * Classify the chat template found in a parsed tokenizer_config.json.
 *
 * @param {object|null|undefined} tokenizerConfig - parsed tokenizer_config.json.
 *   Only the `chat_template` field is inspected; it may be a Jinja string or
 *   the newer HF list form `[{ name, template }, ...]`.
 * @returns {object} result object: hasChatTemplate, rawTemplate (preview,
 *   truncated at 600 chars), rawTemplateLength, detectedFamily, detectedLabel,
 *   chatTemplateName, vllmTemplate, addGenerationPromptDetected,
 *   matchedMarkers, verdict ("ok" | "custom" | "missing" | "base_model" |
 *   "unknown"), warnings ([{ code, params }] — i18n codes, no human text).
 */
export function sniffChatTemplate(tokenizerConfig) {
  const out = {
    hasChatTemplate: false,
    rawTemplate: null,
    rawTemplateLength: 0,
    detectedFamily: null,
    detectedLabel: null,
    chatTemplateName: null,
    vllmTemplate: null,
    addGenerationPromptDetected: false,
    matchedMarkers: [],
    verdict: "unknown", // ok | custom | missing | base_model | unknown
    warnings: [], // each: { code, params }
  };

  let tpl = tokenizerConfig?.chat_template;
  // Newer tokenizer_config.json files may carry a LIST of named templates
  // ([{ name, template }, ...]). Prefer the entry named "default", else the
  // first one, so list-form configs classify the same as string-form.
  if (Array.isArray(tpl)) {
    const entry = tpl.find((t) => t?.name === "default") ?? tpl[0];
    tpl = entry?.template;
  }

  if (typeof tpl === "string" && tpl.length > 0) {
    out.hasChatTemplate = true;
    out.rawTemplate = tpl.length > 600 ? tpl.slice(0, 600) + "…" : tpl;
    out.rawTemplateLength = tpl.length;
    out.addGenerationPromptDetected = /add_generation_prompt/.test(tpl);

    // Try each family in order. Match if ALL markers are present in the
    // template (a marker given as an array matches when ANY variant is found).
    for (const fam of FAMILIES) {
      const hits = fam.markers
        .map((m) => (Array.isArray(m) ? m : [m]).find((v) => tpl.includes(v)))
        .filter((v) => v !== undefined);
      if (hits.length === fam.markers.length) {
        out.detectedFamily = fam.id;
        out.detectedLabel = fam.label;
        out.chatTemplateName = fam.chatTemplateName;
        out.vllmTemplate = fam.vllmTemplate;
        out.matchedMarkers = hits;
        out.verdict = "ok";
        break;
      }
    }
    if (!out.detectedFamily) {
      out.detectedFamily = "custom";
      out.detectedLabel = null;
      out.verdict = "custom";
      out.warnings.push({ code: "custom_template", params: { length: out.rawTemplateLength } });
    }
  } else {
    // No chat_template at all — typical for base / pretrained-only models.
    // Could still be a legitimate base model, so verdict depends on caller intent.
    out.verdict = "missing";
    out.warnings.push({ code: "no_chat_template", params: {} });
  }

  // Universal warning: lm-eval-harness silent halving.
  if (out.hasChatTemplate) {
    out.warnings.push({ code: "lm_eval_apply", params: {} });
  }
  // vLLM warning if template requires explicit --chat-template flag
  if (out.hasChatTemplate && out.detectedFamily !== "alpaca" && out.detectedFamily !== "deepseek") {
    out.warnings.push({ code: "vllm_apply", params: { name: out.chatTemplateName ?? "auto" } });
  }

  return out;
}
js/i18n.js CHANGED
@@ -189,6 +189,41 @@ export const TRANSLATIONS = {
189
  "mode_desc.phase": "γ × θ scatter of the paper's empirical panel. Hover a dot for details, click to load into Diagnose / Recipe forms.",
190
  "mode_desc.unmask": "Detects whether max_position_embeddings is misleading (SWA / YaRN / RoPE-scaling). Paste a model id, get a 1-line verdict.",
191
  "profile.preset_loaded": "✅ Loaded preset for <strong>{id}</strong>. Form pre-filled. (Click 📥 Fetch to override with the latest config from HF Hub.)",
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
192
  "share.import_desc": "Got a JSON file from someone else's TAF analysis? Load it here to see the verdict + chain locally. Same view as if you'd run it yourself.",
193
  "share.import_btn": "📂 Load shared JSON",
194
  "synthesis.system": "You are a precise transformer LLM diagnostic assistant. Given pre-computed TAF formula results, write a clear plain-English summary in 4-6 sentences. Cite the section number (§X.Y) for each number you mention. Always give a concrete recommendation. Do NOT invent numbers.",
@@ -281,7 +316,7 @@ export const TRANSLATIONS = {
281
  "common.no": "No",
282
 
283
  // Mode tooltips
284
- "modes.tip": "<strong>Eight ways to use the tool</strong>.<br><strong>📇 Profile</strong>: paste a model id → 5-recipe TAF Card.<br><strong>🆚 Compare</strong>: 2-3 models side-by-side on one recipe.<br><strong>🔍 Inspect config</strong>: paste raw config.json → full Profile.<br><strong>💬 Ask</strong>: free-form question, browser LLM picks the recipe.<br><strong>📋 Recipe</strong>: manual selection with full form control.<br><strong>🩺 Diagnose CLI</strong>: generate Python command for local γ measurement.<br><strong>📊 Phase diagram</strong>: 23-model panel on (log θ, γ) plane.<br><strong>🪟 Unmask</strong>: detect misleading max_position_embeddings (SWA / YaRN / RoPE-scaling).",
285
  "profile.tip": "<strong>One-click full diagnosis</strong>. Paste any HF model id (or pick preset). Tool runs all 5 recipes (long-context, KV-compression, custom-vs-API, budget, hardware) and produces a single <strong>TAF Card</strong> with verdict per dimension + key numbers + architecture classification.<br><br><strong>Use case</strong>: \"I'm evaluating Qwen2.5-32B for production — what's its full viability profile?\" → paste id → Profile → done.",
286
  "compare.tip": "<strong>Same recipe, multiple models</strong>. Pick 2-3 candidate models and one recipe. See verdicts in a single comparison table.<br><br><strong>Use case</strong>: \"I need long-context retrieval at 16K — which is best: Llama-3-8B, Mistral-7B, or Qwen-7B?\" → pick 3 + X-2 + 16K → see winner.",
287
 
@@ -802,6 +837,41 @@ export const TRANSLATIONS = {
802
  "mode_desc.phase": "Scatter γ × θ del panel empírico del paper. Hover sobre puntos para detalles, click para cargar en Diagnose / Recipe.",
803
  "mode_desc.unmask": "Detecta si max_position_embeddings es engañoso (SWA / YaRN / RoPE-scaling). Pega un model id, obtén un veredicto en 1 línea.",
804
  "profile.preset_loaded": "✅ Preset cargado para <strong>{id}</strong>. Formulario pre-rellenado. (Click 📥 Fetch para sobreescribir con el último config de HF Hub.)",
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
805
  "share.import_desc": "¿Tienes un fichero JSON del análisis TAF de alguien? Cárgalo aquí para ver el veredicto + cadena localmente. La misma vista que si lo hubieras ejecutado tú.",
806
  "share.import_btn": "📂 Cargar JSON compartido",
807
  "synthesis.system": "Eres un asistente de diagnóstico preciso para LLMs transformer. Dados resultados de fórmulas TAF pre-calculados, escribe un resumen claro en español de 4-6 frases. Cita el número de sección (§X.Y) para cada número que menciones. Da siempre una recomendación concreta. NO inventes números.",
@@ -894,7 +964,7 @@ export const TRANSLATIONS = {
894
  "common.no": "No",
895
 
896
  // Tooltips de modos
897
- "modes.tip": "<strong>Ocho formas de usar la herramienta</strong>.<br><strong>📇 Perfil</strong>: pega un id → TAF Card de 5 recetas.<br><strong>🆚 Comparar</strong>: 2-3 modelos lado a lado en una receta.<br><strong>🔍 Inspeccionar config</strong>: pega config.json crudo → Perfil completo.<br><strong>💬 Pregunta</strong>: pregunta libre, el LLM del navegador elige la receta.<br><strong>📋 Receta</strong>: selección manual con control total del formulario.<br><strong>🩺 Diagnóstico CLI</strong>: genera comando Python para medir γ localmente.<br><strong>📊 Diagrama de fase</strong>: panel de 23 modelos en plano (log θ, γ).<br><strong>🪟 Desenmascarar</strong>: detecta max_position_embeddings engañoso (SWA / YaRN / RoPE-scaling).",
898
  "profile.tip": "<strong>Diagnóstico completo en un click</strong>. Pega cualquier id de modelo HF (o elige preset). La herramienta ejecuta las 5 recetas (contexto largo, compresión KV, custom vs API, presupuesto, hardware) y produce una única <strong>TAF Card</strong> con veredicto por dimensión + números clave + clasificación arquitectónica.<br><br><strong>Caso de uso</strong>: \"Estoy evaluando Qwen2.5-32B para producción — ¿cuál es su perfil completo de viabilidad?\" → pega id → Perfilar → listo.",
899
  "compare.tip": "<strong>Misma receta, múltiples modelos</strong>. Elige 2-3 modelos candidatos y una receta. Ve los veredictos en una única tabla comparativa.<br><br><strong>Caso de uso</strong>: \"Necesito recuperación de contexto largo a 16K — ¿cuál es mejor: Llama-3-8B, Mistral-7B o Qwen-7B?\" → elige 3 + X-2 + 16K → ve el ganador.",
900
 
@@ -1279,6 +1349,41 @@ export const TRANSLATIONS = {
1279
  "mode_desc.phase": "Scatter γ × θ du panel empirique du papier. Survolez les points pour détails, cliquez pour charger dans Diagnose / Recipe.",
1280
  "mode_desc.unmask": "Détecte si max_position_embeddings est trompeur (SWA / YaRN / RoPE-scaling). Collez un model id, obtenez un verdict en 1 ligne.",
1281
  "profile.preset_loaded": "✅ Préréglage chargé pour <strong>{id}</strong>. Formulaire pré-rempli. (Cliquez 📥 Fetch pour écraser avec le dernier config depuis HF Hub.)",
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1282
  "share.import_desc": "Vous avez un fichier JSON de l'analyse TAF de quelqu'un ? Chargez-le ici pour voir le verdict + la chaîne localement. La même vue que si vous l'aviez exécuté vous-même.",
1283
  "share.import_btn": "📂 Charger JSON partagé",
1284
  "synthesis.system": "Vous êtes un assistant de diagnostic précis pour LLMs transformer. Étant donné des résultats de formules TAF pré-calculés, écrivez un résumé clair en français de 4-6 phrases. Citez le numéro de section (§X.Y) pour chaque nombre mentionné. Donnez toujours une recommandation concrète. N'INVENTEZ PAS de nombres.",
@@ -1371,7 +1476,7 @@ export const TRANSLATIONS = {
1371
  "common.no": "Non",
1372
 
1373
  // Tooltips des modes
1374
- "modes.tip": "<strong>Huit façons d'utiliser l'outil</strong>.<br><strong>📇 Profil</strong>: collez un id → TAF Card avec 5 recettes.<br><strong>🆚 Comparer</strong>: 2-3 modèles côte à côte sur une recette.<br><strong>🔍 Inspecter config</strong>: collez config.json brut → Profil complet.<br><strong>💬 Question</strong>: question libre, le LLM du navigateur choisit la recette.<br><strong>📋 Recette</strong>: sélection manuelle avec contrôle total du formulaire.<br><strong>🩺 Diagnostic CLI</strong>: génère commande Python pour mesurer γ localement.<br><strong>📊 Diagramme de phase</strong>: panel de 23 modèles dans le plan (log θ, γ).<br><strong>🪟 Démasquer</strong>: détecte un max_position_embeddings trompeur (SWA / YaRN / RoPE-scaling).",
1375
  "profile.tip": "<strong>Diagnostic complet en un clic</strong>. Collez n'importe quel id de modèle HF (ou choisissez préréglage). L'outil exécute les 5 recettes (contexte long, compression KV, custom vs API, budget, hardware) et produit une <strong>TAF Card</strong> unique avec verdict par dimension + nombres clés + classification architecturale.<br><br><strong>Cas d'usage</strong>: « J'évalue Qwen2.5-32B pour la production — quel est son profil complet de viabilité ? » → collez id → Profiler → fait.",
1376
  "compare.tip": "<strong>Même recette, plusieurs modèles</strong>. Choisissez 2-3 modèles candidats et une recette. Voyez les verdicts dans un seul tableau comparatif.<br><br><strong>Cas d'usage</strong>: « J'ai besoin de récupération longue contexte à 16K — quel est le meilleur : Llama-3-8B, Mistral-7B ou Qwen-7B ? » → choisissez 3 + X-2 + 16K → voyez le gagnant.",
1377
 
@@ -1756,6 +1861,41 @@ export const TRANSLATIONS = {
1756
  "mode_desc.phase": "论文经验面板的 γ × θ 散点图。悬停点查看详情,点击加载到 Diagnose / Recipe 表单。",
1757
  "mode_desc.unmask": "检测 max_position_embeddings 是否误导(SWA / YaRN / RoPE 缩放)。粘贴 model id,1 行判定。",
1758
  "profile.preset_loaded": "✅ 已为 <strong>{id}</strong> 加载预设。表单已预填。(点击 📥 Fetch 用 HF Hub 最新 config 覆盖。)",
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1759
  "share.import_desc": "有他人 TAF 分析的 JSON 文件? 在这里加载以本地查看判定 + 链。与您自己运行的视图相同。",
1760
  "share.import_btn": "📂 加载共享的 JSON",
1761
  "synthesis.system": "您是 transformer LLM 的精确诊断助手。给定预先计算的 TAF 公式结果,用 4-6 句中文写出清晰的摘要。为每个提到的数字引用章节号 (§X.Y)。始终给出具体建议。不要编造数字。",
@@ -1848,7 +1988,7 @@ export const TRANSLATIONS = {
1848
  "common.no": "否",
1849
 
1850
  // 模式提示
1851
- "modes.tip": "<strong>种使用方式</strong>。<br><strong>📇 画像</strong>: 粘贴模型 id → 5 个配方的 TAF 卡。<br><strong>🆚 比较</strong>: 2-3 个模型在一个配方上并排比较。<br><strong>🔍 检查 config</strong>: 粘贴原始 config.json → 完整画像。<br><strong>💬 提问</strong>: 自由形式问题,浏览器 LLM 选择配方。<br><strong>📋 配方</strong>: 手动选择,完全控制表单。<br><strong>🩺 CLI 诊断</strong>: 生成 Python 命令在本地测量 γ。<br><strong>📊 相图</strong>: 23 个面板模型在 (log θ, γ) 平面上。<br><strong>🪟 揭示</strong>: 检测误导的 max_position_embeddings(SWA / YaRN / RoPE 缩放)。",
1852
  "profile.tip": "<strong>一键完整诊断</strong>。粘贴任意 HF 模型 id (或选择预设)。工具运行所有 5 个配方 (长上下文、KV 压缩、自定义 vs API、预算、硬件),生成单个 <strong>TAF 卡</strong>,显示每个维度的判定 + 关键数字 + 架构分类。<br><br><strong>用例</strong>: \"我正在为生产评估 Qwen2.5-32B — 它的完整可行性概况是什么?\" → 粘贴 id → 画像 → 完成。",
1853
  "compare.tip": "<strong>同一配方,多个模型</strong>。选择 2-3 个候选模型和一个配方。在单个比较表中查看判定。<br><br><strong>用例</strong>: \"我需要在 16K 进行长上下文检索 — 哪个最好: Llama-3-8B、Mistral-7B 或 Qwen-7B?\" → 选择 3 个 + X-2 + 16K → 看赢家。",
1854
 
 
189
  "mode_desc.phase": "γ × θ scatter of the paper's empirical panel. Hover a dot for details, click to load into Diagnose / Recipe forms.",
190
  "mode_desc.unmask": "Detects whether max_position_embeddings is misleading (SWA / YaRN / RoPE-scaling). Paste a model id, get a 1-line verdict.",
191
  "profile.preset_loaded": "✅ Loaded preset for <strong>{id}</strong>. Form pre-filled. (Click 📥 Fetch to override with the latest config from HF Hub.)",
192
+
193
+ // v0.7.1 — anti-bullshit pack #2: Chat-template Sniffer
194
+ "modes.template": "📜 Chat-template",
195
+ "mode_desc.template": "Detects which chat-template family a model uses (Llama-3 / ChatML / Mistral / Gemma / Phi-3 / Alpaca / DeepSeek). Gives the exact CLI flag for lm-eval / vLLM / transformers.",
196
+ "template.title": "📜 Chat-template Sniffer",
197
+ "template.tip": "Paste an HF model id (or raw tokenizer_config.json). Detects the chat-template family and gives you the exact framework command to use it correctly. lm-eval-harness silently halves accuracy if you forget to apply it (issue #1841).",
198
+ "template.desc": "<strong>Did you forget <code>--apply_chat_template</code>?</strong> Most multi-turn evals fail by ~50% because the chat template wasn't applied. Paste a model id, get the exact CLI flag for your stack.",
199
+ "template.id_label": "HF model id:",
200
+ "template.fetch_btn": "📜 Sniff",
201
+ "template.paste_summary": "Or paste raw tokenizer_config.json (private models)",
202
+ "template.paste_btn": "📜 Sniff pasted config",
203
+ "template.label.family": "Detected family",
204
+ "template.label.markers": "Matched markers",
205
+ "template.label.tpl_len": "Template length",
206
+ "template.section.warnings": "Warnings",
207
+ "template.section.commands": "Commands by framework",
208
+ "template.section.raw": "Raw template (preview)",
209
+ "template.family.custom": "custom (unknown family)",
210
+ "template.family.none": "(no chat_template)",
211
+ "template.verdict.ok": "✅ TEMPLATE DETECTED",
212
+ "template.verdict.custom": "⚠ CUSTOM TEMPLATE",
213
+ "template.verdict.missing": "❌ NO CHAT TEMPLATE",
214
+ "template.verdict.base_model": "ℹ BASE MODEL (no chat)",
215
+ "template.verdict.unknown": "❓ UNKNOWN",
216
+ "template.warn.no_chat_template": "No <code>chat_template</code> field in tokenizer_config.json. This is typical for base / pretrained-only models. If you intended an instruct-tuned model, the wrong file may be loaded.",
217
+ "template.warn.custom_template": "Template is non-standard ({length} chars). The tool could not match it against known families. Inspect the raw preview below and verify your eval framework supports it.",
218
+ "template.warn.lm_eval_apply": "<strong>lm-eval-harness:</strong> add <code>--apply_chat_template</code> or your accuracy will silently drop ~50% on multi-turn evals (issue #1841).",
219
+ "template.warn.vllm_apply": "<strong>vLLM serve:</strong> verify <code>--chat-template</code> is set (auto-detection sometimes fails for fine-tuned variants). Suggested: <code>{name}</code>.",
220
+ "template.status.empty_id": "⚠ Enter a model id (e.g. mistralai/Mistral-7B-Instruct-v0.3).",
221
+ "template.status.fetching": "⏳ Fetching tokenizer_config.json for {modelId}...",
222
+ "template.status.success": "✅ Sniffed {modelId} (verdict: {verdict})",
223
+ "template.status.empty_paste": "⚠ Paste a tokenizer_config.json first.",
224
+ "template.status.invalid_json":"❌ Not valid JSON: {error}",
225
+ "template.status.success_paste":"✅ Sniffed pasted config (verdict: {verdict})",
226
+ "template.pasted_label": "(pasted tokenizer_config)",
227
  "share.import_desc": "Got a JSON file from someone else's TAF analysis? Load it here to see the verdict + chain locally. Same view as if you'd run it yourself.",
228
  "share.import_btn": "📂 Load shared JSON",
229
  "synthesis.system": "You are a precise transformer LLM diagnostic assistant. Given pre-computed TAF formula results, write a clear plain-English summary in 4-6 sentences. Cite the section number (§X.Y) for each number you mention. Always give a concrete recommendation. Do NOT invent numbers.",
 
316
  "common.no": "No",
317
 
318
  // Mode tooltips
319
+ "modes.tip": "<strong>Nine ways to use the tool</strong>.<br><strong>📇 Profile</strong>: paste a model id → 5-recipe TAF Card.<br><strong>🆚 Compare</strong>: 2-3 models side-by-side on one recipe.<br><strong>🔍 Inspect config</strong>: paste raw config.json → full Profile.<br><strong>💬 Ask</strong>: free-form question, browser LLM picks the recipe.<br><strong>📋 Recipe</strong>: manual selection with full form control.<br><strong>🩺 Diagnose CLI</strong>: generate Python command for local γ measurement.<br><strong>📊 Phase diagram</strong>: 23-model panel on (log θ, γ) plane.<br><strong>🪟 Unmask</strong>: detect misleading max_position_embeddings (SWA / YaRN / RoPE-scaling).<br><strong>📜 Chat-template</strong>: detect family + give exact CLI flag for lm-eval / vLLM / transformers.",
320
  "profile.tip": "<strong>One-click full diagnosis</strong>. Paste any HF model id (or pick preset). Tool runs all 5 recipes (long-context, KV-compression, custom-vs-API, budget, hardware) and produces a single <strong>TAF Card</strong> with verdict per dimension + key numbers + architecture classification.<br><br><strong>Use case</strong>: \"I'm evaluating Qwen2.5-32B for production — what's its full viability profile?\" → paste id → Profile → done.",
321
  "compare.tip": "<strong>Same recipe, multiple models</strong>. Pick 2-3 candidate models and one recipe. See verdicts in a single comparison table.<br><br><strong>Use case</strong>: \"I need long-context retrieval at 16K — which is best: Llama-3-8B, Mistral-7B, or Qwen-7B?\" → pick 3 + X-2 + 16K → see winner.",
322
 
 
837
  "mode_desc.phase": "Scatter γ × θ del panel empírico del paper. Hover sobre puntos para detalles, click para cargar en Diagnose / Recipe.",
838
  "mode_desc.unmask": "Detecta si max_position_embeddings es engañoso (SWA / YaRN / RoPE-scaling). Pega un model id, obtén un veredicto en 1 línea.",
839
  "profile.preset_loaded": "✅ Preset cargado para <strong>{id}</strong>. Formulario pre-rellenado. (Click 📥 Fetch para sobreescribir con el último config de HF Hub.)",
840
+
841
+ // v0.7.1 — anti-bullshit pack #2: Chat-template Sniffer
842
+ "modes.template": "📜 Chat-template",
843
+ "mode_desc.template": "Detecta qué familia de chat-template usa un modelo (Llama-3 / ChatML / Mistral / Gemma / Phi-3 / Alpaca / DeepSeek). Da el flag CLI exacto para lm-eval / vLLM / transformers.",
844
+ "template.title": "📜 Detector de Chat-template",
845
+ "template.tip": "Pega un model id de HF (o tokenizer_config.json crudo). Detecta la familia del chat-template y te da el comando exacto para usarlo bien. lm-eval-harness divide la accuracy entre 2 silenciosamente si te olvidas de aplicarlo (issue #1841).",
846
+ "template.desc": "<strong>¿Olvidaste <code>--apply_chat_template</code>?</strong> La mayoría de evals multi-turn fallan ~50% porque el chat template no se aplicó. Pega un model id, obtén el flag CLI exacto para tu stack.",
847
+ "template.id_label": "ID modelo HF:",
848
+ "template.fetch_btn": "📜 Detectar",
849
+ "template.paste_summary": "O pega tokenizer_config.json crudo (modelos privados)",
850
+ "template.paste_btn": "📜 Detectar config pegado",
851
+ "template.label.family": "Familia detectada",
852
+ "template.label.markers": "Marcadores coincidentes",
853
+ "template.label.tpl_len": "Longitud template",
854
+ "template.section.warnings": "Avisos",
855
+ "template.section.commands": "Comandos por framework",
856
+ "template.section.raw": "Template crudo (preview)",
857
+ "template.family.custom": "custom (familia desconocida)",
858
+ "template.family.none": "(sin chat_template)",
859
+ "template.verdict.ok": "✅ TEMPLATE DETECTADO",
860
+ "template.verdict.custom": "⚠ TEMPLATE CUSTOM",
861
+ "template.verdict.missing": "❌ SIN CHAT TEMPLATE",
862
+ "template.verdict.base_model": "ℹ MODELO BASE (sin chat)",
863
+ "template.verdict.unknown": "❓ DESCONOCIDO",
864
+ "template.warn.no_chat_template": "Sin campo <code>chat_template</code> en tokenizer_config.json. Típico de modelos base / pretrained. Si esperabas un modelo instruct-tuned, puede que el archivo cargado sea incorrecto.",
865
+ "template.warn.custom_template": "Template no estándar ({length} chars). La herramienta no lo encajó en familias conocidas. Revisa el preview y verifica que tu framework de eval lo soporta.",
866
+ "template.warn.lm_eval_apply": "<strong>lm-eval-harness:</strong> añade <code>--apply_chat_template</code> o tu accuracy bajará ~50% silenciosamente en evals multi-turn (issue #1841).",
867
+ "template.warn.vllm_apply": "<strong>vLLM serve:</strong> verifica que <code>--chat-template</code> esté puesto (la auto-detección a veces falla en variantes fine-tuned). Sugerido: <code>{name}</code>.",
868
+ "template.status.empty_id": "⚠ Introduce un model id (ej. mistralai/Mistral-7B-Instruct-v0.3).",
869
+ "template.status.fetching": "⏳ Obteniendo tokenizer_config.json para {modelId}...",
870
+ "template.status.success": "✅ Detectado {modelId} (veredicto: {verdict})",
871
+ "template.status.empty_paste": "⚠ Pega un tokenizer_config.json primero.",
872
+ "template.status.invalid_json":"❌ JSON inválido: {error}",
873
+ "template.status.success_paste":"✅ Config pegado detectado (veredicto: {verdict})",
874
+ "template.pasted_label": "(tokenizer_config pegado)",
875
  "share.import_desc": "¿Tienes un fichero JSON del análisis TAF de alguien? Cárgalo aquí para ver el veredicto + cadena localmente. La misma vista que si lo hubieras ejecutado tú.",
876
  "share.import_btn": "📂 Cargar JSON compartido",
877
  "synthesis.system": "Eres un asistente de diagnóstico preciso para LLMs transformer. Dados resultados de fórmulas TAF pre-calculados, escribe un resumen claro en español de 4-6 frases. Cita el número de sección (§X.Y) para cada número que menciones. Da siempre una recomendación concreta. NO inventes números.",
 
964
  "common.no": "No",
965
 
966
  // Tooltips de modos
967
+ "modes.tip": "<strong>Nueve formas de usar la herramienta</strong>.<br><strong>📇 Perfil</strong>: pega un id → TAF Card de 5 recetas.<br><strong>🆚 Comparar</strong>: 2-3 modelos lado a lado en una receta.<br><strong>🔍 Inspeccionar config</strong>: pega config.json crudo → Perfil completo.<br><strong>💬 Pregunta</strong>: pregunta libre, el LLM del navegador elige la receta.<br><strong>📋 Receta</strong>: selección manual con control total del formulario.<br><strong>🩺 Diagnóstico CLI</strong>: genera comando Python para medir γ localmente.<br><strong>📊 Diagrama de fase</strong>: panel de 23 modelos en plano (log θ, γ).<br><strong>🪟 Desenmascarar</strong>: detecta max_position_embeddings engañoso (SWA / YaRN / RoPE-scaling).<br><strong>📜 Chat-template</strong>: detecta familia + da el flag CLI exacto para lm-eval / vLLM / transformers.",
968
  "profile.tip": "<strong>Diagnóstico completo en un click</strong>. Pega cualquier id de modelo HF (o elige preset). La herramienta ejecuta las 5 recetas (contexto largo, compresión KV, custom vs API, presupuesto, hardware) y produce una única <strong>TAF Card</strong> con veredicto por dimensión + números clave + clasificación arquitectónica.<br><br><strong>Caso de uso</strong>: \"Estoy evaluando Qwen2.5-32B para producción — ¿cuál es su perfil completo de viabilidad?\" → pega id → Perfilar → listo.",
969
  "compare.tip": "<strong>Misma receta, múltiples modelos</strong>. Elige 2-3 modelos candidatos y una receta. Ve los veredictos en una única tabla comparativa.<br><br><strong>Caso de uso</strong>: \"Necesito recuperación de contexto largo a 16K — ¿cuál es mejor: Llama-3-8B, Mistral-7B o Qwen-7B?\" → elige 3 + X-2 + 16K → ve el ganador.",
970
 
 
1349
  "mode_desc.phase": "Scatter γ × θ du panel empirique du papier. Survolez les points pour détails, cliquez pour charger dans Diagnose / Recipe.",
1350
  "mode_desc.unmask": "Détecte si max_position_embeddings est trompeur (SWA / YaRN / RoPE-scaling). Collez un model id, obtenez un verdict en 1 ligne.",
1351
  "profile.preset_loaded": "✅ Préréglage chargé pour <strong>{id}</strong>. Formulaire pré-rempli. (Cliquez 📥 Fetch pour écraser avec le dernier config depuis HF Hub.)",
1352
+
1353
+ // v0.7.1 — anti-bullshit pack #2: Chat-template Sniffer
1354
+ "modes.template": "📜 Chat-template",
1355
+ "mode_desc.template": "Détecte la famille de chat-template d'un modèle (Llama-3 / ChatML / Mistral / Gemma / Phi-3 / Alpaca / DeepSeek). Donne le flag CLI exact pour lm-eval / vLLM / transformers.",
1356
+ "template.title": "📜 Détecteur de Chat-template",
1357
+ "template.tip": "Collez un model id HF (ou tokenizer_config.json brut). Détecte la famille du chat-template et donne le commande exacte pour l'utiliser correctement. lm-eval-harness divise l'accuracy par 2 silencieusement si vous oubliez de l'appliquer (issue #1841).",
1358
+ "template.desc": "<strong>Avez-vous oublié <code>--apply_chat_template</code> ?</strong> La plupart des évals multi-tours échouent à ~50% parce que le chat template n'a pas été appliqué. Collez un model id, obtenez le flag CLI exact pour votre stack.",
1359
+ "template.id_label": "ID modèle HF :",
1360
+ "template.fetch_btn": "📜 Détecter",
1361
+ "template.paste_summary": "Ou collez tokenizer_config.json brut (modèles privés)",
1362
+ "template.paste_btn": "📜 Détecter config collé",
1363
+ "template.label.family": "Famille détectée",
1364
+ "template.label.markers": "Marqueurs correspondants",
1365
+ "template.label.tpl_len": "Longueur du template",
1366
+ "template.section.warnings": "Avertissements",
1367
+ "template.section.commands": "Commandes par framework",
1368
+ "template.section.raw": "Template brut (preview)",
1369
+ "template.family.custom": "custom (famille inconnue)",
1370
+ "template.family.none": "(pas de chat_template)",
1371
+ "template.verdict.ok": "✅ TEMPLATE DÉTECTÉ",
1372
+ "template.verdict.custom": "⚠ TEMPLATE CUSTOM",
1373
+ "template.verdict.missing": "❌ PAS DE CHAT TEMPLATE",
1374
+ "template.verdict.base_model": "ℹ MODÈLE DE BASE (sans chat)",
1375
+ "template.verdict.unknown": "❓ INCONNU",
1376
+ "template.warn.no_chat_template": "Pas de champ <code>chat_template</code> dans tokenizer_config.json. Typique des modèles base / pré-entraînés. Si vous attendiez un modèle instruct-tuned, le mauvais fichier peut être chargé.",
1377
+ "template.warn.custom_template": "Template non standard ({length} chars). L'outil n'a pas pu le faire correspondre aux familles connues. Inspectez le preview et vérifiez que votre framework d'éval le supporte.",
1378
+ "template.warn.lm_eval_apply": "<strong>lm-eval-harness :</strong> ajoutez <code>--apply_chat_template</code> ou votre accuracy chutera silencieusement de ~50% sur les évals multi-tours (issue #1841).",
1379
+ "template.warn.vllm_apply": "<strong>vLLM serve :</strong> vérifiez que <code>--chat-template</code> est défini (l'auto-détection échoue parfois sur les variantes fine-tunées). Suggéré : <code>{name}</code>.",
1380
+ "template.status.empty_id": "⚠ Saisissez un model id (ex. mistralai/Mistral-7B-Instruct-v0.3).",
1381
+ "template.status.fetching": "⏳ Récupération tokenizer_config.json pour {modelId}...",
1382
+ "template.status.success": "✅ {modelId} détecté (verdict : {verdict})",
1383
+ "template.status.empty_paste": "⚠ Collez d'abord un tokenizer_config.json.",
1384
+ "template.status.invalid_json":"❌ JSON invalide : {error}",
1385
+ "template.status.success_paste":"✅ Config collé détecté (verdict : {verdict})",
1386
+ "template.pasted_label": "(tokenizer_config collé)",
1387
  "share.import_desc": "Vous avez un fichier JSON de l'analyse TAF de quelqu'un ? Chargez-le ici pour voir le verdict + la chaîne localement. La même vue que si vous l'aviez exécuté vous-même.",
1388
  "share.import_btn": "📂 Charger JSON partagé",
1389
  "synthesis.system": "Vous êtes un assistant de diagnostic précis pour LLMs transformer. Étant donné des résultats de formules TAF pré-calculés, écrivez un résumé clair en français de 4-6 phrases. Citez le numéro de section (§X.Y) pour chaque nombre mentionné. Donnez toujours une recommandation concrète. N'INVENTEZ PAS de nombres.",
 
1476
  "common.no": "Non",
1477
 
1478
  // Tooltips des modes
1479
+ "modes.tip": "<strong>Neuf façons d'utiliser l'outil</strong>.<br><strong>📇 Profil</strong>: collez un id → TAF Card avec 5 recettes.<br><strong>🆚 Comparer</strong>: 2-3 modèles côte à côte sur une recette.<br><strong>🔍 Inspecter config</strong>: collez config.json brut → Profil complet.<br><strong>💬 Question</strong>: question libre, le LLM du navigateur choisit la recette.<br><strong>📋 Recette</strong>: sélection manuelle avec contrôle total du formulaire.<br><strong>🩺 Diagnostic CLI</strong>: génère commande Python pour mesurer γ localement.<br><strong>📊 Diagramme de phase</strong>: panel de 23 modèles dans le plan (log θ, γ).<br><strong>🪟 Démasquer</strong>: détecte un max_position_embeddings trompeur (SWA / YaRN / RoPE-scaling).<br><strong>📜 Chat-template</strong>: détecte la famille + donne le flag CLI exact pour lm-eval / vLLM / transformers.",
1480
  "profile.tip": "<strong>Diagnostic complet en un clic</strong>. Collez n'importe quel id de modèle HF (ou choisissez préréglage). L'outil exécute les 5 recettes (contexte long, compression KV, custom vs API, budget, hardware) et produit une <strong>TAF Card</strong> unique avec verdict par dimension + nombres clés + classification architecturale.<br><br><strong>Cas d'usage</strong>: « J'évalue Qwen2.5-32B pour la production — quel est son profil complet de viabilité ? » → collez id → Profiler → fait.",
1481
  "compare.tip": "<strong>Même recette, plusieurs modèles</strong>. Choisissez 2-3 modèles candidats et une recette. Voyez les verdicts dans un seul tableau comparatif.<br><br><strong>Cas d'usage</strong>: « J'ai besoin de récupération longue contexte à 16K — quel est le meilleur : Llama-3-8B, Mistral-7B ou Qwen-7B ? » → choisissez 3 + X-2 + 16K → voyez le gagnant.",
1482
 
 
1861
  "mode_desc.phase": "论文经验面板的 γ × θ 散点图。悬停点查看详情,点击加载到 Diagnose / Recipe 表单。",
1862
  "mode_desc.unmask": "检测 max_position_embeddings 是否误导(SWA / YaRN / RoPE 缩放)。粘贴 model id,1 行判定。",
1863
  "profile.preset_loaded": "✅ 已为 <strong>{id}</strong> 加载预设。表单已预填。(点击 📥 Fetch 用 HF Hub 最新 config 覆盖。)",
1864
+
1865
+ // v0.7.1 — anti-bullshit pack #2: Chat-template Sniffer
1866
+ "modes.template": "📜 Chat-template",
1867
+ "mode_desc.template": "检测模型使用的 chat-template 系列(Llama-3 / ChatML / Mistral / Gemma / Phi-3 / Alpaca / DeepSeek)。给出 lm-eval / vLLM / transformers 的精确 CLI flag。",
1868
+ "template.title": "📜 Chat-template 检测器",
1869
+ "template.tip": "粘贴 HF 模型 id(或原始 tokenizer_config.json)。检测 chat-template 系列并给出正确使用的精确框架命令。如果忘记应用,lm-eval-harness 会让 accuracy 静默对半(issue #1841)。",
1870
+ "template.desc": "<strong>忘了 <code>--apply_chat_template</code> 吗?</strong> 大多数 multi-turn eval 因为 chat template 未应用而失败 ~50%。粘贴 model id,获取你 stack 的精确 CLI flag。",
1871
+ "template.id_label": "HF 模型 id:",
1872
+ "template.fetch_btn": "📜 检测",
1873
+ "template.paste_summary": "或粘贴原始 tokenizer_config.json(私有模型)",
1874
+ "template.paste_btn": "📜 检测已粘贴 config",
1875
+ "template.label.family": "检测到的系列",
1876
+ "template.label.markers": "匹配的标记",
1877
+ "template.label.tpl_len": "Template 长度",
1878
+ "template.section.warnings": "警告",
1879
+ "template.section.commands": "各框架命令",
1880
+ "template.section.raw": "原始 template(预览)",
1881
+ "template.family.custom": "自定义(未知系列)",
1882
+ "template.family.none": "(无 chat_template)",
1883
+ "template.verdict.ok": "✅ 已检测到 TEMPLATE",
1884
+ "template.verdict.custom": "⚠ 自定义 TEMPLATE",
1885
+ "template.verdict.missing": "❌ 无 CHAT TEMPLATE",
1886
+ "template.verdict.base_model": "ℹ 基础模型(无 chat)",
1887
+ "template.verdict.unknown": "❓ 未知",
1888
+ "template.warn.no_chat_template": "tokenizer_config.json 中无 <code>chat_template</code> 字段。基础 / 仅预训练模型的典型情况。如果你期待 instruct-tuned 模型,可能加载了错误的文件。",
1889
+ "template.warn.custom_template": "非标准 template({length} 字符)。工具无法将其匹配到已知系列。检查下方预览并验证你的 eval 框架是否支持。",
1890
+ "template.warn.lm_eval_apply": "<strong>lm-eval-harness:</strong>添加 <code>--apply_chat_template</code>,否则 multi-turn eval 上 accuracy 会静默下降 ~50%(issue #1841)。",
1891
+ "template.warn.vllm_apply": "<strong>vLLM serve:</strong>验证 <code>--chat-template</code> 已设置(fine-tuned 变体的自动检测有时失败)。建议:<code>{name}</code>。",
1892
+ "template.status.empty_id": "⚠ 输入 model id(例如 mistralai/Mistral-7B-Instruct-v0.3)。",
1893
+ "template.status.fetching": "⏳ 正在获取 {modelId} 的 tokenizer_config.json...",
1894
+ "template.status.success": "✅ 已检测 {modelId}(判定:{verdict})",
1895
+ "template.status.empty_paste": "⚠ 请先粘贴 tokenizer_config.json。",
1896
+ "template.status.invalid_json":"❌ JSON 无效:{error}",
1897
+ "template.status.success_paste":"✅ 已检测粘贴的 config(判定:{verdict})",
1898
+ "template.pasted_label": "(已粘贴 tokenizer_config)",
1899
  "share.import_desc": "有他人 TAF 分析的 JSON 文件? 在这里加载以本地查看判定 + 链。与您自己运行的视图相同。",
1900
  "share.import_btn": "📂 加载共享的 JSON",
1901
  "synthesis.system": "您是 transformer LLM 的精确诊断助手。给定预先计算的 TAF 公式结果,用 4-6 句中文写出清晰的摘要。为每个提到的数字引用章节号 (§X.Y)。始终给出具体建议。不要编造数字。",
 
1988
  "common.no": "否",
1989
 
1990
  // 模式提示
1991
+ "modes.tip": "<strong>种使用方式</strong>。<br><strong>📇 画像</strong>: 粘贴模型 id → 5 个配方的 TAF 卡。<br><strong>🆚 比较</strong>: 2-3 个模型在一个配方上并排比较。<br><strong>🔍 检查 config</strong>: 粘贴原始 config.json → 完整画像。<br><strong>💬 提问</strong>: 自由形式问题,浏览器 LLM 选择配方。<br><strong>📋 配方</strong>: 手动选择,完全控制表单。<br><strong>🩺 CLI 诊断</strong>: 生成 Python 命令在本地测量 γ。<br><strong>📊 相图</strong>: 23 个面板模型在 (log θ, γ) 平面上。<br><strong>🪟 揭示</strong>: 检测误导的 max_position_embeddings(SWA / YaRN / RoPE 缩放)。<br><strong>📜 Chat-template</strong>: 检测系列 + 给出 lm-eval / vLLM / transformers 的精确 CLI flag。",
1992
  "profile.tip": "<strong>一键完整诊断</strong>。粘贴任意 HF 模型 id (或选择预设)。工具运行所有 5 个配方 (长上下文、KV 压缩、自定义 vs API、预算、硬件),生成单个 <strong>TAF 卡</strong>,显示每个维度的判定 + 关键数字 + 架构分类。<br><br><strong>用例</strong>: \"我正在为生产评估 Qwen2.5-32B — 它的完整可行性概况是什么?\" → 粘贴 id → 画像 → 完成。",
1993
  "compare.tip": "<strong>同一配方,多个模型</strong>。选择 2-3 个候选模型和一个配方。在单个比较表中查看判定。<br><br><strong>用例</strong>: \"我需要在 16K 进行长上下文检索 — 哪个最好: Llama-3-8B、Mistral-7B 或 Qwen-7B?\" → 选择 3 个 + X-2 + 16K → 看赢家。",
1994
 
js/main.js CHANGED
@@ -12,6 +12,7 @@ import { initPhaseDiagram } from "./phase_diagram.js";
12
  import { gammaCheckAll, REGIME_META } from "./gamma_check.js";
13
  import { loadLeanManifest, badgeHtml, badgesForUiBinding, renderTheoremTable, getManifest } from "./lean_badges.js";
14
  import { unmaskConfig } from "./swa_unmasker.js";
 
15
 
16
  const TAF_BROWSER_URL = "python/taf_browser.py";
17
  const ENABLE_WEBLLM = true;
@@ -186,7 +187,8 @@ document.querySelectorAll(".mode-btn").forEach(btn => {
186
  // Hide all mode sections
187
  ["ask-section", "recipe-section", "form-section",
188
  "profile-section", "compare-section", "inspector-section",
189
- "diagnose-section", "phase-section", "unmask-section"].forEach(id => {
 
190
  const el = $(id);
191
  if (el) el.style.display = "none";
192
  });
@@ -195,6 +197,7 @@ document.querySelectorAll(".mode-btn").forEach(btn => {
195
  ask: "ask-section", recipe: "recipe-section", profile: "profile-section",
196
  compare: "compare-section", inspector: "inspector-section",
197
  diagnose: "diagnose-section", phase: "phase-section", unmask: "unmask-section",
 
198
  };
199
  const sectionId = sectionMap[mode];
200
  if (sectionId) $(sectionId).style.display = "";
@@ -581,6 +584,161 @@ $("unmask-id")?.addEventListener("keydown", (e) => {
581
  if (e.key === "Enter") { e.preventDefault(); runUnmaskFromId(); }
582
  });
583
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
584
  function configToPreset(cfg, modelId) {
585
  const n_attn = cfg.num_attention_heads || cfg.n_head || 0;
586
  const n_kv = cfg.num_key_value_heads || cfg.num_attention_heads || cfg.n_head || 0;
 
12
  import { gammaCheckAll, REGIME_META } from "./gamma_check.js";
13
  import { loadLeanManifest, badgeHtml, badgesForUiBinding, renderTheoremTable, getManifest } from "./lean_badges.js";
14
  import { unmaskConfig } from "./swa_unmasker.js";
15
+ import { sniffChatTemplate } from "./chat_template_sniffer.js";
16
 
17
  const TAF_BROWSER_URL = "python/taf_browser.py";
18
  const ENABLE_WEBLLM = true;
 
187
  // Hide all mode sections
188
  ["ask-section", "recipe-section", "form-section",
189
  "profile-section", "compare-section", "inspector-section",
190
+ "diagnose-section", "phase-section", "unmask-section",
191
+ "template-section"].forEach(id => {
192
  const el = $(id);
193
  if (el) el.style.display = "none";
194
  });
 
197
  ask: "ask-section", recipe: "recipe-section", profile: "profile-section",
198
  compare: "compare-section", inspector: "inspector-section",
199
  diagnose: "diagnose-section", phase: "phase-section", unmask: "unmask-section",
200
+ template: "template-section",
201
  };
202
  const sectionId = sectionMap[mode];
203
  if (sectionId) $(sectionId).style.display = "";
 
584
  if (e.key === "Enter") { e.preventDefault(); runUnmaskFromId(); }
585
  });
586
 
587
// ════════════════════════════════════════════════════════════════════
// 📜 Chat-template Sniffer (v0.7.1 anti-bullshit pack #2)
// ════════════════════════════════════════════════════════════════════

// Accent color for the result card, keyed by the `verdict` code returned
// by sniffChatTemplate(). Used as border + headline color in renderTemplateCard().
const TEMPLATE_VERDICT_COLOR = {
  ok: "#3fb950",         // green  — known template family detected
  custom: "#f1c40f",     // yellow — template present but unrecognized
  missing: "#f85149",    // red    — no chat_template field at all
  base_model: "#8b949e", // gray   — base / pretrained model (no chat)
  unknown: "#8b949e",    // gray   — could not classify
};
598
+
599
/**
 * Download and parse tokenizer_config.json for a model from the HF Hub.
 * @param {string} modelId - HF model id, e.g. "mistralai/Mistral-7B-Instruct-v0.3".
 * @returns {Promise<object>} Parsed tokenizer_config.json.
 * @throws {Error} On gated models (401/403) or any other non-OK HTTP status.
 */
async function fetchHfTokenizerConfig(modelId) {
  const url = `https://huggingface.co/${modelId}/raw/main/tokenizer_config.json`;
  const response = await fetch(url);
  if (response.ok) {
    return response.json();
  }
  // 401/403 on the Hub almost always means a license gate, not a bad id.
  const gated = response.status === 401 || response.status === 403;
  if (gated) {
    throw new Error(`Model is gated (${response.status}). Accept license on HF Hub first.`);
  }
  throw new Error(`HTTP ${response.status} — tokenizer_config.json not found at ${url}`);
}
610
+
611
/**
 * Build the HTML result card for the Chat-template Sniffer.
 *
 * @param {object} result - Output of sniffChatTemplate(). Fields read here:
 *   verdict, detectedFamily, detectedLabel, matchedMarkers (array),
 *   warnings (array of {code, params}), hasChatTemplate (bool),
 *   vllmTemplate (string|falsy), rawTemplate (string|falsy), rawTemplateLength.
 * @param {string} [modelId=""] - Label shown on the card AND interpolated into
 *   the generated CLI commands.
 *   NOTE(review): the paste path passes a localized label like "(pasted config)"
 *   here, so it leaks into the generated commands — consider a separate display
 *   label; confirm intended.
 * @returns {string} HTML fragment for innerHTML injection; dynamic values are
 *   escaped via the local escapeHtml helper.
 */
function renderTemplateCard(result, modelId = "") {
  // Hero color by verdict; unknown gray for any unexpected verdict code.
  const color = TEMPLATE_VERDICT_COLOR[result.verdict] || TEMPLATE_VERDICT_COLOR.unknown;
  // Minimal HTML escaper for user-controlled text (model id, raw template, commands).
  const escapeHtml = (s) => String(s).replace(/[&<>"']/g, c =>
    ({"&":"&amp;","<":"&lt;",">":"&gt;",'"':"&quot;","'":"&#39;"}[c]));

  // Localized UI strings with English fallbacks (t() is falsy on missing keys).
  const verdictLabel = t(`template.verdict.${result.verdict}`) || result.verdict;
  const labelFamily = t("template.label.family") || "Detected family";
  const labelMarkers = t("template.label.markers") || "Matched markers";
  const labelTplLen = t("template.label.tpl_len") || "Template length";
  const sectionWarn = t("template.section.warnings") || "Warnings";
  const sectionCmd = t("template.section.commands") || "Commands by framework";
  const sectionRaw = t("template.section.raw") || "Raw template (preview)";

  // Human-readable family name: sniffer label if known, else localized
  // "custom" / "(no chat_template)" depending on detectedFamily.
  const familyName = result.detectedLabel
    ? result.detectedLabel
    : (result.detectedFamily === "custom" ? (t("template.family.custom") || "custom (unknown family)")
      : (t("template.family.none") || "(no chat_template)"));

  // Warnings panel (open by default); each warning code maps to an i18n key
  // "template.warn.<code>" formatted with its params. Empty string when none.
  const warningsHtml = result.warnings.length
    ? `<details class="unmask-panel" open>
         <summary class="unmask-panel-title">${sectionWarn}</summary>
         <ul>${result.warnings.map(w => `<li>${tFmt("template.warn." + w.code, w.params)}</li>`).join("")}</ul>
       </details>`
    : "";

  // Framework commands — only show when we have a chat_template to apply.
  let cmdHtml = "";
  if (result.hasChatTemplate) {
    // lm-eval-harness invocation with the critical --apply_chat_template flag.
    const lmEvalCmd = "lm_eval --model hf --model_args pretrained=" + (modelId || "MODEL_ID") +
      " --tasks gsm8k --apply_chat_template --batch_size 8";
    // vLLM: pass the named template when the sniffer suggested one, else rely
    // on vLLM's auto-detection from tokenizer_config.
    const vllmCmd = result.vllmTemplate
      ? `vllm serve ${modelId || "MODEL_ID"} --chat-template ${result.vllmTemplate}`
      : `vllm serve ${modelId || "MODEL_ID"} # template auto-detected from tokenizer_config`;
    // transformers: three-line Python snippet using apply_chat_template().
    const transformersCmd =
      `from transformers import AutoTokenizer\n` +
      `tok = AutoTokenizer.from_pretrained("${modelId || "MODEL_ID"}")\n` +
      `prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)`;

    cmdHtml = `
      <details class="unmask-panel" open>
        <summary class="unmask-panel-title">${sectionCmd}</summary>
        <div class="template-cmd-block">
          <div class="template-cmd-label">lm-evaluation-harness</div>
          <pre class="template-cmd"><code>${escapeHtml(lmEvalCmd)}</code></pre>
          <div class="template-cmd-label">vLLM serve</div>
          <pre class="template-cmd"><code>${escapeHtml(vllmCmd)}</code></pre>
          <div class="template-cmd-label">transformers (Python)</div>
          <pre class="template-cmd"><code>${escapeHtml(transformersCmd)}</code></pre>
        </div>
      </details>
    `;
  }

  // Raw preview only when present (collapsed by default).
  const rawHtml = result.rawTemplate
    ? `<details class="unmask-panel">
         <summary class="unmask-panel-title">${sectionRaw}</summary>
         <pre class="template-cmd"><code>${escapeHtml(result.rawTemplate)}</code></pre>
       </details>`
    : "";

  // Final card: hero (verdict + key numbers) + detail panels. Reuses the
  // unmask-* CSS classes from the v0.7.0 Unmask mode.
  return `
    <div class="unmask-result">
      <div class="unmask-hero" style="border-color: ${color};">
        <div class="unmask-verdict" style="color: ${color};">${verdictLabel}</div>
        ${modelId ? `<div class="unmask-model"><code>${escapeHtml(modelId)}</code></div>` : ""}
        <div class="unmask-numbers">
          <div><span class="unmask-num-label">${labelFamily}</span><span class="unmask-num-val">${escapeHtml(familyName)}</span></div>
          <div><span class="unmask-num-label">${labelMarkers}</span><span class="unmask-num-val">${result.matchedMarkers.length}</span></div>
          <div><span class="unmask-num-label">${labelTplLen}</span><span class="unmask-num-val">${result.rawTemplateLength.toLocaleString()}</span></div>
        </div>
      </div>

      <div class="unmask-details">
        ${warningsHtml}
        ${cmdHtml}
        ${rawHtml}
      </div>
    </div>
  `;
}
693
+
694
/**
 * Handler for the "Detect" button / Enter key: fetch tokenizer_config.json
 * from the HF Hub for the typed model id, classify its chat template, and
 * render the result card. Disables the fetch button while in flight.
 */
async function runTemplateFromId() {
  const statusEl = $("template-status");
  const outputEl = $("template-output");
  const fetchBtn = $("template-fetch-btn");

  const modelId = ($("template-id").value || "").trim();
  if (modelId === "") {
    statusEl.textContent = t("template.status.empty_id") || "⚠ Enter a model id.";
    return;
  }

  statusEl.textContent = tFmt("template.status.fetching", { modelId });
  fetchBtn.disabled = true;
  try {
    const tokenizerCfg = await fetchHfTokenizerConfig(modelId);
    const sniffed = sniffChatTemplate(tokenizerCfg);
    outputEl.innerHTML = renderTemplateCard(sniffed, modelId);
    const verdict = t(`template.verdict.${sniffed.verdict}`) || sniffed.verdict;
    statusEl.textContent = tFmt("template.status.success", { modelId, verdict });
  } catch (err) {
    // Network / gating / parse failures: surface the message, clear stale output.
    statusEl.textContent = `❌ ${err.message}`;
    outputEl.innerHTML = "";
  } finally {
    fetchBtn.disabled = false;
  }
}
715
+
716
/**
 * Handler for the "Detect pasted config" button: parse the pasted
 * tokenizer_config.json (private-model path, no network), classify its
 * chat template, and render the result card.
 */
function runTemplateFromPaste() {
  const statusEl = $("template-status");
  const outputEl = $("template-output");

  const raw = ($("template-paste").value || "").trim();
  if (raw === "") {
    statusEl.textContent = t("template.status.empty_paste") || "⚠ Paste a tokenizer_config.json first.";
    return;
  }

  let cfg;
  try {
    cfg = JSON.parse(raw);
  } catch (e) {
    statusEl.textContent = tFmt("template.status.invalid_json", { error: e.message });
    return;
  }

  const sniffed = sniffChatTemplate(cfg);
  const pastedLabel = t("template.pasted_label") || "(pasted config)";
  outputEl.innerHTML = renderTemplateCard(sniffed, pastedLabel);
  const verdict = t(`template.verdict.${sniffed.verdict}`) || sniffed.verdict;
  statusEl.textContent = tFmt("template.status.success_paste", { verdict });
}
735
+
736
+ $("template-fetch-btn")?.addEventListener("click", runTemplateFromId);
737
+ $("template-paste-btn")?.addEventListener("click", runTemplateFromPaste);
738
+ $("template-id")?.addEventListener("keydown", (e) => {
739
+ if (e.key === "Enter") { e.preventDefault(); runTemplateFromId(); }
740
+ });
741
+
742
  function configToPreset(cfg, modelId) {
743
  const n_attn = cfg.num_attention_heads || cfg.n_head || 0;
744
  const n_kv = cfg.num_key_value_heads || cfg.num_attention_heads || cfg.n_head || 0;
style.css CHANGED
@@ -33,6 +33,33 @@
33
  flex: 1;
34
  }
35
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
36
  /* v0.7.0 — Unmask mode (SWA + RoPE-scaling detector) */
37
  .unmask-result {
38
  margin-top: 0.8em;
 
33
  flex: 1;
34
  }
35
 
36
/* v0.7.1 — Chat-template Sniffer mode */

/* Vertical stack of the per-framework command snippets inside the card. */
.template-cmd-block {
  display: flex;
  flex-direction: column;
  gap: 0.5em;
}

/* Small uppercase caption above each command (the framework name). */
.template-cmd-label {
  font-size: 0.78em;
  font-weight: 600;
  color: #58a6ff;
  text-transform: uppercase;
  letter-spacing: 0.04em;
  margin-top: 0.4em;
}

/* The copy-pasteable command block: wraps long lines (pre-wrap) while
   still allowing horizontal scroll for unbreakable tokens. */
.template-cmd {
  margin: 0;
  padding: 0.6em 0.8em;
  background: rgba(0, 0, 0, 0.35);
  border: 1px solid rgba(255, 255, 255, 0.06);
  border-radius: 6px;
  font-family: monospace;
  font-size: 0.85em;
  line-height: 1.45;
  white-space: pre-wrap;
  overflow-x: auto;
}
62
+
63
  /* v0.7.0 — Unmask mode (SWA + RoPE-scaling detector) */
64
  .unmask-result {
65
  margin-top: 0.8em;