v0.8.4 Prompt-Cache Diff Predictor — anti-bullshit pack #10
Provider prompt caches each have different rules:
- Anthropic `cache_control` breaks at first token diff in marked prefix
- OpenAI auto-caches prefixes ≥1024 tokens; invalidates on any change
- Gemini context cache requires ≥32K tokens
A misplaced edit silently 10x's the bill — the API never warns, and the
cost only shows up on the next invoice. No public tool predicts this.
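The three rules above reduce to a per-provider prefix floor plus a marker requirement. A minimal sketch in the spirit of the shipped module — names are illustrative, not the real `js/prompt_cache_diff.js` API, and the Anthropic floor is left at 0 because it isn't stated here:

```javascript
// Illustrative provider table: the 1024 / 32K floors come from the rules above.
const PROVIDER_RULES = [
  { id: "anthropic", minPrefixTokens: 0,     needsMarker: true  }, // cache_control marker; breaks at first token diff
  { id: "openai",    minPrefixTokens: 1024,  needsMarker: false }, // auto-caches prefixes ≥1024 tokens
  { id: "gemini",    minPrefixTokens: 32768, needsMarker: false }, // context cache requires ≥32K tokens
];

// The cacheable span is the shared prefix up to the first differing token;
// it counts only if it clears the provider's floor, otherwise nothing caches.
function cachedTokens(commonPrefixTokens, rule) {
  return commonPrefixTokens >= rule.minPrefixTokens ? commonPrefixTokens : 0;
}
```

This is why a short prompt can show a healthy hit for Anthropic while OpenAI and Gemini both report zero.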
🔁 Cache Diff (18th mode):
- Two textareas: paste old + new prompt
- Tokenizer profile selector (English / code / CJK), since shipping
  a real BPE tokenizer in the browser would mean 5-10MB of WASM. The
  chars-per-token heuristic is robust to estimator drift because cache
  savings are a RATIO, not absolute counts.
- Output: per-provider table (Claude Opus 4.7 / Sonnet 4.6 / Haiku
4.5 / GPT-5 / GPT-5 mini / Gemini 2.5 Pro) with hit ratio,
base→cached cost, savings $ + %, TTL note, marker requirement.
- Anthropic 25% write surcharge surfaced as separate row so users
see the amortization picture, not just the steady-state savings.
- Diff visualization: green common prefix + red divergent suffix
side-by-side with first-difference line number.
- Three examples: 99% hit (small Q&A edit) / cache busted (system
prompt edit) / below OpenAI min (short prompt).
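The core prediction behind the modes above is small enough to sketch inline: longest common prefix, chars-per-token estimate, hit ratio. This is an illustrative reimplementation, not the shipped `js/prompt_cache_diff.js`:

```javascript
// Chars-per-token divisors matching the three UI profiles.
const CHARS_PER_TOKEN = { english: 4, code: 3.5, cjk: 2 };

function predict(oldPrompt, newPrompt, profile = "english") {
  // Longest common prefix, in characters.
  let i = 0;
  while (i < oldPrompt.length && i < newPrompt.length &&
         oldPrompt[i] === newPrompt[i]) i++;
  const divisor = CHARS_PER_TOKEN[profile];
  const commonTokens = Math.floor(i / divisor);
  const totalTokens  = Math.max(1, Math.ceil(newPrompt.length / divisor));
  return {
    commonTokens,
    totalTokens,
    hitRatio: commonTokens / totalTokens, // a RATIO, so estimator drift cancels
    firstDiffLine: oldPrompt.slice(0, i).split("\n").length,
  };
}
```

A one-character edit at the front drives `commonTokens` to 0 (cache busted); the same edit at the tail barely moves the ratio — exactly the asymmetry the example buttons demonstrate.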
Pure logic in `js/prompt_cache_diff.js` (codes + params, no human
strings); main.js renders with i18n. 41 i18n keys × 4 langs (EN/ES/FR/
ZH) = 164 keys, parity clean. Help modal v0.8.4 entry + Inventory
anti-bullshit-pack list + "Set up an eval correctly" task tile.
Pricing snapshot 2026-01 baked in with explicit "verify against current
docs" disclaimer in the attribution footer.
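The cost columns are plain arithmetic over the snapshot prices. A hedged sketch with PLACEHOLDER per-token prices (illustrative numbers, not the 2026-01 snapshot values) showing both the steady-state saving and the separate Anthropic write-surcharge row:

```javascript
// Base cost vs cached cost for one request, plus the one-time cache-write
// surcharge Anthropic bills on the request that populates the cache.
function costDelta({ commonTokens, totalTokens, outputTokens },
                   { inPrice, cachedReadPrice, cacheWriteSurcharge = 0, outPrice }) {
  const base   = totalTokens * inPrice + outputTokens * outPrice;
  const cached = commonTokens * cachedReadPrice
               + (totalTokens - commonTokens) * inPrice
               + outputTokens * outPrice;
  return {
    base,
    cached,
    savingsPct: 100 * (base - cached) / base,
    // Surfaced as its own row so the amortization picture is visible.
    firstRequestExtra: commonTokens * inPrice * cacheWriteSurcharge,
  };
}
```

With a fully cached 2K-token prefix, cached reads at a tenth of the input price, and a 25% write surcharge, the steady-state saving is 90% of input cost while the first request costs 25% extra — which is why the surcharge gets its own row.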
Source citations:
- https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching
- https://platform.openai.com/docs/guides/prompt-caching
- https://ai.google.dev/gemini-api/docs/caching
Verified: 5/5 logic cases (identical / small edit / front edit /
below-min / empty) + cost-arithmetic sanity (Anthropic 42% savings on
2K-tok prefix, OpenAI 30%, Gemini correctly rejects below-32K) +
164/164 i18n parity + headless e2e (tab/section/3 examples, providers
visible, below-min note rendered). 19 mode tabs total.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- index.html +46 -0
- js/i18n.js +172 -0
- js/main.js +199 -1
- js/prompt_cache_diff.js +308 -0
--- a/index.html
+++ b/index.html
@@ -222,6 +222,9 @@
 <p><strong data-i18n="help.v083.peft.title">🔧 PEFT Anti-Pattern Checker</strong></p>
 <p data-i18n="help.v083.peft.body">PEFT's <code>get_peft_model(base, config)</code> creates a FRESH adapter — it does not load saved weights from a path. Users who paste tutorial code and try to resume from a checkpoint silently throw away their training. peft #2115 has the canonical bug report. This linter scans your training script for the pattern + 3 related issues (QLoRA ordering, target_modules/arch mismatch, lora_alpha ratio) and reports findings with line numbers and suggested fixes. <em>Use case</em>: before you launch a 10-hour LoRA fine-tune, paste your script — catch the silent bugs in 200ms.</p>
 
+<p><strong data-i18n="help.v084.cache.title">🔁 Prompt-Cache Diff Predictor</strong></p>
+<p data-i18n="help.v084.cache.body">Provider prompt caches each have different rules: Anthropic's <code>cache_control</code> breaks at the first token diff in the marked prefix; OpenAI auto-caches prefixes ≥1024 tokens; Gemini context caches require ≥32K tokens. A misplaced edit silently 10x's your bill — the API never warns you, and the cost only shows up on the next invoice. Paste old + new prompt, the predictor finds the longest common prefix, estimates tokens with three tokenizer profiles (English / code / CJK), and shows per-provider hit ratio + $ delta vs no-cache for Claude Opus/Sonnet/Haiku, GPT-5/mini, and Gemini 2.5 Pro. <em>Use case</em>: 'I tweaked the system prompt and the bill jumped — what broke?' → paste both prompts, see exactly which provider stopped caching.</p>
+
 <p><strong data-i18n="help.v081.hub.title">🧭 Solutions Hub</strong></p>
 <p data-i18n="help.v081.hub.body">tafagent as integrator, not silo. 30+ pains across 7 categories (eval reliability · diagnostics · setup · training · retrieval · multimodal · observability), each mapped to (a) the tafagent mode that addresses it, if any, and (b) the best-of-breed external tools the community already trusts (RAGAS, MTEB, HELM, MCP Schema Validator, llm-stats, llguidance, GlitchMiner, etc.). Search box matches across pain, scenario, and tool name. <em>Use case</em>: 'I have problem X — does tafagent solve it, and if not, who does?'</p>
 
@@ -336,6 +339,7 @@
 <li data-i18n="inv.v08.saturation"><strong>📈 Saturation</strong> — is your benchmark still useful, or are all frontier models tied at the top?</li>
 <li data-i18n="inv.v082.cot"><strong>📋 JSON CoT</strong> — lints structured-output schemas for the answer-before-reasoning anti-pattern that silently breaks Chain-of-Thought.</li>
 <li data-i18n="inv.v083.peft"><strong>🔧 PEFT Lint</strong> — catches the silent <code>get_peft_model</code> base-load (peft #2115) + QLoRA order + target_modules / arch mismatch.</li>
+<li data-i18n="inv.v084.cache"><strong>🔁 Cache Diff</strong> — predicts whether a prompt edit invalidated the provider's prompt cache. Per-provider hit ratio + $ delta.</li>
 <li data-i18n="inv.v081.hub"><strong>🧭 Solutions Hub</strong> — every documented pain mapped to a tafagent mode or curated external tool. Don't reinvent — find.</li>
 </ul>
 </details>
@@ -409,6 +413,7 @@
 <button data-mode-link="diagnose" data-i18n="modes.diagnose">🩺 Diagnose CLI</button>
 <button data-mode-link="cot" data-i18n="modes.cot">📋 JSON CoT</button>
 <button data-mode-link="peft" data-i18n="modes.peft">🔧 PEFT Lint</button>
+<button data-mode-link="cache" data-i18n="modes.cache">🔁 Cache Diff</button>
 </div>
 </div>
 <div class="task-tile">
@@ -467,6 +472,7 @@
 <button class="mode-btn" data-mode="saturation" role="tab" aria-selected="false" data-i18n="modes.saturation">📈 Saturation</button>
 <button class="mode-btn" data-mode="cot" role="tab" aria-selected="false" data-i18n="modes.cot">📋 JSON CoT</button>
 <button class="mode-btn" data-mode="peft" role="tab" aria-selected="false" data-i18n="modes.peft">🔧 PEFT Lint</button>
+<button class="mode-btn" data-mode="cache" role="tab" aria-selected="false" data-i18n="modes.cache">🔁 Cache Diff</button>
 <button class="mode-btn" data-mode="hub" role="tab" aria-selected="false" data-i18n="modes.hub">🧭 Solutions</button>
 </div>
 <p id="mode-desc" class="recipe-desc" data-i18n="modes.desc">
@@ -1061,6 +1067,46 @@
 <div id="peft-output" style="margin-top: 1em;"></div>
 </section>
 
+<!-- Prompt-Cache Diff Predictor (mode=cache, v0.8.4 anti-bullshit pack #10) -->
+<section id="cache-section" style="display:none;">
+<h2><span data-i18n="cache.title">🔁 Prompt-Cache Diff Predictor</span>
+<span class="info"><span class="tooltip" data-i18n="cache.tip">
+<strong>Why this matters</strong>: Anthropic's <code>cache_control</code> cache breaks at the first token diff in the marked prefix. OpenAI auto-caches prefixes ≥1024 tokens but invalidates on any change. Gemini context cache requires ≥32K tokens. A misplaced edit silently 10x's your bill — and the API never warns you. Paste old + new prompt, see per-provider hit ratio + cost delta.
+</span></span>
+</h2>
+<p class="recipe-desc" data-i18n="cache.desc">
+<strong>Don't 10x your bill on a one-character edit.</strong> Paste your previous and current prompt — the predictor finds the longest common prefix, estimates tokens, and shows per-provider cache hit ratio + $ delta vs no-cache.
+</p>
+<div class="form-row" style="display:flex; gap:1em; flex-wrap:wrap;">
+<div style="flex:1; min-width:300px;">
+<label for="cache-old" data-i18n="cache.old_label">Old prompt:</label>
+<textarea id="cache-old" rows="10" style="width:100%;font-family:monospace;font-size:0.85em;" data-i18n-placeholder="cache.old.placeholder" placeholder="You are a helpful assistant. …"></textarea>
+</div>
+<div style="flex:1; min-width:300px;">
+<label for="cache-new" data-i18n="cache.new_label">New prompt:</label>
+<textarea id="cache-new" rows="10" style="width:100%;font-family:monospace;font-size:0.85em;" data-i18n-placeholder="cache.new.placeholder" placeholder="You are a helpful assistant. …"></textarea>
+</div>
+</div>
+<div class="form-row">
+<label for="cache-profile" data-i18n="cache.profile_label">Tokenizer profile:</label>
+<select id="cache-profile">
+<option value="english" data-i18n="cache.profile.english">English (chars/4)</option>
+<option value="code" data-i18n="cache.profile.code">Code (chars/3.5)</option>
+<option value="mixed" data-i18n="cache.profile.mixed">CJK / Cyrillic (chars/2)</option>
+</select>
+<label for="cache-output-tokens" data-i18n="cache.output_label">Estimated output tokens:</label>
+<input type="number" id="cache-output-tokens" value="500" min="0" max="100000" style="width:8em;" />
+</div>
+<div class="form-row">
+<button type="button" id="cache-diff-btn" data-i18n="cache.diff_btn">🔍 Predict</button>
+<button type="button" id="cache-example-good-btn" class="secondary" data-i18n="cache.example_good_btn">↳ Example: 99% hit</button>
+<button type="button" id="cache-example-broken-btn" class="secondary" data-i18n="cache.example_broken_btn">↳ Example: cache busted</button>
+<button type="button" id="cache-example-belowmin-btn" class="secondary" data-i18n="cache.example_belowmin_btn">↳ Example: below OpenAI min</button>
+</div>
+<p id="cache-status" class="recipe-desc" style="font-size:0.92em;"></p>
+<div id="cache-output" style="margin-top: 1em;"></div>
+</section>
+
 <section id="hub-section" style="display:none;">
 <h2><span data-i18n="hub.title">🧭 Solutions Hub</span>
 <span class="info"><span class="tooltip" data-i18n="hub.tip">
@@ -594,6 +594,49 @@ export const TRANSLATIONS = {
|
|
| 594 |
"help.v083.peft.title": "🔧 PEFT Anti-Pattern Checker",
|
| 595 |
"help.v083.peft.body": "PEFT's <code>get_peft_model(base, config)</code> creates a FRESH adapter — it does not load saved weights from a path. Users who paste tutorial code and try to resume from a checkpoint silently throw away their training. peft #2115 has the canonical bug report. This linter scans your training script for the pattern + 3 related issues (QLoRA ordering, target_modules/arch mismatch, lora_alpha ratio) and reports findings with line numbers and suggested fixes. <em>Use case</em>: before you launch a 10-hour LoRA fine-tune, paste your script — catch the silent bugs in 200ms.",
|
| 596 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 597 |
"inv.v081.hub": "<strong>🧭 Solutions Hub</strong> — every documented pain mapped to a tafagent mode or curated external tool. Don't reinvent — find.",
|
| 598 |
"help.v081.hub.title": "🧭 Solutions Hub",
|
| 599 |
"help.v081.hub.body": "tafagent as integrator, not silo. 30+ pains across 7 categories (eval reliability · diagnostics · setup · training · retrieval · multimodal · observability), each mapped to (a) the tafagent mode that addresses it, if any, and (b) the best-of-breed external tools the community already trusts (RAGAS, MTEB, HELM, MCP Schema Validator, llm-stats, llguidance, GlitchMiner, etc.). Search box matches across pain, scenario, and tool name. <em>Use case</em>: 'I have problem X — does tafagent solve it, and if not, who does?'",
|
|
@@ -1647,6 +1690,49 @@ export const TRANSLATIONS = {
|
|
| 1647 |
"help.v083.peft.title": "🔧 Verificador de anti-patrones PEFT",
|
| 1648 |
"help.v083.peft.body": "El <code>get_peft_model(base, config)</code> de PEFT crea un adapter NUEVO — no carga pesos guardados desde una ruta. Quien pega código de tutorial e intenta reanudar desde un checkpoint tira silenciosamente su entrenamiento. peft #2115 tiene el bug report canónico. Este linter escanea tu script buscando el patrón + 3 issues relacionados (orden QLoRA, mismatch target_modules/arch, ratio lora_alpha) y reporta hallazgos con números de línea y sugerencias. <em>Caso de uso</em>: antes de lanzar un fine-tune LoRA de 10 horas, pega tu script — atrapa los bugs silenciosos en 200ms.",
|
| 1649 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1650 |
"inv.v081.hub": "<strong>🧭 Solutions Hub</strong> — cada pain documentado mapeado a un mode tafagent o herramienta externa curada. No reinventes — encuentra.",
|
| 1651 |
"help.v081.hub.title": "🧭 Solutions Hub",
|
| 1652 |
"help.v081.hub.body": "tafagent como integrador, no silo. 30+ pains en 7 categorías (eval reliability · diagnósticos · setup · training · retrieval · multimodal · observability), cada uno mapeado a (a) el mode tafagent que lo resuelve, si existe, y (b) las herramientas externas best-of-breed que la comunidad ya usa (RAGAS, MTEB, HELM, MCP Schema Validator, llm-stats, llguidance, GlitchMiner, etc.). Caja de búsqueda matchea pain, scenario, y nombre de herramienta. <em>Caso de uso</em>: 'tengo problema X — ¿lo resuelve tafagent, y si no, quién?'",
|
|
@@ -2564,6 +2650,49 @@ export const TRANSLATIONS = {
|
|
| 2564 |
"help.v083.peft.title": "🔧 Vérificateur d'anti-patterns PEFT",
|
| 2565 |
"help.v083.peft.body": "Le <code>get_peft_model(base, config)</code> de PEFT crée un NOUVEL adaptateur — il ne charge pas les poids sauvegardés depuis un chemin. Quiconque colle du code de tuto et essaie de reprendre depuis un checkpoint jette silencieusement son entraînement. peft #2115 contient le bug report canonique. Ce linter scanne votre script à la recherche du pattern + 3 problèmes liés (ordre QLoRA, mismatch target_modules/arch, ratio lora_alpha) et rapporte les découvertes avec numéros de ligne et corrections suggérées. <em>Cas d'usage</em> : avant de lancer un fine-tune LoRA de 10 heures, collez votre script — attrapez les bugs silencieux en 200ms.",
|
| 2566 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 2567 |
"inv.v081.hub": "<strong>🧭 Solutions Hub</strong> — chaque pain documenté mappé à un mode tafagent ou outil externe curé. Ne réinventez pas — trouvez.",
|
| 2568 |
"help.v081.hub.title": "🧭 Solutions Hub",
|
| 2569 |
"help.v081.hub.body": "tafagent comme intégrateur, pas silo. 30+ pains à travers 7 catégories (eval reliability · diagnostics · setup · training · retrieval · multimodal · observability), chacun mappé à (a) le mode tafagent qui le résout, s'il existe, et (b) les outils externes best-of-breed que la communauté utilise déjà (RAGAS, MTEB, HELM, MCP Schema Validator, llm-stats, llguidance, GlitchMiner, etc.). La barre de recherche matche pain, scénario, et nom d'outil. <em>Cas d'usage</em> : 'j'ai le problème X — tafagent le résout-il, et sinon, qui ?'",
|
|
@@ -3481,6 +3610,49 @@ export const TRANSLATIONS = {
|
|
| 3481 |
"help.v083.peft.title": "🔧 PEFT 反模式检查器",
|
| 3482 |
"help.v083.peft.body": "PEFT 的 <code>get_peft_model(base, config)</code> 创建一个新的 adapter——它不从路径加载已保存的权重。粘贴教程代码并尝试从 checkpoint 恢复的人会静默地丢掉训练。peft #2115 是规范的 bug 报告。这个 linter 扫描你的脚本查找该模式 + 3 个相关问题(QLoRA 顺序、target_modules/架构不匹配、lora_alpha 比率),并报告带行号和建议修复的发现。<em>用例</em>:在启动 10 小时的 LoRA fine-tune 之前,粘贴你的脚本——在 200ms 内捕获静默 bug。",
|
| 3483 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 3484 |
"inv.v081.hub": "<strong>🧭 Solutions Hub</strong> — 每个文档化的问题都映射到一个 tafagent 模式或精选外部工具。别重复发明 — 去找。",
|
| 3485 |
"help.v081.hub.title": "🧭 Solutions Hub",
|
| 3486 |
"help.v081.hub.body": "tafagent 作为集成者而非孤岛。30+ 问题跨 7 类别(评估可靠性 · 诊断 · 设置 · 训练 · 检索 · 多模态 · 可观测性),每个映射到(a)解决它的 tafagent 模式(若存在),以及(b)社区已信任的最佳外部工具(RAGAS、MTEB、HELM、MCP Schema Validator、llm-stats、llguidance、GlitchMiner 等)。搜索框匹配 pain、场景和工具名称。<em>用例</em>:'我有问题 X — tafagent 解决它吗,如果不,谁解决?'",
|
|
|
|
| 594 |
"help.v083.peft.title": "🔧 PEFT Anti-Pattern Checker",
|
| 595 |
"help.v083.peft.body": "PEFT's <code>get_peft_model(base, config)</code> creates a FRESH adapter — it does not load saved weights from a path. Users who paste tutorial code and try to resume from a checkpoint silently throw away their training. peft #2115 has the canonical bug report. This linter scans your training script for the pattern + 3 related issues (QLoRA ordering, target_modules/arch mismatch, lora_alpha ratio) and reports findings with line numbers and suggested fixes. <em>Use case</em>: before you launch a 10-hour LoRA fine-tune, paste your script — catch the silent bugs in 200ms.",
|
| 596 |
|
| 597 |
+
// v0.8.4 — anti-bullshit pack #10: Prompt-Cache Diff Predictor
|
| 598 |
+
"modes.cache": "🔁 Cache Diff",
|
| 599 |
+
"mode_desc.cache": "Predicts whether a prompt edit kept the provider's prompt cache alive or invalidated it. Per-provider hit ratio + $ delta vs no-cache.",
|
| 600 |
+
"cache.title": "🔁 Prompt-Cache Diff Predictor",
|
| 601 |
+
"cache.tip": "Anthropic's <code>cache_control</code> cache breaks at the first token diff in the marked prefix. OpenAI auto-caches prefixes ≥1024 tokens but invalidates on any change. Gemini context cache requires ≥32K tokens. A misplaced edit silently 10x's your bill — and the API never warns you. Paste old + new prompt, see per-provider hit ratio + cost delta.",
|
| 602 |
+
"cache.desc": "<strong>Don't 10x your bill on a one-character edit.</strong> Paste your previous and current prompt — the predictor finds the longest common prefix, estimates tokens, and shows per-provider cache hit ratio + $ delta vs no-cache.",
|
| 603 |
+
"cache.old_label": "Old prompt:",
|
| 604 |
+
"cache.new_label": "New prompt:",
|
| 605 |
+
"cache.old.placeholder": "You are a helpful assistant. …",
|
| 606 |
+
"cache.new.placeholder": "You are a helpful assistant. …",
|
| 607 |
+
"cache.profile_label": "Tokenizer profile:",
|
| 608 |
+
"cache.profile.english": "English (chars/4)",
|
| 609 |
+
"cache.profile.code": "Code (chars/3.5)",
|
| 610 |
+
"cache.profile.mixed": "CJK / Cyrillic (chars/2)",
|
| 611 |
+
"cache.output_label": "Estimated output tokens:",
|
| 612 |
+
"cache.diff_btn": "🔍 Predict",
|
| 613 |
+
"cache.example_good_btn": "↳ Example: 99% hit",
|
| 614 |
+
"cache.example_broken_btn": "↳ Example: cache busted",
|
| 615 |
+
"cache.example_belowmin_btn": "↳ Example: below OpenAI min",
|
| 616 |
+
"cache.status.done": "✅ {verdict} — {hit}% theoretical hit",
|
| 617 |
+
"cache.verdict.identical": "✅ Identical — full cache hit",
|
| 618 |
+
"cache.verdict.divergent_can_cache":"⚠ Partial cache hit — providers vary",
|
| 619 |
+
"cache.verdict.divergent_below_min":"❌ Below all provider minimums — no caching possible",
|
| 620 |
+
"cache.verdict.fully_divergent": "❌ Fully divergent — cache invalidated",
|
| 621 |
+
"cache.verdict.empty_input": "ℹ Empty input",
|
| 622 |
+
"cache.summary.tokens": "Common prefix {common} / {total} tokens ({pct}% theoretical hit ratio).",
|
| 623 |
+
"cache.summary.diff_at": "First difference at line {line}.",
|
| 624 |
+
"cache.col.provider": "Provider",
|
| 625 |
+
"cache.col.hit": "Hit",
|
| 626 |
+
"cache.col.cost": "Base → cached",
|
| 627 |
+
"cache.col.savings": "Savings",
|
| 628 |
+
"cache.note.requires_marker": "(requires cache_control marker)",
|
| 629 |
+
"cache.note.below_min": "(prefix < {min} tokens — provider min)",
|
| 630 |
+
"cache.write_surcharge": "+ {cost} cache-write surcharge first time (Anthropic)",
|
| 631 |
+
"cache.diff.title": "Where the cache breaks",
|
| 632 |
+
"cache.diff.legend": "Green = shared prefix (cacheable). Red = first edit (everything from here is re-billed).",
|
| 633 |
+
"cache.hint.empty": "Paste two prompts, then Predict.",
|
| 634 |
+
"cache.attribution": "Refs:",
|
| 635 |
+
"cache.attribution.snapshot": "Prices snapshot 2026-01; verify against current provider docs before acting on $.",
|
| 636 |
+
"inv.v084.cache": "<strong>🔁 Cache Diff</strong> — predicts whether a prompt edit invalidated the provider's prompt cache. Per-provider hit ratio + $ delta.",
|
| 637 |
+
"help.v084.cache.title": "🔁 Prompt-Cache Diff Predictor",
|
| 638 |
+
"help.v084.cache.body": "Provider prompt caches each have different rules: Anthropic's <code>cache_control</code> breaks at the first token diff in the marked prefix; OpenAI auto-caches prefixes ≥1024 tokens; Gemini context caches require ≥32K tokens. A misplaced edit silently 10x's your bill — the API never warns you, and the cost only shows up on the next invoice. Paste old + new prompt, the predictor finds the longest common prefix, estimates tokens with three tokenizer profiles (English / code / CJK), and shows per-provider hit ratio + $ delta vs no-cache for Claude Opus/Sonnet/Haiku, GPT-5/mini, and Gemini 2.5 Pro. <em>Use case</em>: 'I tweaked the system prompt and the bill jumped — what broke?' → paste both prompts, see exactly which provider stopped caching.",
|
| 639 |
+
|
| 640 |
"inv.v081.hub": "<strong>🧭 Solutions Hub</strong> — every documented pain mapped to a tafagent mode or curated external tool. Don't reinvent — find.",
|
| 641 |
"help.v081.hub.title": "🧭 Solutions Hub",
|
| 642 |
"help.v081.hub.body": "tafagent as integrator, not silo. 30+ pains across 7 categories (eval reliability · diagnostics · setup · training · retrieval · multimodal · observability), each mapped to (a) the tafagent mode that addresses it, if any, and (b) the best-of-breed external tools the community already trusts (RAGAS, MTEB, HELM, MCP Schema Validator, llm-stats, llguidance, GlitchMiner, etc.). Search box matches across pain, scenario, and tool name. <em>Use case</em>: 'I have problem X — does tafagent solve it, and if not, who does?'",
|
|
|
|
| 1690 |
"help.v083.peft.title": "🔧 Verificador de anti-patrones PEFT",
|
| 1691 |
"help.v083.peft.body": "El <code>get_peft_model(base, config)</code> de PEFT crea un adapter NUEVO — no carga pesos guardados desde una ruta. Quien pega código de tutorial e intenta reanudar desde un checkpoint tira silenciosamente su entrenamiento. peft #2115 tiene el bug report canónico. Este linter escanea tu script buscando el patrón + 3 issues relacionados (orden QLoRA, mismatch target_modules/arch, ratio lora_alpha) y reporta hallazgos con números de línea y sugerencias. <em>Caso de uso</em>: antes de lanzar un fine-tune LoRA de 10 horas, pega tu script — atrapa los bugs silenciosos en 200ms.",
|
| 1692 |
|
| 1693 |
+
// v0.8.4 — anti-bullshit pack #10: Prompt-Cache Diff Predictor
|
| 1694 |
+
"modes.cache": "🔁 Cache Diff",
|
| 1695 |
+
"mode_desc.cache": "Predice si una edición del prompt mantuvo viva la prompt cache del proveedor o la invalidó. Hit ratio por proveedor + delta $ vs sin caché.",
|
| 1696 |
+
"cache.title": "🔁 Predictor de Diff de Prompt-Cache",
|
| 1697 |
+
"cache.tip": "El <code>cache_control</code> de Anthropic se rompe al primer token diferente del prefijo marcado. OpenAI auto-cachea prefijos ≥1024 tokens pero invalida ante cualquier cambio. La context cache de Gemini requiere ≥32K tokens. Una edición mal puesta silenciosamente 10x tu factura — y la API nunca avisa. Pega prompt viejo + nuevo, ve el hit ratio por proveedor + delta de coste.",
|
| 1698 |
+
"cache.desc": "<strong>No 10x tu factura por un edit de un carácter.</strong> Pega tu prompt anterior y el actual — el predictor halla el prefijo común más largo, estima tokens, y muestra hit ratio por proveedor + delta $ vs sin caché.",
|
| 1699 |
+
"cache.old_label": "Prompt viejo:",
|
| 1700 |
+
"cache.new_label": "Prompt nuevo:",
|
| 1701 |
+
"cache.old.placeholder": "Eres un asistente útil. …",
|
| 1702 |
+
"cache.new.placeholder": "Eres un asistente útil. …",
|
| 1703 |
+
"cache.profile_label": "Perfil de tokenizer:",
|
| 1704 |
+
"cache.profile.english": "Inglés (chars/4)",
|
| 1705 |
+
"cache.profile.code": "Código (chars/3.5)",
|
| 1706 |
+
"cache.profile.mixed": "CJK / Cirílico (chars/2)",
|
| 1707 |
+
"cache.output_label": "Tokens de salida estimados:",
|
| 1708 |
+
"cache.diff_btn": "🔍 Predecir",
|
| 1709 |
+
"cache.example_good_btn": "↳ Ejemplo: hit 99%",
|
| 1710 |
+
"cache.example_broken_btn": "↳ Ejemplo: caché rota",
|
| 1711 |
+
"cache.example_belowmin_btn": "↳ Ejemplo: bajo mínimo OpenAI",
|
| 1712 |
+
"cache.status.done": "✅ {verdict} — {hit}% hit teórico",
|
| 1713 |
+
"cache.verdict.identical": "✅ Idénticos — hit completo",
|
| 1714 |
+
"cache.verdict.divergent_can_cache":"⚠ Hit parcial — varía por proveedor",
|
| 1715 |
+
"cache.verdict.divergent_below_min":"❌ Por debajo de mínimos — no hay caché posible",
|
| 1716 |
+
"cache.verdict.fully_divergent": "❌ Totalmente divergentes — caché invalidada",
|
| 1717 |
+
"cache.verdict.empty_input": "ℹ Entrada vacía",
|
| 1718 |
+
"cache.summary.tokens": "Prefijo común {common} / {total} tokens ({pct}% hit ratio teórico).",
|
| 1719 |
+
"cache.summary.diff_at": "Primera diferencia en la línea {line}.",
|
| 1720 |
+
"cache.col.provider": "Proveedor",
|
| 1721 |
+
"cache.col.hit": "Hit",
|
| 1722 |
+
"cache.col.cost": "Base → cached",
|
| 1723 |
+
"cache.col.savings": "Ahorro",
|
| 1724 |
+
"cache.note.requires_marker": "(requiere marcador cache_control)",
|
| 1725 |
+
"cache.note.below_min": "(prefijo < {min} tokens — mínimo del proveedor)",
|
| 1726 |
+
"cache.write_surcharge": "+ {cost} sobrecargo de cache-write la primera vez (Anthropic)",
|
| 1727 |
+
"cache.diff.title": "Dónde se rompe la caché",
|
| 1728 |
+
"cache.diff.legend": "Verde = prefijo compartido (cacheable). Rojo = primera edición (todo desde aquí se re-factura).",
|
| 1729 |
+
"cache.hint.empty": "Pega dos prompts, luego Predecir.",
|
| 1730 |
+
"cache.attribution": "Referencias:",
|
| 1731 |
+
"cache.attribution.snapshot": "Precios snapshot 2026-01; verifica con la doc actual del proveedor antes de actuar sobre $.",
|
| 1732 |
+
"inv.v084.cache": "<strong>🔁 Cache Diff</strong> — predice si un edit del prompt invalidó la prompt cache del proveedor. Hit ratio por proveedor + delta $.",
|
| 1733 |
+
"help.v084.cache.title": "🔁 Predictor de Diff de Prompt-Cache",
|
| 1734 |
+
"help.v084.cache.body": "Las prompt caches de cada proveedor tienen reglas distintas: el <code>cache_control</code> de Anthropic se rompe al primer token diferente del prefijo marcado; OpenAI auto-cachea prefijos ≥1024 tokens; las context caches de Gemini requieren ≥32K tokens. Una edición mal puesta silenciosamente 10x tu factura — la API no avisa, y el coste solo aparece en la siguiente factura. Pega prompt viejo + nuevo, el predictor halla el prefijo común más largo, estima tokens con tres perfiles de tokenizer (inglés / código / CJK), y muestra hit ratio por proveedor + delta $ vs sin caché para Claude Opus/Sonnet/Haiku, GPT-5/mini, y Gemini 2.5 Pro. <em>Caso de uso</em>: 'Tweaké el system prompt y la factura saltó — ¿qué se rompió?' → pega ambos prompts, ve exactamente qué proveedor dejó de cachear.",
|
| 1735 |
+
|
| 1736 |
"inv.v081.hub": "<strong>🧭 Solutions Hub</strong> — cada pain documentado mapeado a un mode tafagent o herramienta externa curada. No reinventes — encuentra.",
|
| 1737 |
"help.v081.hub.title": "🧭 Solutions Hub",
|
| 1738 |
"help.v081.hub.body": "tafagent como integrador, no silo. 30+ pains en 7 categorías (eval reliability · diagnósticos · setup · training · retrieval · multimodal · observability), cada uno mapeado a (a) el mode tafagent que lo resuelve, si existe, y (b) las herramientas externas best-of-breed que la comunidad ya usa (RAGAS, MTEB, HELM, MCP Schema Validator, llm-stats, llguidance, GlitchMiner, etc.). Caja de búsqueda matchea pain, scenario, y nombre de herramienta. <em>Caso de uso</em>: 'tengo problema X — ¿lo resuelve tafagent, y si no, quién?'",
|
|
|
|
 "help.v083.peft.title": "🔧 Vérificateur d'anti-patterns PEFT",
 "help.v083.peft.body": "Le <code>get_peft_model(base, config)</code> de PEFT crée un NOUVEL adaptateur — il ne charge pas les poids sauvegardés depuis un chemin. Quiconque colle du code de tuto et essaie de reprendre depuis un checkpoint jette silencieusement son entraînement. peft #2115 contient le bug report canonique. Ce linter scanne votre script à la recherche du pattern + 3 problèmes liés (ordre QLoRA, mismatch target_modules/arch, ratio lora_alpha) et rapporte les découvertes avec numéros de ligne et corrections suggérées. <em>Cas d'usage</em> : avant de lancer un fine-tune LoRA de 10 heures, collez votre script — attrapez les bugs silencieux en 200ms.",

+// v0.8.4 — anti-bullshit pack #10: Prompt-Cache Diff Predictor
+"modes.cache": "🔁 Cache Diff",
+"mode_desc.cache": "Prédit si une édition du prompt a gardé le cache prompt du fournisseur vivant ou l'a invalidé. Taux de hit par fournisseur + delta $ vs sans cache.",
+"cache.title": "🔁 Prédicteur de Diff Prompt-Cache",
+"cache.tip": "Le <code>cache_control</code> d'Anthropic casse au premier token différent du préfixe marqué. OpenAI auto-cache les préfixes ≥1024 tokens mais invalide à tout changement. Le context cache Gemini requiert ≥32K tokens. Une édition mal placée 10x silencieusement votre facture — et l'API ne prévient jamais. Collez ancien + nouveau prompt, voyez le taux de hit par fournisseur + delta de coût.",
+"cache.desc": "<strong>Ne 10x pas votre facture sur une édition d'un caractère.</strong> Collez votre prompt précédent et actuel — le prédicteur trouve le plus long préfixe commun, estime les tokens, et montre le taux de hit par fournisseur + delta $ vs sans cache.",
+"cache.old_label": "Ancien prompt :",
+"cache.new_label": "Nouveau prompt :",
+"cache.old.placeholder": "Vous êtes un assistant utile. …",
+"cache.new.placeholder": "Vous êtes un assistant utile. …",
+"cache.profile_label": "Profil de tokenizer :",
+"cache.profile.english": "Anglais (chars/4)",
+"cache.profile.code": "Code (chars/3.5)",
+"cache.profile.mixed": "CJK / Cyrillique (chars/2)",
+"cache.output_label": "Tokens de sortie estimés :",
+"cache.diff_btn": "🔍 Prédire",
+"cache.example_good_btn": "↳ Exemple : 99% hit",
+"cache.example_broken_btn": "↳ Exemple : cache cassé",
+"cache.example_belowmin_btn": "↳ Exemple : sous le minimum OpenAI",
+"cache.status.done": "✅ {verdict} — {hit}% hit théorique",
+"cache.verdict.identical": "✅ Identiques — hit complet",
+"cache.verdict.divergent_can_cache": "⚠ Hit partiel — varie selon fournisseur",
+"cache.verdict.divergent_below_min": "❌ En dessous des minimums — pas de cache possible",
+"cache.verdict.fully_divergent": "❌ Totalement divergents — cache invalidé",
+"cache.verdict.empty_input": "ℹ Entrée vide",
+"cache.summary.tokens": "Préfixe commun {common} / {total} tokens (taux de hit théorique {pct}%).",
+"cache.summary.diff_at": "Première différence à la ligne {line}.",
+"cache.col.provider": "Fournisseur",
+"cache.col.hit": "Hit",
+"cache.col.cost": "Base → cached",
+"cache.col.savings": "Économies",
+"cache.note.requires_marker": "(nécessite le marqueur cache_control)",
+"cache.note.below_min": "(préfixe < {min} tokens — min du fournisseur)",
+"cache.write_surcharge": "+ {cost} surcharge cache-write la première fois (Anthropic)",
+"cache.diff.title": "Où le cache casse",
+"cache.diff.legend": "Vert = préfixe partagé (cacheable). Rouge = première édition (tout à partir d'ici est re-facturé).",
+"cache.hint.empty": "Collez deux prompts, puis Prédire.",
+"cache.attribution": "Réfs :",
+"cache.attribution.snapshot": "Prix snapshot 2026-01 ; vérifiez avec la doc actuelle du fournisseur avant d'agir sur $.",
+"inv.v084.cache": "<strong>🔁 Cache Diff</strong> — prédit si une édition du prompt a invalidé le cache prompt du fournisseur. Taux de hit par fournisseur + delta $.",
+"help.v084.cache.title": "🔁 Prédicteur de Diff Prompt-Cache",
+"help.v084.cache.body": "Les caches prompt de chaque fournisseur ont des règles différentes : le <code>cache_control</code> d'Anthropic casse au premier token différent du préfixe marqué ; OpenAI auto-cache les préfixes ≥1024 tokens ; les context caches Gemini requièrent ≥32K tokens. Une édition mal placée 10x silencieusement votre facture — l'API ne prévient pas, et le coût n'apparaît qu'à la facture suivante. Collez ancien + nouveau prompt, le prédicteur trouve le plus long préfixe commun, estime les tokens avec trois profils de tokenizer (anglais / code / CJK), et montre le taux de hit par fournisseur + delta $ vs sans cache pour Claude Opus/Sonnet/Haiku, GPT-5/mini, et Gemini 2.5 Pro. <em>Cas d'usage</em> : 'J'ai modifié le system prompt et la facture a sauté — qu'est-ce qui a cassé ?' → collez les deux prompts, voyez exactement quel fournisseur a arrêté de cacher.",
+
 "inv.v081.hub": "<strong>🧭 Solutions Hub</strong> — chaque pain documenté mappé à un mode tafagent ou outil externe curé. Ne réinventez pas — trouvez.",
 "help.v081.hub.title": "🧭 Solutions Hub",
 "help.v081.hub.body": "tafagent comme intégrateur, pas silo. 30+ pains à travers 7 catégories (eval reliability · diagnostics · setup · training · retrieval · multimodal · observability), chacun mappé à (a) le mode tafagent qui le résout, s'il existe, et (b) les outils externes best-of-breed que la communauté utilise déjà (RAGAS, MTEB, HELM, MCP Schema Validator, llm-stats, llguidance, GlitchMiner, etc.). La barre de recherche matche pain, scénario, et nom d'outil. <em>Cas d'usage</em> : 'j'ai le problème X — tafagent le résout-il, et sinon, qui ?'",
 "help.v083.peft.title": "🔧 PEFT 反模式检查器",
 "help.v083.peft.body": "PEFT 的 <code>get_peft_model(base, config)</code> 创建一个新的 adapter——它不从路径加载已保存的权重。粘贴教程代码并尝试从 checkpoint 恢复的人会静默地丢掉训练。peft #2115 是规范的 bug 报告。这个 linter 扫描你的脚本查找该模式 + 3 个相关问题(QLoRA 顺序、target_modules/架构不匹配、lora_alpha 比率),并报告带行号和建议修复的发现。<em>用例</em>:在启动 10 小时的 LoRA fine-tune 之前,粘贴你的脚本——在 200ms 内捕获静默 bug。",

+// v0.8.4 — anti-bullshit pack #10: Prompt-Cache Diff Predictor
+"modes.cache": "🔁 缓存差异",
+"mode_desc.cache": "预测 prompt 编辑是否保留了提供商的 prompt cache 还是使其失效。每个提供商的命中率 + 与无缓存的 $ 差额。",
+"cache.title": "🔁 Prompt-Cache 差异预测器",
+"cache.tip": "Anthropic 的 <code>cache_control</code> 缓存在标记前缀的第一个 token 差异处中断。OpenAI 自动缓存 ≥1024 token 的前缀,但任何更改都会使其失效。Gemini context cache 需要 ≥32K token。位置不当的编辑会悄悄使你的账单 10 倍——API 永远不会警告。粘贴新旧 prompt,查看每个提供商的命中率 + 成本差额。",
+"cache.desc": "<strong>不要因一个字符的编辑使账单 10 倍。</strong> 粘贴你之前和当前的 prompt——预测器找到最长公共前缀,估算 token,并显示每个提供商的命中率 + 与无缓存的 $ 差额。",
+"cache.old_label": "旧 prompt:",
+"cache.new_label": "新 prompt:",
+"cache.old.placeholder": "你是一个有帮助的助手。…",
+"cache.new.placeholder": "你是一个有帮助的助手。…",
+"cache.profile_label": "Tokenizer 配置:",
+"cache.profile.english": "英语(chars/4)",
+"cache.profile.code": "代码(chars/3.5)",
+"cache.profile.mixed": "中日韩 / 西里尔(chars/2)",
+"cache.output_label": "估计输出 token:",
+"cache.diff_btn": "🔍 预测",
+"cache.example_good_btn": "↳ 示例:99% 命中",
+"cache.example_broken_btn": "↳ 示例:缓存失效",
+"cache.example_belowmin_btn": "↳ 示例:低于 OpenAI 最小值",
+"cache.status.done": "✅ {verdict} — {hit}% 理论命中",
+"cache.verdict.identical": "✅ 完全相同——完整命中",
+"cache.verdict.divergent_can_cache": "⚠ 部分命中——按提供商不同",
+"cache.verdict.divergent_below_min": "❌ 低于所有提供商最小值——无法缓存",
+"cache.verdict.fully_divergent": "❌ 完全不同——缓存失效",
+"cache.verdict.empty_input": "ℹ 空输入",
+"cache.summary.tokens": "公共前缀 {common} / {total} token({pct}% 理论命中率)。",
+"cache.summary.diff_at": "第一个差异在第 {line} 行。",
+"cache.col.provider": "提供商",
+"cache.col.hit": "命中",
+"cache.col.cost": "基础 → 缓存",
+"cache.col.savings": "节省",
+"cache.note.requires_marker": "(需要 cache_control 标记)",
+"cache.note.below_min": "(前缀 < {min} token——提供商最小值)",
+"cache.write_surcharge": "+ {cost} 首次缓存写入附加费(Anthropic)",
+"cache.diff.title": "缓存在哪里中断",
+"cache.diff.legend": "绿色 = 共享前缀(可缓存)。红色 = 首次编辑(从这里开始全部重新计费)。",
+"cache.hint.empty": "粘贴两个 prompt,然后预测。",
+"cache.attribution": "参考:",
+"cache.attribution.snapshot": "价格快照 2026-01;在按 $ 行动前请用提供商当前文档验证。",
+"inv.v084.cache": "<strong>🔁 缓存差异</strong> — 预测 prompt 编辑是否使提供商的 prompt cache 失效。每个提供商的命中率 + $ 差额。",
+"help.v084.cache.title": "🔁 Prompt-Cache 差异预测器",
+"help.v084.cache.body": "每个提供商的 prompt cache 有不同规则:Anthropic 的 <code>cache_control</code> 在标记前缀的第一个 token 差异处中断;OpenAI 自动缓存 ≥1024 token 的前缀;Gemini context cache 需要 ≥32K token。位置不当的编辑会悄悄使你的账单 10 倍——API 不会警告,成本只在下张账单上出现。粘贴新旧 prompt,预测器找到最长公共前缀,用三种 tokenizer 配置(英语/代码/CJK)估算 token,并显示每个提供商的命中率 + 与无缓存的 $ 差额,包括 Claude Opus/Sonnet/Haiku、GPT-5/mini 和 Gemini 2.5 Pro。<em>用例</em>:『我调整了 system prompt 后账单暴涨——什么坏了?』→ 粘贴两个 prompt,看到底哪个提供商停止缓存。",
+
 "inv.v081.hub": "<strong>🧭 Solutions Hub</strong> — 每个文档化的问题都映射到一个 tafagent 模式或精选外部工具。别重复发明 — 去找。",
 "help.v081.hub.title": "🧭 Solutions Hub",
 "help.v081.hub.body": "tafagent 作为集成者而非孤岛。30+ 问题跨 7 类别(评估可靠性 · 诊断 · 设置 · 训练 · 检索 · 多模态 · 可观测性),每个映射到(a)解决它的 tafagent 模式(若存在),以及(b)社区已信任的最佳外部工具(RAGAS、MTEB、HELM、MCP Schema Validator、llm-stats、llguidance、GlitchMiner 等)。搜索框匹配 pain、场景和工具名称。<em>用例</em>:'我有问题 X — tafagent 解决它吗,如果不,谁解决?'",
@@ -29,6 +29,7 @@ import {
 } from "./solutions_hub.js";
 import { lintJsonCot, reorderJsonText, classifyFieldName } from "./json_cot_linter.js";
 import { lintPeftCode, ARCH_TARGET_MODULES } from "./peft_anti_pattern.js";
+import { diffPromptCache, PROVIDERS as CACHE_PROVIDERS } from "./prompt_cache_diff.js";

 // Attach HF Hub search-as-you-type to all 5 model id inputs (Profile, Recipe,
 // Unmask, Template, Quant). Hits public huggingface.co/api/models. Idempotent.
@@ -220,6 +221,7 @@ document.addEventListener("click", (e) => {
     saturation: "saturation-section",
     cot: "cot-section",
     peft: "peft-section",
+    cache: "cache-section",
     hub: "hub-section",
   }[targetMode];
   if (sectionId) {
@@ -245,7 +247,7 @@ document.querySelectorAll(".mode-btn").forEach(btn => {
     "diagnose-section", "phase-section", "unmask-section",
     "template-section", "arena-section", "contam-section",
     "quant-section", "drift-section", "niah-section",
-    "saturation-section", "cot-section", "peft-section", "hub-section"].forEach(id => {
+    "saturation-section", "cot-section", "peft-section", "cache-section", "hub-section"].forEach(id => {
     const el = $(id);
     if (el) el.style.display = "none";
   });
@@ -259,6 +261,7 @@ document.querySelectorAll(".mode-btn").forEach(btn => {
     saturation: "saturation-section",
     cot: "cot-section",
     peft: "peft-section",
+    cache: "cache-section",
     hub: "hub-section",
   };
   const sectionId = sectionMap[mode];
@@ -268,6 +271,7 @@ document.querySelectorAll(".mode-btn").forEach(btn => {
     if (mode === "saturation") initSaturation();
     if (mode === "cot") initCot();
     if (mode === "peft") initPeft();
+    if (mode === "cache") initCacheDiff();
     if (mode === "hub") initHub();
   });
 });
@@ -3712,6 +3716,200 @@ $("peft-example-clean-btn")?.addEventListener("click", () => {
   runPeftLint();
 });

+// ════════════════════════════════════════════════════════════════════
+// 🔁 Prompt-Cache Diff Predictor (v0.8.4 anti-bullshit pack #10)
+// ════════════════════════════════════════════════════════════════════
+const CACHE_VERDICT_BG = {
+  identical: "#3fb950",
+  divergent_can_cache: "#d29922",
+  divergent_below_min: "#f0883e",
+  fully_divergent: "#f85149",
+  empty_input: "#8b949e",
+};
+
+let __cacheInited = false;
+
+function initCacheDiff() {
+  if (__cacheInited) return;
+  __cacheInited = true;
+  // No-op (no async data); placeholder kept for symmetry.
+}
+
+function fmtUsd(n) {
+  if (n == null || isNaN(n)) return "—";
+  if (n === 0) return "$0";
+  if (n < 0.01) return `$${n.toFixed(6)}`;
+  if (n < 1) return `$${n.toFixed(4)}`;
+  return `$${n.toFixed(2)}`;
+}
+
+function fmtPct(n) {
+  if (n == null || isNaN(n)) return "—";
+  return `${Math.round(n * 100)}%`;
+}
+
+function renderCacheProvider(p) {
+  const bgRow = p.reason === "below_min" ? "#21262d" : "#161b22";
+  const noteHtml = [];
+  if (p.requires_explicit && p.reason !== "below_min") {
+    noteHtml.push(`<span class="subtle" style="font-size:0.8em;">${t("cache.note.requires_marker") || "(requires cache_control marker)"}</span>`);
+  }
+  if (p.reason === "below_min") {
+    noteHtml.push(`<span class="subtle" style="font-size:0.8em;color:#f0883e;">${tFmt("cache.note.below_min", { min: p.min_cache_tokens.toLocaleString() }) || `(prefix < ${p.min_cache_tokens.toLocaleString()} tokens — provider min)`}</span>`);
+  }
+  const noteCell = noteHtml.length ? `<br>${noteHtml.join(" ")}` : "";
+
+  const ttlMin = p.cache_ttl_seconds >= 3600
+    ? `${Math.round(p.cache_ttl_seconds / 3600)}h`
+    : `${Math.round(p.cache_ttl_seconds / 60)}min`;
+
+  const savingsColor = p.savings_usd > 0 ? "#3fb950" : (p.reason ? "#8b949e" : "#d29922");
+  const writeRow = p.cache_write_surcharge_usd && p.cache_write_surcharge_usd > 0
+    ? `<tr style="background:${bgRow};"><td colspan="4" class="subtle" style="font-size:0.8em;padding-left:1em;">${tFmt("cache.write_surcharge", { cost: fmtUsd(p.cache_write_surcharge_usd) }) || `+ ${fmtUsd(p.cache_write_surcharge_usd)} cache-write surcharge first time (Anthropic)`}</td></tr>`
+    : "";
+
+  return `
+    <tr style="background:${bgRow};">
+      <td><strong>${escapeHtml(p.provider_name)}</strong>${noteCell}<br><span class="subtle" style="font-size:0.78em;">TTL ${ttlMin}</span></td>
+      <td style="text-align:right;">${fmtPct(p.hit_ratio)}</td>
+      <td style="text-align:right;">${fmtUsd(p.base_cost_usd)} → ${fmtUsd(p.cached_cost_usd)}</td>
+      <td style="text-align:right;color:${savingsColor};"><strong>${fmtUsd(p.savings_usd)}</strong> (${fmtPct(p.savings_pct ?? 0)})</td>
+    </tr>
+    ${writeRow}
+  `;
+}
+
+function renderCacheDiffVisualization(oldText, newText, lcpChars) {
+  // Truncate context — show last 200 chars of common prefix, and the
+  // first 200 chars of each diverging suffix. Keeps UI tight.
+  const ctxBefore = 200;
+  const startCommon = Math.max(0, lcpChars - ctxBefore);
+  const commonTail = oldText.slice(startCommon, lcpChars);
+  const oldDiv = oldText.slice(lcpChars);
+  const newDiv = newText.slice(lcpChars);
+  const commonLeader = startCommon > 0 ? "…" : "";
+
+  return `
+    <details style="margin-top:1em;">
+      <summary style="cursor:pointer;"><strong>${t("cache.diff.title") || "Where the cache breaks"}</strong></summary>
+      <div style="background:#0d1117;padding:0.75em;border-radius:4px;font-family:monospace;font-size:0.85em;line-height:1.4;overflow-x:auto;white-space:pre-wrap;">
+<span style="color:#3fb950;">${escapeHtml(commonLeader + commonTail)}</span><span style="color:#f85149;text-decoration:underline;">${escapeHtml(oldDiv.slice(0, 200))}</span><span class="subtle"> ← old</span>
+<span style="color:#3fb950;">${escapeHtml(commonLeader + commonTail)}</span><span style="color:#3fb950;text-decoration:underline;">${escapeHtml(newDiv.slice(0, 200))}</span><span class="subtle"> ← new</span>
+      </div>
+      <p class="subtle" style="font-size:0.82em;">${t("cache.diff.legend") || "Green = shared prefix (cacheable). Red = first edit (everything from here is re-billed)."}</p>
+    </details>
+  `;
+}
+
+function renderCacheResult(result, oldText, newText) {
+  const verdict = t(`cache.verdict.${result.code}`) || result.code;
+  const verdictBg = CACHE_VERDICT_BG[result.code] || "#8b949e";
+  const verdictBadge = `<span class="badge" style="background:${verdictBg};">${verdict}</span>`;
+
+  if (result.code === "empty_input") {
+    return `<div class="arena-result">
+      <p style="font-size:1.1em;">${verdictBadge}</p>
+      <p class="recipe-desc">${t("cache.hint.empty") || "Paste two prompts, then Predict."}</p>
+    </div>`;
+  }
+
+  const p = result.params;
+  const summary = `
+    <p class="recipe-desc">
+      ${tFmt("cache.summary.tokens", { common: p.tokens_common.toLocaleString(), total: p.tokens_total.toLocaleString(), pct: Math.round(p.hit_ratio * 100) })
+        || `Common prefix ${p.tokens_common.toLocaleString()} / ${p.tokens_total.toLocaleString()} tokens (${Math.round(p.hit_ratio * 100)}% theoretical hit ratio).`}
+    </p>
+    <p class="recipe-desc subtle">
+      ${tFmt("cache.summary.diff_at", { line: p.diff_point.line }) || `First difference at line ${p.diff_point.line}.`}
+    </p>
+  `;
+
+  const rows = (result.providers || []).map(renderCacheProvider).join("");
+  const table = rows ? `
+    <table class="lean-table" style="margin-top:1em;width:100%;">
+      <thead><tr>
+        <th style="text-align:left;">${t("cache.col.provider") || "Provider"}</th>
+        <th style="text-align:right;">${t("cache.col.hit") || "Hit"}</th>
+        <th style="text-align:right;">${t("cache.col.cost") || "Base → cached"}</th>
+        <th style="text-align:right;">${t("cache.col.savings") || "Savings"}</th>
+      </tr></thead>
+      <tbody>${rows}</tbody>
+    </table>
+  ` : "";
+
+  const diffViz = result.code !== "identical"
+    ? renderCacheDiffVisualization(oldText, newText, p.lcp_chars)
+    : "";
+
+  const attribution = `
+    <p class="recipe-desc subtle" style="font-size:0.82em;margin-top:1em;">
+      ${t("cache.attribution") || "Refs:"}
+      <a href="https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching" target="_blank" rel="noopener noreferrer">Anthropic prompt caching</a> ·
+      <a href="https://platform.openai.com/docs/guides/prompt-caching" target="_blank" rel="noopener noreferrer">OpenAI prompt caching</a> ·
+      <a href="https://ai.google.dev/gemini-api/docs/caching" target="_blank" rel="noopener noreferrer">Gemini context caching</a>
+      <br><em>${t("cache.attribution.snapshot") || "Prices snapshot 2026-01; verify against current provider docs before acting on $."}</em>
+    </p>
+  `;
+
+  return `<div class="arena-result">
+    <p style="font-size:1.1em;">${verdictBadge}</p>
+    ${summary}
+    ${table}
+    ${diffViz}
+    ${attribution}
+  </div>`;
+}
+
+function runCacheDiff() {
+  const oldText = $("cache-old")?.value || "";
+  const newText = $("cache-new")?.value || "";
+  const profile = $("cache-profile")?.value || "english";
+  const outputTokens = parseInt($("cache-output-tokens")?.value || "500", 10);
+
+  const result = diffPromptCache(oldText, newText, {
+    profile,
+    outputTokensEstimate: outputTokens,
+  });
+  $("cache-output").innerHTML = renderCacheResult(result, oldText, newText);
+  $("cache-status").textContent = tFmt("cache.status.done", {
+    verdict: t(`cache.verdict.${result.code}`) || result.code,
+    hit: Math.round((result.params?.hit_ratio || 0) * 100),
+  });
+}
+
+const CACHE_LONG_SYS = "You are a helpful, harmless, and honest assistant. " +
+  "Always cite your sources. ".repeat(40) +
+  "Always show your reasoning step by step. ".repeat(40) +
+  "Be concise. Format code with backticks. ".repeat(40) +
+  "\n\nUser tools available:\n- search\n- calculator\n- code_runner\n";
+
+const CACHE_EXAMPLE_GOOD_OLD = CACHE_LONG_SYS + "\nUser: What is 2 + 2?";
+const CACHE_EXAMPLE_GOOD_NEW = CACHE_LONG_SYS + "\nUser: What is 2 + 3?";
+
+const CACHE_EXAMPLE_BROKEN_OLD = CACHE_LONG_SYS.replace("helpful, harmless, and honest", "helpful AND honest")
+  + "\nUser: What is 2 + 2?";
+const CACHE_EXAMPLE_BROKEN_NEW = CACHE_LONG_SYS + "\nUser: What is 2 + 2?";
+
+const CACHE_EXAMPLE_BELOWMIN_OLD = "Q: name 3 colors";
+const CACHE_EXAMPLE_BELOWMIN_NEW = "Q: name 4 colors";
+
+$("cache-diff-btn")?.addEventListener("click", runCacheDiff);
+$("cache-example-good-btn")?.addEventListener("click", () => {
+  $("cache-old").value = CACHE_EXAMPLE_GOOD_OLD;
+  $("cache-new").value = CACHE_EXAMPLE_GOOD_NEW;
+  runCacheDiff();
+});
+$("cache-example-broken-btn")?.addEventListener("click", () => {
+  $("cache-old").value = CACHE_EXAMPLE_BROKEN_OLD;
+  $("cache-new").value = CACHE_EXAMPLE_BROKEN_NEW;
+  runCacheDiff();
+});
+$("cache-example-belowmin-btn")?.addEventListener("click", () => {
+  $("cache-old").value = CACHE_EXAMPLE_BELOWMIN_OLD;
+  $("cache-new").value = CACHE_EXAMPLE_BELOWMIN_NEW;
+  runCacheDiff();
+});
+
 // ════════════════════════════════════════════════════════════════════
 // Bootstrap
 // ════════════════════════════════════════════════════════════════════
@@ -0,0 +1,308 @@
+// Prompt-Cache Diff Predictor (v0.8.4 anti-bullshit pack #10)
+//
+// Pain: small prompt edits silently invalidate provider prompt caches,
+// turning a 50% discount into a 0% discount and 10x'ing the bill.
+// Users debug this blind because:
+//   - Anthropic's `cache_control` cache breaks at the first token diff
+//     in the marked prefix (TTL 5 min default, 1 hour beta).
+//   - OpenAI auto-caches prefixes ≥1024 tokens but invalidates on any
+//     prefix change; the 50% read discount only applies on hit.
+//   - Gemini's context cache requires explicit creation, ≥32K tokens,
+//     and any prefix edit forces a new cache.
+//
+// Tool: paste old + new prompt → compute longest common prefix in
+// tokens → predict per-provider cache hit ratio + $ delta vs no-cache.
+//
+// Pure logic — no human strings; main.js does i18n. Returns
+// {code, params, providers: [{provider_id, ...}]}.
+
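The pipeline the header describes (longest common prefix in chars → tokens → theoretical hit ratio) can be sketched in a few lines. A minimal standalone sketch, assuming the chars-per-token heuristic used in this file; `diffPromptCache`'s real return shape is richer (verdict code, per-provider rows):

```javascript
// Longest common prefix, measured in characters.
function longestCommonPrefixChars(a, b) {
  const n = Math.min(a.length, b.length);
  let i = 0;
  while (i < n && a[i] === b[i]) i++;
  return i;
}

// Theoretical cache hit ratio: tokens in the shared prefix over tokens
// in the new prompt, using a flat chars-per-token divisor.
function theoreticalHitRatio(oldText, newText, charsPerToken = 4.0) {
  const lcp = longestCommonPrefixChars(oldText, newText);
  const tokensCommon = Math.ceil(lcp / charsPerToken);
  const tokensTotal = Math.ceil(newText.length / charsPerToken);
  return tokensTotal === 0 ? 0 : tokensCommon / tokensTotal;
}
```

A one-character edit at the very end of a 4000-char prompt keeps the ratio near 1; the same edit at the front drives it to 0, which is the whole point of the predictor.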
+// =============================================================================
+// Token estimation — heuristic, browser-only
+// =============================================================================
+//
+// Real tokenizers vary by ±15% between Llama / GPT / Claude / Qwen and
+// running them in-browser would mean shipping a 5-10 MB WASM blob. For a
+// cache-diff predictor the absolute count doesn't matter — what matters
+// is the RATIO of common-prefix to divergent-suffix tokens, which is
+// robust to estimator choice. The three profiles below cover 95% of
+// real prompts; users with extreme cases can paste pre-tokenized counts.
+const TOKEN_PROFILES = {
+  english: { chars_per_token: 4.0, label_key: "cache.profile.english" },
+  code:    { chars_per_token: 3.5, label_key: "cache.profile.code" },
+  mixed:   { chars_per_token: 2.0, label_key: "cache.profile.mixed" }, // CJK / Cyrillic
+};
+
+export function estimateTokens(text, profile = "english") {
+  if (typeof text !== "string" || !text) return 0;
+  const cpt = TOKEN_PROFILES[profile]?.chars_per_token ?? 4.0;
+  return Math.ceil(text.length / cpt);
+}
+
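The robustness claim in the comment above is easy to demonstrate: absolute token counts move with the divisor, but the prefix/total ratio barely does. A quick standalone check (the `estimate` arrow here is a local copy of the heuristic, not the exported `estimateTokens`):

```javascript
// Char-per-token heuristic, parameterized by the divisor.
const estimate = (text, cpt) => Math.ceil(text.length / cpt);

const prefix = "a".repeat(4000);            // shared system prompt
const total  = prefix + "b".repeat(1000);   // plus a divergent user turn

// Same ratio (≈ 0.800) under all three profile divisors, even though the
// absolute counts range from 1250 to 2500 tokens.
for (const cpt of [4.0, 3.5, 2.0]) {
  const ratio = estimate(prefix, cpt) / estimate(total, cpt);
  console.log(cpt, ratio.toFixed(3)); // prints 0.800 for each divisor
}
```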
+// =============================================================================
+// Provider rules — pricing + cache mechanics
+// =============================================================================
+//
+// Prices are USD per million tokens, snapshot 2026-01 (knowledge cutoff).
+// `cache_read_multiplier` is the fraction of input price billed on a
+// cache hit (Anthropic 0.10 = 10%; OpenAI/Gemini 0.50 = 50%; etc).
+// `cache_write_multiplier` accounts for Anthropic's 25% write surcharge
+// the first time a prefix is seen.
+//
+// `min_cache_tokens` is the floor below which the provider cannot cache
+// (OpenAI auto-cache requires ≥1024; Gemini context cache ≥32K).
+// Anthropic has no min token floor but requires explicit cache_control
+// marker — we treat that as min=0 with a `requires_explicit` flag for UI.
+
export const PROVIDERS = {
  anthropic_opus: {
    name: "Claude Opus 4.7",
    min_cache_tokens: 0,
    requires_explicit: true,
    cache_ttl_seconds: 300, // 5 min default
    input_per_mt: 15.00,
    output_per_mt: 75.00,
    cache_write_multiplier: 1.25,
    cache_read_multiplier: 0.10, // 10% of input
  },
  anthropic_sonnet: {
    name: "Claude Sonnet 4.6",
    min_cache_tokens: 0,
    requires_explicit: true,
    cache_ttl_seconds: 300,
    input_per_mt: 3.00,
    output_per_mt: 15.00,
    cache_write_multiplier: 1.25,
    cache_read_multiplier: 0.10,
  },
  anthropic_haiku: {
    name: "Claude Haiku 4.5",
    min_cache_tokens: 0,
    requires_explicit: true,
    cache_ttl_seconds: 300,
    input_per_mt: 1.00,
    output_per_mt: 5.00,
    cache_write_multiplier: 1.25,
    cache_read_multiplier: 0.10,
  },
  openai_gpt5: {
    name: "OpenAI GPT-5",
    min_cache_tokens: 1024,
    requires_explicit: false,
    cache_ttl_seconds: 600, // ~5-10 min observed
    input_per_mt: 5.00,
    output_per_mt: 15.00,
    cache_write_multiplier: 1.00,
    cache_read_multiplier: 0.50, // 50% of input
  },
  openai_gpt5_mini: {
    name: "OpenAI GPT-5 mini",
    min_cache_tokens: 1024,
    requires_explicit: false,
    cache_ttl_seconds: 600,
    input_per_mt: 0.30,
    output_per_mt: 1.20,
    cache_write_multiplier: 1.00,
    cache_read_multiplier: 0.50,
  },
  gemini_25_pro: {
    name: "Gemini 2.5 Pro",
    min_cache_tokens: 32768,
    requires_explicit: true,
    cache_ttl_seconds: 3600, // 1 hour default for context cache
    input_per_mt: 1.25,
    output_per_mt: 10.00,
    cache_write_multiplier: 1.00,
    cache_read_multiplier: 0.25, // 25% of input
  },
};
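
// Worked example (Sonnet figures above, illustrative): a 10,000-token
// prompt at $3.00/MTok costs $0.03000 of input per call. With 9,500
// tokens read from cache at the 0.10 multiplier:
//   9,500 × $3e-6 × 0.10 + 500 × $3e-6 = $0.00285 + $0.00150 = $0.00435
// an 85.5% input saving (hit ratio 0.95 × discount 0.90).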

// =============================================================================
// Longest common prefix — character-level
// =============================================================================

export function longestCommonPrefix(a, b) {
  if (typeof a !== "string" || typeof b !== "string") return 0;
  const n = Math.min(a.length, b.length);
  let i = 0;
  while (i < n && a.charCodeAt(i) === b.charCodeAt(i)) i++;
  return i;
}
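
// Example: longestCommonPrefix("system: you are", "system: act as") → 8
// (the shared "system: " prefix; comparison is char-by-char, not token).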

// First differing line — useful for the UI "your edit landed here" hint.
function firstDifferingLine(a, b, prefixLen) {
  // Walk back to the start of the line containing the diff. The first
  // prefixLen chars of a and b are identical, so either string could be
  // checked alone; both are checked for symmetry.
  let i = prefixLen;
  while (i > 0 && a[i - 1] !== "\n" && b[i - 1] !== "\n") i--;
  // Count line number (1-indexed)
  let line = 1;
  for (let j = 0; j < i; j++) {
    if (a[j] === "\n") line++;
  }
  return { offset: i, line };
}

// =============================================================================
// Per-provider cache analysis
// =============================================================================

function analyseProvider(
  providerId,
  totalTokensNew,
  commonTokens,
  divergeTokens,
  outputTokens,
) {
  const p = PROVIDERS[providerId];
  if (!p) return null;

  const inputPrice = p.input_per_mt / 1_000_000;
  const outputPrice = p.output_per_mt / 1_000_000;
  const baseCost =
    totalTokensNew * inputPrice + outputTokens * outputPrice;

  // Can the provider cache anything? Two failure modes:
  //   (a) common prefix below the provider's minimum cacheable size
  //   (b) provider requires an explicit marker AND the user almost
  //       certainly didn't include one in the paste — we still report
  //       the best-case savings but surface the `requires_explicit`
  //       flag so the UI can warn.
  let canCache = true;
  let reason = null;
  if (commonTokens < p.min_cache_tokens) {
    canCache = false;
    reason = "below_min";
  }

  if (!canCache) {
    return {
      provider_id: providerId,
      provider_name: p.name,
      base_cost_usd: baseCost,
      cached_cost_usd: baseCost,
      cache_write_surcharge_usd: 0, // same shape as the cacheable branch
      savings_usd: 0,
      savings_pct: 0,
      hit_ratio: 0,
      tokens_cached: 0,
      tokens_billed_input: totalTokensNew,
      reason,
      min_cache_tokens: p.min_cache_tokens,
      requires_explicit: p.requires_explicit,
      cache_ttl_seconds: p.cache_ttl_seconds,
    };
  }

  // Cost on cache HIT for the prefix:
  //   cache-read: commonTokens  × inputPrice × cache_read_multiplier
  //   fresh:      divergeTokens × inputPrice
  //   output:     outputTokens  × outputPrice
  const cachedInputCost =
    commonTokens * inputPrice * p.cache_read_multiplier +
    divergeTokens * inputPrice;
  const cachedCost = cachedInputCost + outputTokens * outputPrice;

  // Cache write surcharge (Anthropic). Surfaced as `cache_write_surcharge_usd`
  // separately so users see the amortization picture.
  const cacheWriteSurcharge =
    commonTokens * inputPrice * (p.cache_write_multiplier - 1.0);

  const savings = baseCost - cachedCost;
  const hitRatio = totalTokensNew === 0 ? 0 : commonTokens / totalTokensNew;

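  // Amortization sketch (Anthropic figures above, illustrative): the 1.25
  // write multiplier adds 0.25 × prefix-cost once, while each later hit
  // saves 0.90 × prefix-cost, so the cache write pays for itself on the
  // first reuse within the TTL.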
  return {
    provider_id: providerId,
    provider_name: p.name,
    base_cost_usd: baseCost,
    cached_cost_usd: cachedCost,
    cache_write_surcharge_usd: cacheWriteSurcharge,
    savings_usd: savings,
    savings_pct: baseCost === 0 ? 0 : savings / baseCost,
    hit_ratio: hitRatio,
    tokens_cached: commonTokens,
    tokens_billed_input: divergeTokens,
    reason: null,
    min_cache_tokens: p.min_cache_tokens,
    requires_explicit: p.requires_explicit,
    cache_ttl_seconds: p.cache_ttl_seconds,
  };
}

// =============================================================================
// Public entry point
// =============================================================================

export function diffPromptCache(
  oldPrompt,
  newPrompt,
  {
    profile = "english",
    outputTokensEstimate = 500,
    providers = null,
  } = {},
) {
  if (typeof oldPrompt !== "string" || typeof newPrompt !== "string") {
    return { code: "empty_input", params: {} };
  }
  // Deliberately NOT trimmed: leading/trailing whitespace is part of the
  // prompt the provider sees, so it counts toward the cacheable prefix.
  const oldTrim = oldPrompt;
  const newTrim = newPrompt;
  if (!oldTrim && !newTrim) {
    return { code: "empty_input", params: {} };
  }

  const lcpChars = longestCommonPrefix(oldTrim, newTrim);
  const isIdentical = oldTrim === newTrim;
  const totalCharsNew = newTrim.length;
  const divergeChars = totalCharsNew - lcpChars;

  const tokensCommon = estimateTokens(oldTrim.slice(0, lcpChars), profile);
  const tokensDiverge = estimateTokens(newTrim.slice(lcpChars), profile);
  const tokensTotal = tokensCommon + tokensDiverge;

  const providerIds = providers ?? Object.keys(PROVIDERS);
  const providerResults = providerIds
    .map(id => analyseProvider(id, tokensTotal, tokensCommon, tokensDiverge, outputTokensEstimate))
    .filter(r => r !== null);

  const diffPoint = isIdentical
    ? { offset: oldTrim.length, line: oldTrim.split("\n").length }
    : firstDifferingLine(oldTrim, newTrim, lcpChars);

  let code;
  if (isIdentical) {
    code = "identical";
  } else if (lcpChars === 0) {
    code = "fully_divergent";
  } else if (providerResults.every(r => r.reason === "below_min")) {
    // Note: with min_cache_tokens of 0 the Anthropic entries never report
    // "below_min", so this branch only fires for a provider subset that
    // excludes them.
    code = "divergent_below_min";
  } else {
    code = "divergent_can_cache";
  }

  return {
    code,
    params: {
      profile,
      lcp_chars: lcpChars,
      diverge_chars: divergeChars,
      tokens_common: tokensCommon,
      tokens_diverge: tokensDiverge,
      tokens_total: tokensTotal,
      hit_ratio: tokensTotal === 0 ? 0 : tokensCommon / tokensTotal,
      diff_point: diffPoint,
      output_tokens: outputTokensEstimate,
    },
    providers: providerResults,
  };
}
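
// Usage sketch (illustrative values, not a fixture):
//   const r = diffPromptCache(oldPrompt, newPrompt, { profile: "code" });
//   r.code                      // "identical" | "fully_divergent" |
//                               //   "divergent_below_min" | "divergent_can_cache"
//   r.params.hit_ratio          // fraction of new-prompt tokens in the common prefix
//   r.providers[0].savings_usd  // steady-state savings for that provider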

// Helper used by the UI: compact summary object per provider, suitable
// for rendering as a table row (i18n-substituted in main.js).
export function summariseProvider(result) {
  if (!result) return null;
  return {
    name: result.provider_name,
    hit_pct: Math.round(result.hit_ratio * 100),
    base: result.base_cost_usd,
    cached: result.cached_cost_usd,
    savings: result.savings_usd,
    savings_pct: result.savings_pct ?? 0,
    requires_explicit: result.requires_explicit,
    reason: result.reason,
  };
}