Add `core/plotter.py` and `ui/tab_plot.py` to generate Nature/PRL-standard
figures for Wang's Five Laws directly from the SQLite database.
Update `app.py` to mount the new Tab 5.
---
`core/plotter.py`: pure computation layer with no UI dependency and no DB dependency.
Takes a `pd.DataFrame` (from `db/reader.py`) and returns a `matplotlib.Figure`.
- `plot_single_model()` — 4×3 grid (12 subplots), single model
- `plot_compare_models()` — 4×3 grid, Model A (solid) vs Model B (dashed)
- `save_figure()` — exports PNG (300 dpi) + PDF (vector) + SVG (vector)
- `fig_to_plotly()` — converts matplotlib Figure to Plotly for interactive preview
- `fig_to_png_bytes()` — in-memory PNG bytes for Gradio Image component

`ui/tab_plot.py`: Gradio Tab 5 interface. Calls only `core/plotter.py` and the `db` layer (`db/reader.py`, plus `init_db` from `db/schema.py`).
- Single-model mode: select model → generate → preview → download
- Two-model comparison mode: select A & B → generate → preview → download
- Controls: modality filter, layer range, band toggle, delta toggle
- Download: PNG / PDF / SVG / ZIP (all formats bundled)
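
The ZIP download can be sketched with the standard library alone. This is a minimal illustration and not the code in `ui/tab_plot.py`; the helper names `bundle_zip` and `demo_bundle` and the placeholder file names are hypothetical.

```python
import tempfile
import zipfile
from pathlib import Path


def bundle_zip(paths, zip_path):
    """Bundle the exported figure files into one ZIP archive (hypothetical helper)."""
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for p in paths:
            # Store each file under its base name so the archive has no directories.
            zf.write(p, arcname=Path(p).name)
    return zip_path


def demo_bundle():
    """Create three placeholder files and return the names stored in the archive."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for ext in ("png", "pdf", "svg"):
            p = Path(tmp) / f"figure.{ext}"
            p.write_bytes(b"placeholder")
            paths.append(str(p))
        zip_path = bundle_zip(paths, str(Path(tmp) / "figure_all.zip"))
        with zipfile.ZipFile(zip_path) as zf:
            return sorted(zf.namelist())
```

In the real tab, the list returned by `save_figure()` could presumably be passed to such a helper directly.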
---
- Added `from ui.tab_plot import build_tab_plot`
- Added `build_tab_plot()` call inside `gr.Tabs()` block
---
```
Row 1 — Law 1 & 2 (singular value alignment)
[0,0] pearson_QK [0,1] ssr_QK [0,2] alpha_QK
Row 2 — Law 3 (condition numbers & max singular values)
[1,0] sigma_max_Q [1,1] sigma_max_K [1,2] cond_Q & cond_K (dual line)
Row 3 — Law 4 (output subspace, left singular vectors U)
[2,0] cosU_QK [2,1] cosU_QV [2,2] cosU_KV
+ horizontal dashed: random baseline 1/√d_head
Row 4 — Law 5 (input subspace, right singular vectors V)
[3,0] cosV_QK [3,1] cosV_QV [3,2] cosV_KV
+ horizontal dashed: random baseline 1/√d_model
```
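
The layout above implies a fixed set of DataFrame columns. The sketch below records the grid as a mapping and reports which columns are missing before plotting. The names `PANEL_COLUMNS` and `missing_columns` are illustrative assumptions and do not appear in `core/plotter.py`; the `layer` column is included because the plotter groups by it.

```python
# Grid position -> metric column(s), copied from the layout above.
PANEL_COLUMNS = {
    (0, 0): ["pearson_QK"], (0, 1): ["ssr_QK"], (0, 2): ["alpha_QK"],
    (1, 0): ["sigma_max_Q"], (1, 1): ["sigma_max_K"], (1, 2): ["cond_Q", "cond_K"],
    (2, 0): ["cosU_QK"], (2, 1): ["cosU_QV"], (2, 2): ["cosU_KV"],
    (3, 0): ["cosV_QK"], (3, 1): ["cosV_QV"], (3, 2): ["cosV_KV"],
}


def missing_columns(available):
    """Return the columns the grid needs that are absent from `available`."""
    needed = {"layer"} | {c for cols in PANEL_COLUMNS.values() for c in cols}
    return sorted(needed - set(available))
```

Calling `missing_columns(df.columns)` before `plot_single_model()` would turn a late `KeyError` into an explicit list of absent metrics.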
---
**Color system (consistent across all figures)**
| Entity | Color | Hex |
|--------|-------|-----|
| Q | Blue | `#2166AC` |
| K | Red | `#D6604D` |
| V | Green | `#4DAC26` |
| QK pair | Purple | `#762A83` |
| QV pair | Teal | `#01665E` |
| KV pair | Orange | `#E08214` |
| Reference lines | Gray | `#555555` |
**Line encoding**
- Model A (base): solid line
- Model B (RL-tuned): dashed line
- Δ (B − A): gray fill
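
The Δ (B − A) fill is only defined on layers that both models have. A minimal sketch of that step, assuming per-layer medians are already computed and each layer array is sorted in ascending order; the function name `median_delta` is hypothetical.

```python
import numpy as np


def median_delta(layers_a, med_a, layers_b, med_b):
    """Return (common_layers, B - A) restricted to layers present in both models."""
    layers_a, layers_b = np.asarray(layers_a), np.asarray(layers_b)
    med_a = np.asarray(med_a, dtype=float)
    med_b = np.asarray(med_b, dtype=float)
    common = np.intersect1d(layers_a, layers_b)  # sorted, unique
    # Boolean masks keep the original order, so sorted inputs line up with `common`.
    delta = med_b[np.isin(layers_b, common)] - med_a[np.isin(layers_a, common)]
    return common, delta
```

Positive values mean Model B sits above Model A at that layer; negative values mean the opposite.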
**Statistical band**
- Thick line = median across heads per layer
- Shaded region = 25%–75% quantile range across heads
- A narrow band means the heads in a layer behave consistently, which suggests a well-organized model
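
The band statistics can be reproduced with a short pandas group-by. This is a sketch under the assumption that the input has one row per (layer, head) and a `layer` column; the function name `layer_band` and the toy values are illustrative.

```python
import pandas as pd


def layer_band(df, col):
    """Per-layer median and 25%/75% quantiles of `col` across heads."""
    grp = df.groupby("layer")[col]
    return pd.DataFrame({
        "median": grp.median(),
        "q25": grp.quantile(0.25),
        "q75": grp.quantile(0.75),
    })


# Toy data: layer 0 has spread-out heads, layer 1 has identical heads.
demo = pd.DataFrame({
    "layer": [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
    "ssr_QK": [1.0, 2.0, 3.0, 4.0, 5.0, 2.0, 2.0, 2.0, 2.0, 2.0],
})
band = layer_band(demo, "ssr_QK")
```

In this toy data, layer 1 has a zero-width band because all of its heads share one value, while layer 0 spans 2.0 to 4.0 around a median of 3.0.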
**Shared Y-axis**
- Row 3 (cosU): all three subplots share the same Y range → effect size is comparable
- Row 4 (cosV): same treatment
- Rows 1 & 2: independent Y axes (different physical units)
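
One way to obtain a shared Y range for a row is to pool every q25/q75 value in that row and pad the extremes. The sketch below assumes 8% padding and a lower bound clamped at zero; the name `shared_ylim` and the fallback range are illustrative.

```python
import math


def shared_ylim(quantile_arrays, pad=0.08, fallback=(0.0, 0.15)):
    """One y-range for a whole row of subplots, from all q25/q75 values in the row."""
    vals = [v for arr in quantile_arrays for v in arr if not math.isnan(v)]
    if not vals:
        return fallback
    # Pad both ends and clamp the lower bound at zero, since |cos| is non-negative.
    return max(0.0, min(vals) * (1 - pad)), max(vals) * (1 + pad)
```

Applying the same pair to each of the three axes with `ax.set_ylim(lo, hi)` keeps effect sizes visually comparable across Q–K, Q–V, and K–V.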
**Annotations**
- Horizontal dashed lines: theoretical ideals (r=1, SSR=0, α=1) and random baselines
- Vertical dotted lines: global (K=V shared) layers — Gemma-4 specific
- Footer text: lists global layer indices when present
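
For orientation, the two random baselines are simple to evaluate. The sketch below uses `head_dim=128` and `d_model=5120`, the values that appear in the standalone usage example of this description; the function name `random_baselines` is hypothetical.

```python
import math


def random_baselines(head_dim, d_model):
    """Reference levels drawn as horizontal dashed lines in Rows 3 and 4."""
    return 1.0 / math.sqrt(head_dim), 1.0 / math.sqrt(d_model)


base_u, base_v = random_baselines(head_dim=128, d_model=5120)
# base_u is about 0.0884 (cosU rows), base_v is about 0.0140 (cosV rows)
```

Curves that stay near these levels are indistinguishable from randomly oriented subspaces under this baseline.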
**Output specs**
- Canvas: 18 × 20 inches
- Resolution: 300 DPI (PNG), vector (PDF, SVG)
- Font: DejaVu Sans — Title 11pt / Axis label 10pt / Tick 9pt / Legend 9pt
- Spine: top and right spines removed (clean academic style)
---
Follows the existing three-layer pattern:
```
core/plotter.py ← pure computation, returns Figure objects
↑
ui/tab_plot.py ← UI only, calls plotter + db/reader
↑
app.py ← mounts Tab 5
```
`core/plotter.py` can be used standalone from the command line without
importing any Gradio or DB code:
```python
from core.plotter import plot_single_model, save_figure
import pandas as pd
df = pd.read_csv("my_layer_metrics.csv")
fig = plot_single_model(df, "MyModel", head_dim=128, d_model=5120)
save_figure(fig, "./output/my_model")
```
---
Smoke-tested with synthetic data (48 layers × 32 heads):
- `plot_single_model()` → PNG 2.5 MB, PDF 61 KB, SVG 323 KB ✅
- `plot_compare_models()` → PNG 2.1 MB, PDF 70 KB, SVG 393 KB ✅
- All 12 subplots render correctly with band, baselines, and global-layer markers ✅
- Syntax validated: `ast.parse()` on all three files ✅
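
A synthetic input of the same shape (48 layers × 32 heads) can be generated as follows. This is a sketch, not the generator used for the smoke test above: the `head` column name, the uniform value range, and the function name `synthetic_metrics` are assumptions.

```python
import numpy as np
import pandas as pd

METRICS = [
    "pearson_QK", "ssr_QK", "alpha_QK",
    "sigma_max_Q", "sigma_max_K", "cond_Q", "cond_K",
    "cosU_QK", "cosU_QV", "cosU_KV",
    "cosV_QK", "cosV_QV", "cosV_KV",
]


def synthetic_metrics(n_layers=48, n_heads=32, seed=0):
    """One row per (layer, head) with random placeholder values for every metric."""
    rng = np.random.default_rng(seed)
    n = n_layers * n_heads
    df = pd.DataFrame({
        "layer": np.repeat(np.arange(n_layers), n_heads),
        "head": np.tile(np.arange(n_heads), n_layers),  # assumed column name
        "kv_shared": 0,  # no global (K=V shared) layers in this toy data
    })
    for col in METRICS:
        df[col] = rng.uniform(0.0, 1.0, size=n)
    return df
```

The resulting frame can be passed to `plot_single_model(df, "Synthetic")` to exercise all 12 panels without a database.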
---
- [ ] Singular value spectrum snapshot figure (3 layers, Law 1&2 direct visualization)
- [ ] SSR deep-layer trend figure (RL improvement per layer group)
- [ ] Wang Score horizontal bar chart (leaderboard visualization)
- [ ] Auto-refresh model dropdown in Tab 5 without page reload
- [ ] CLI script: `python -m core.plotter --model-id google/gemma-4-e2b`
---
Files changed:
- `app.py` +4 -2
- `core/plotter.py` +485 -0
- `ui/tab_plot.py` +412 -0
@@ -10,10 +10,9 @@ from ui.tab_inspect import build_tab_inspect
 from ui.tab_analyze import build_tab_analyze
 from ui.tab_leaderboard import build_tab_leaderboard
 from ui.tab_database import build_tab_database
+from ui.tab_plot import build_tab_plot
 
 # ── Initialize the database at startup ────────
-# Idempotent operation; safe to call repeatedly
-# The /data directory is mounted from the HF Space bucket, so data persists across restarts
 init_db()
 
 # ─────────────────────────────────────────────
@@ -58,6 +57,9 @@ with gr.Blocks(
         # Tab 4: database browser
         build_tab_database()
 
+        # Tab 5: plotting (publication quality)
+        build_tab_plot()
+
     # ── Tab 1 → Tab 2: sync model ID and token ─────────
     inspect_model_id.change(
         fn=lambda x: x,
@@ -0,0 +1,485 @@
+# core/plotter.py
+"""
+Publication-quality figure generation for Wang's Five Laws.
+Standards: Nature / PRL / top-conference level.
+Canvas: 18×20 inches @ 300 DPI, Arial/Helvetica fonts.
+
+Color system:
+  Q-related      → blue   (#2166AC)
+  K-related      → red    (#D6604D)
+  V-related      → green  (#4DAC26)
+  QK pair        → purple (#762A83)
+  QV pair        → cyan   (#01665E)
+  KV pair        → orange (#E08214)
+  Model A (base) → solid line
+  Model B (RL)   → dashed line
+  Delta          → gray fill
+"""
+
+import numpy as np
+import pandas as pd
+import matplotlib
+matplotlib.use("Agg")
+import matplotlib.pyplot as plt
+import matplotlib.patches as mpatches
+from matplotlib.lines import Line2D
+import io
+import os
+
+# ── Font & style ──────────────────────────────────────────────────────────────
+plt.rcParams.update({
+    "font.family": "DejaVu Sans",  # fallback; Arial not always present
+    "font.size": 9,
+    "axes.titlesize": 11,
+    "axes.labelsize": 10,
+    "xtick.labelsize": 9,
+    "ytick.labelsize": 9,
+    "legend.fontsize": 9,
+    "figure.dpi": 300,
+    "savefig.dpi": 300,
+    "axes.linewidth": 0.8,
+    "grid.linewidth": 0.4,
+    "lines.linewidth": 1.5,
+    "legend.framealpha": 0.85,
+    "legend.edgecolor": "0.7",
+    "axes.spines.top": False,
+    "axes.spines.right": False,
+})
+
+# ── Color palette ─────────────────────────────────────────────────────────────
+C = {
+    "Q": "#2166AC",    # blue
+    "K": "#D6604D",    # red
+    "V": "#4DAC26",    # green
+    "QK": "#762A83",   # purple
+    "QV": "#01665E",   # cyan/teal
+    "KV": "#E08214",   # orange
+    "ref": "#555555",  # reference line (gray)
+    "band_alpha": 0.18,
+}
+
+BAND_COLORS = {
+    "Q": "#2166AC",
+    "K": "#D6604D",
+    "QK": "#762A83",
+    "QV": "#01665E",
+    "KV": "#E08214",
+}
+
+
+# ─────────────────────────────────────────────────────────────────────────────
+# Data helpers
+# ─────────────────────────────────────────────────────────────────────────────
+
+def _aggregate_by_layer(df: pd.DataFrame, col: str):
+    """
+    Group by layer, return (layers, median, q25, q75).
+    Excludes kv_shared=True rows for KV metrics to avoid theoretical-value bias.
+    """
+    kv_cols = {"ssr_KV", "pearson_KV", "cosU_KV", "cosV_KV", "alpha_KV"}
+    if col in kv_cols:
+        df = df[df["kv_shared"] == 0] if "kv_shared" in df.columns else df
+
+    grp = df.groupby("layer")[col]
+    layers = np.array(sorted(df["layer"].unique()))
+    med = grp.median().reindex(layers).values
+    q25 = grp.quantile(0.25).reindex(layers).values
+    q75 = grp.quantile(0.75).reindex(layers).values
+    return layers, med, q25, q75
+
+
+def _global_layers(df: pd.DataFrame):
+    """Return list of layer indices where kv_shared==True (Gemma global layers)."""
+    if "kv_shared" not in df.columns:
+        return []
+    return sorted(df[df["kv_shared"] == 1]["layer"].unique().tolist())
+
+
+# ─────────────────────────────────────────────────────────────────────────────
+# Single-subplot drawing primitives
+# ─────────────────────────────────────────────────────────────────────────────
+
+def _draw_line(ax, layers, med, q25, q75, color, label, linestyle="-",
+               show_band=True, global_layers=None):
+    ax.plot(layers, med, color=color, linestyle=linestyle,
+            linewidth=1.8, label=label, zorder=3)
+    if show_band:
+        ax.fill_between(layers, q25, q75, color=color,
+                        alpha=C["band_alpha"], zorder=2)
+    if global_layers:
+        for gl in global_layers:
+            ax.axvline(gl, color="#AAAAAA", linewidth=0.7,
+                       linestyle=":", zorder=1)
+
+
+def _add_hline(ax, y, label=None, color=None):
+    color = color or C["ref"]
+    ax.axhline(y, color=color, linewidth=1.0, linestyle="--",
+               alpha=0.75, zorder=1, label=label)
+
+
+def _finalize_ax(ax, title, ylabel, xlabel="Layer index"):
+    ax.set_title(title, fontweight="bold", pad=4)
+    ax.set_ylabel(ylabel)
+    ax.set_xlabel(xlabel)
+    ax.grid(True, axis="y", alpha=0.35)
+    ax.legend(loc="best", handlelength=1.5)
+
+
+# ─────────────────────────────────────────────────────────────────────────────
+# The 12-panel 4×3 figure (single model)
+# ─────────────────────────────────────────────────────────────────────────────
+
+def plot_single_model(
+    df: pd.DataFrame,
+    model_name: str,
+    show_band: bool = True,
+    head_dim: int = 128,
+    d_model: int = 5120,
+) -> plt.Figure:
+    """
+    4×3 grid, 12 subplots.
+
+    Row 1 — Law 1 & 2 (singular value metrics):
+      [0,0] pearson_QK   [0,1] ssr_QK   [0,2] alpha_QK
+
+    Row 2 — Law 3 (condition numbers & max singular values):
+      [1,0] sigma_max_Q   [1,1] sigma_max_K   [1,2] cond_Q & cond_K (dual line)
+
+    Row 3 — Law 4 (output subspace, left singular vectors U):
+      [2,0] cosU_QK   [2,1] cosU_QV   [2,2] cosU_KV
+      + random baseline 1/√d_head
+
+    Row 4 — Law 5 (input subspace, right singular vectors V):
+      [3,0] cosV_QK   [3,1] cosV_QV   [3,2] cosV_KV
+      + random baseline 1/√d_model
+    """
+    fig, axes = plt.subplots(4, 3, figsize=(18, 20))
+    fig.suptitle(
+        f"Wang's Five Laws — {model_name}",
+        fontsize=14, fontweight="bold", y=0.995
+    )
+
+    gl = _global_layers(df)
+    baseline_U = 1.0 / np.sqrt(head_dim)
+    baseline_V = 1.0 / np.sqrt(d_model)
+
+    # ── helper ───────────────────────────────────────────────────────────────
+    def draw(ax, col, color, label, linestyle="-"):
+        layers, med, q25, q75 = _aggregate_by_layer(df, col)
+        _draw_line(ax, layers, med, q25, q75, color, label,
+                   linestyle=linestyle, show_band=show_band,
+                   global_layers=gl)
+
+    # ── Row 0: Law 1 & 2 ─────────────────────────────────────────────────────
+    ax = axes[0, 0]
+    draw(ax, "pearson_QK", C["QK"], "Pearson r (Q–K)")
+    _add_hline(ax, 1.0, "Ideal = 1")
+    _finalize_ax(ax, "Law 1 — Spectral Linear Alignment",
+                 "Pearson r (Q, K spectra)")
+
+    ax = axes[0, 1]
+    draw(ax, "ssr_QK", C["QK"], "SSR (Q–K)")
+    _add_hline(ax, 0.0, "Ideal = 0")
+    _finalize_ax(ax, "Law 2 — Spectral Shape Fidelity",
+                 "SSR (Q–K normalized)")
+
+    ax = axes[0, 2]
+    draw(ax, "alpha_QK", C["QK"], "α (Q–K)")
+    _add_hline(ax, 1.0, "Ideal = 1")
+    _finalize_ax(ax, "Law 1+2 — Scale Factor α (Q–K)",
+                 "Scale factor α")
+
+    # ── Row 1: Law 3 ─────────────────────────────────────────────────────────
+    ax = axes[1, 0]
+    draw(ax, "sigma_max_Q", C["Q"], "σ_max (Q)")
+    _finalize_ax(ax, "Law 3 — Max Singular Value (Q)",
+                 "σ_max")
+
+    ax = axes[1, 1]
+    draw(ax, "sigma_max_K", C["K"], "σ_max (K)")
+    _finalize_ax(ax, "Law 3 — Max Singular Value (K)",
+                 "σ_max")
+
+    ax = axes[1, 2]
+    draw(ax, "cond_Q", C["Q"], "κ(Q)")
+    draw(ax, "cond_K", C["K"], "κ(K)")
+    _finalize_ax(ax, "Law 3 — Condition Number κ",
+                 "Condition number κ")
+
+    # ── Row 2: Law 4 ─────────────────────────────────────────────────────────
+    # Share y-axis across this row
+    axU = [axes[2, 0], axes[2, 1], axes[2, 2]]
+    u_data = {}
+    for col in ["cosU_QK", "cosU_QV", "cosU_KV"]:
+        _, med, q25, q75 = _aggregate_by_layer(df, col)
+        u_data[col] = (med, q25, q75)
+    all_u = np.concatenate([np.concatenate([v[1], v[2]]) for v in u_data.values()])
+    all_u = all_u[~np.isnan(all_u)]
+    if len(all_u) > 0:
+        u_ymin = max(0, np.nanmin(all_u) * 0.92)
+        u_ymax = np.nanmax(all_u) * 1.08
+    else:
+        u_ymin, u_ymax = 0, 0.15
+
+    for (col, color, title_suffix), ax in zip(
+        [("cosU_QK", C["QK"], "Q–K"),
+         ("cosU_QV", C["QV"], "Q–V"),
+         ("cosU_KV", C["KV"], "K–V")],
+        axU
+    ):
+        draw(ax, col, color, f"cosU ({title_suffix})")
+        _add_hline(ax, baseline_U,
+                   f"Random = 1/√d_h ≈ {baseline_U:.4f}")
+        ax.set_ylim(u_ymin, u_ymax)
+        _finalize_ax(ax, f"Law 4 — Output Subspace cosU ({title_suffix})",
+                     "Mean |cos| (left singular vectors)")
+
+    # ── Row 3: Law 5 ─────────────────────────────────────────────────────────
+    axV = [axes[3, 0], axes[3, 1], axes[3, 2]]
+    v_data = {}
+    for col in ["cosV_QK", "cosV_QV", "cosV_KV"]:
+        _, med, q25, q75 = _aggregate_by_layer(df, col)
+        v_data[col] = (med, q25, q75)
+    all_v = np.concatenate([np.concatenate([v[1], v[2]]) for v in v_data.values()])
+    all_v = all_v[~np.isnan(all_v)]
+    if len(all_v) > 0:
+        v_ymin = max(0, np.nanmin(all_v) * 0.92)
+        v_ymax = np.nanmax(all_v) * 1.08
+    else:
+        v_ymin, v_ymax = 0, 0.05
+
+    for (col, color, title_suffix), ax in zip(
+        [("cosV_QK", C["QK"], "Q–K"),
+         ("cosV_QV", C["QV"], "Q–V"),
+         ("cosV_KV", C["KV"], "K–V")],
+        axV
+    ):
+        draw(ax, col, color, f"cosV ({title_suffix})")
+        _add_hline(ax, baseline_V,
+                   f"Random = 1/√D ≈ {baseline_V:.4f}")
+        ax.set_ylim(v_ymin, v_ymax)
+        _finalize_ax(ax, f"Law 5 — Input Subspace cosV ({title_suffix})",
+                     "Mean |cos| (right singular vectors)")
+
+    # ── Global layer legend ───────────────────────────────────────────────────
+    if gl:
+        fig.text(
+            0.5, 0.001,
+            f"Vertical dotted lines mark global (K=V shared) layers: {gl}",
+            ha="center", fontsize=8, color="#666666"
+        )
+
+    fig.tight_layout(rect=[0, 0.01, 1, 0.995])
+    return fig
+
+
+# ─────────────────────────────────────────────────────────────────────────────
+# Two-model comparison figure (same 4×3, dual lines + delta subpanels)
+# ─────────────────────────────────────────────────────────────────────────────
+
+def plot_compare_models(
+    df_a: pd.DataFrame,
+    df_b: pd.DataFrame,
+    name_a: str,
+    name_b: str,
+    show_band: bool = True,
+    show_delta: bool = True,
+    head_dim: int = 128,
+    d_model: int = 5120,
+) -> plt.Figure:
+    """
+    4×3 comparison grid.
+    Each subplot: Model A (solid) vs Model B (dashed).
+    Delta (B - A) shown as gray fill when show_delta=True.
+    """
+    fig, axes = plt.subplots(4, 3, figsize=(18, 20))
+    fig.suptitle(
+        f"Wang's Five Laws — {name_a} vs {name_b}",
+        fontsize=14, fontweight="bold", y=0.995
+    )
+
+    gl_a = _global_layers(df_a)
+    gl_b = _global_layers(df_b)
+    gl = sorted(set(gl_a) | set(gl_b))
+
+    baseline_U = 1.0 / np.sqrt(head_dim)
+    baseline_V = 1.0 / np.sqrt(d_model)
+
+    def draw_pair(ax, col, color, label_a, label_b, hline=None, hline_label=None):
+        """Draw Model A (solid) and Model B (dashed) on the same axes."""
+        lay_a, med_a, q25_a, q75_a = _aggregate_by_layer(df_a, col)
+        lay_b, med_b, q25_b, q75_b = _aggregate_by_layer(df_b, col)
+
+        _draw_line(ax, lay_a, med_a, q25_a, q75_a, color, label_a,
+                   linestyle="-", show_band=show_band, global_layers=gl)
+        _draw_line(ax, lay_b, med_b, q25_b, q75_b, color, label_b,
+                   linestyle="--", show_band=show_band, global_layers=None)
+
+        # Delta fill
+        if show_delta:
+            common = np.intersect1d(lay_a, lay_b)
+            if len(common) > 1:
+                idx_a = np.isin(lay_a, common)
+                idx_b = np.isin(lay_b, common)
+                delta = med_b[idx_b] - med_a[idx_a]
+                pos = np.maximum(delta, 0)
+                neg = np.minimum(delta, 0)
+                ax.fill_between(common, 0, pos,
+                                color="#AAAAAA", alpha=0.25, zorder=0)
+                ax.fill_between(common, 0, neg,
+                                color="#AAAAAA", alpha=0.25, zorder=0)
+
+        if hline is not None:
+            _add_hline(ax, hline, hline_label)
+
+    # ── Row 0 ────────────────────────────────────────────────────────────────
+    ax = axes[0, 0]
+    draw_pair(ax, "pearson_QK", C["QK"],
+              f"{name_a} Pearson r", f"{name_b} Pearson r", hline=1.0, hline_label="Ideal=1")
+    _finalize_ax(ax, "Law 1 — Spectral Linear Alignment", "Pearson r (Q, K)")
+
+    ax = axes[0, 1]
+    draw_pair(ax, "ssr_QK", C["QK"],
+              f"{name_a} SSR", f"{name_b} SSR", hline=0.0, hline_label="Ideal=0")
+    _finalize_ax(ax, "Law 2 — Spectral Shape Fidelity", "SSR (Q–K)")
+
+    ax = axes[0, 2]
+    draw_pair(ax, "alpha_QK", C["QK"],
+              f"{name_a} α", f"{name_b} α", hline=1.0, hline_label="Ideal=1")
+    _finalize_ax(ax, "Law 1+2 — Scale Factor α (Q–K)", "Scale factor α")
+
+    # ── Row 1 ────────────────────────────────────────────────────────────────
+    ax = axes[1, 0]
+    draw_pair(ax, "sigma_max_Q", C["Q"],
+              f"{name_a} σ_max(Q)", f"{name_b} σ_max(Q)")
+    _finalize_ax(ax, "Law 3 — Max Singular Value (Q)", "σ_max")
+
+    ax = axes[1, 1]
+    draw_pair(ax, "sigma_max_K", C["K"],
+              f"{name_a} σ_max(K)", f"{name_b} σ_max(K)")
+    _finalize_ax(ax, "Law 3 — Max Singular Value (K)", "σ_max")
+
+    ax = axes[1, 2]
+    # cond: draw both Q and K for both models → 4 lines
+    lay_a, med_a, q25_a, q75_a = _aggregate_by_layer(df_a, "cond_Q")
+    lay_b, med_b, q25_b, q75_b = _aggregate_by_layer(df_b, "cond_Q")
+    _draw_line(ax, lay_a, med_a, q25_a, q75_a, C["Q"],
+               f"{name_a} κ(Q)", "-", show_band, gl)
+    _draw_line(ax, lay_b, med_b, q25_b, q75_b, C["Q"],
+               f"{name_b} κ(Q)", "--", show_band, None)
+    lay_a, med_a, q25_a, q75_a = _aggregate_by_layer(df_a, "cond_K")
+    lay_b, med_b, q25_b, q75_b = _aggregate_by_layer(df_b, "cond_K")
+    _draw_line(ax, lay_a, med_a, q25_a, q75_a, C["K"],
+               f"{name_a} κ(K)", "-", show_band, None)
+    _draw_line(ax, lay_b, med_b, q25_b, q75_b, C["K"],
+               f"{name_b} κ(K)", "--", show_band, None)
+    _finalize_ax(ax, "Law 3 — Condition Number κ", "Condition number κ")
+
+    # ── Row 2: Law 4 ─────────────────────────────────────────────────────────
+    u_cols = [("cosU_QK", C["QK"], "Q–K"),
+              ("cosU_QV", C["QV"], "Q–V"),
+              ("cosU_KV", C["KV"], "K–V")]
+
+    # Compute shared y range
+    u_vals = []
+    for col, _, _ in u_cols:
+        for df_ in [df_a, df_b]:
+            _, med, q25, q75 = _aggregate_by_layer(df_, col)
+            u_vals.extend(q25[~np.isnan(q25)].tolist())
+            u_vals.extend(q75[~np.isnan(q75)].tolist())
+    u_ymin = max(0, min(u_vals) * 0.92) if u_vals else 0
+    u_ymax = (max(u_vals) * 1.08) if u_vals else 0.15
+
+    for (col, color, suffix), ax in zip(u_cols, axes[2]):
+        draw_pair(ax, col, color,
+                  f"{name_a}", f"{name_b}",
+                  hline=baseline_U,
+                  hline_label=f"Random 1/√d_h ≈ {baseline_U:.4f}")
+        ax.set_ylim(u_ymin, u_ymax)
+        _finalize_ax(ax, f"Law 4 — cosU ({suffix})",
+                     "Mean |cos| (U)")
+
+    # ── Row 3: Law 5 ─────────────────────────────────────────────────────────
+    v_cols = [("cosV_QK", C["QK"], "Q–K"),
+              ("cosV_QV", C["QV"], "Q–V"),
+              ("cosV_KV", C["KV"], "K–V")]
+
+    v_vals = []
+    for col, _, _ in v_cols:
+        for df_ in [df_a, df_b]:
+            _, med, q25, q75 = _aggregate_by_layer(df_, col)
+            v_vals.extend(q25[~np.isnan(q25)].tolist())
+            v_vals.extend(q75[~np.isnan(q75)].tolist())
+    v_ymin = max(0, min(v_vals) * 0.92) if v_vals else 0
+    v_ymax = (max(v_vals) * 1.08) if v_vals else 0.05
+
+    for (col, color, suffix), ax in zip(v_cols, axes[3]):
+        draw_pair(ax, col, color,
+                  f"{name_a}", f"{name_b}",
+                  hline=baseline_V,
+                  hline_label=f"Random 1/√D ≈ {baseline_V:.4f}")
+        ax.set_ylim(v_ymin, v_ymax)
+        _finalize_ax(ax, f"Law 5 — cosV ({suffix})",
+                     "Mean |cos| (V)")
+
+    # ── Legend for line styles ────────────────────────────────────────────────
+    solid_patch = Line2D([0], [0], color="#333333", linewidth=1.8,
+                         linestyle="-", label=f"Solid = {name_a}")
+    dashed_patch = Line2D([0], [0], color="#333333", linewidth=1.8,
+                          linestyle="--", label=f"Dashed = {name_b}")
+    fig.legend(handles=[solid_patch, dashed_patch],
+               loc="lower center", ncol=2, fontsize=9,
+               bbox_to_anchor=(0.5, 0.001))
+
+    if gl:
+        fig.text(
+            0.5, 0.0045,
+            f"Vertical dotted lines mark global (K=V shared) layers: {gl}",
+            ha="center", fontsize=8, color="#666666"
+        )
+
+    fig.tight_layout(rect=[0, 0.015, 1, 0.995])
+    return fig
+
+
+# ─────────────────────────────────────────────────────────────────────────────
+# Export helpers
+# ─────────────────────────────────────────────────────────────────────────────
+
+def save_figure(fig: plt.Figure, base_path: str):
+    """
+    Save figure to PNG (300 dpi), PDF (vector), and SVG (vector).
+    base_path: path without extension, e.g. "/tmp/wang_laws_gemma"
+    Returns list of saved file paths.
+    """
+    paths = []
+    for fmt, kwargs in [
+        ("png", {"dpi": 300, "bbox_inches": "tight"}),
+        ("pdf", {"bbox_inches": "tight"}),
+        ("svg", {"bbox_inches": "tight"}),
+    ]:
+        p = f"{base_path}.{fmt}"
+        fig.savefig(p, format=fmt, **kwargs)
+        paths.append(p)
+    return paths
+
+
+def fig_to_png_bytes(fig: plt.Figure) -> bytes:
+    """Return PNG bytes for Gradio Image component."""
+    buf = io.BytesIO()
+    fig.savefig(buf, format="png", dpi=150, bbox_inches="tight")
+    buf.seek(0)
+    return buf.read()
+
+
+def fig_to_plotly(fig_mpl: plt.Figure):
+    """
+    Convert matplotlib Figure to a Plotly figure via mpl_to_plotly.
+    Requires plotly installed. Falls back gracefully.
+    """
+    try:
+        import plotly.tools as tls
+        return tls.mpl_to_plotly(fig_mpl)
+    except Exception:
+        return None
@@ -0,0 +1,412 @@
# ui/tab_plot.py
"""
Tab5: Plot — Publication-quality figure generation
Data pulled from SQLite DB.
Supports: single model (4×3) and two-model comparison (4×3).
Export: PNG (300 dpi) / PDF / SVG.
Engine: matplotlib (publication) + optional Plotly (interactive).
"""

import os
import tempfile
import zipfile

import gradio as gr
import pandas as pd
import numpy as np

from db.schema import init_db
from db.reader import get_layer_metrics, get_analyzed_models
from core.plotter import (
    plot_single_model,
    plot_compare_models,
    save_figure,
    fig_to_plotly,
)

# ── Output directory ──────────────────────────────────────────────────────────
_OUT_DIR = "/tmp/wang_plots"
os.makedirs(_OUT_DIR, exist_ok=True)


# ─────────────────────────────────────────────────────────────────────────────
# DB helpers
# ─────────────────────────────────────────────────────────────────────────────

def _get_model_choices() -> list[str]:
    try:
        conn = init_db()
        df = get_analyzed_models(conn)
        if df.empty:
            return []
        return df["model_id"].tolist()
    except Exception:
        return []


def _load_df(model_id: str, modality: str,
             start_layer: int, end_layer: int) -> pd.DataFrame:
    conn = init_db()
    df = get_layer_metrics(
        conn,
        model_id=model_id,
        modality=modality if modality != "all" else None,
        layer_type=None,
        start_layer=int(start_layer),
        end_layer=int(end_layer),
    )
    return df


def _infer_dims(df: pd.DataFrame) -> tuple[int, int]:
    """Try to read head_dim and d_model from the dataframe."""
    head_dim = 128
    d_model = 5120
    if not df.empty:
        if "head_dim" in df.columns:
            v = df["head_dim"].dropna()
            if len(v):
                head_dim = int(v.median())
        if "d_model" in df.columns:
            v = df["d_model"].dropna()
            if len(v):
                d_model = int(v.median())
    return head_dim, d_model


def _short_name(model_id: str) -> str:
    return model_id.split("/")[-1] if "/" in model_id else model_id


def _safe_base_path(name: str) -> str:
    safe = name.replace("/", "_").replace(" ", "_")
    return os.path.join(_OUT_DIR, safe)


# ─────────────────────────────────────────────────────────────────────────────
# Main generation functions
# ─────────────────────────────────────────────────────────────────────────────

def generate_single(
    model_id: str,
    modality: str,
    start_layer: int,
    end_layer: int,
    show_band: bool,
    progress=gr.Progress()
) -> tuple:
    """
    Returns: (status_str, png_path, [png_path, pdf_path, svg_path], plotly_fig)
    """
    if not model_id or not model_id.strip():
        return "❌ Please select a model.", None, None, None

    progress(0.1, desc="Loading data from DB...")
    df = _load_df(model_id, modality, start_layer, end_layer)

    if df.empty:
        return (
            f"❌ No data found for {model_id} "
            f"(modality={modality}, layers {start_layer}~{end_layer}).\n"
            f"Please run analysis first in Tab 2.",
            None, None, None
        )

    progress(0.35, desc="Inferring dimensions...")
    head_dim, d_model = _infer_dims(df)
    n_layers = df["layer"].nunique()
    n_records = len(df)

    progress(0.50, desc="Generating matplotlib figure...")
    name = _short_name(model_id)
    fig = plot_single_model(
        df, model_name=name,
        show_band=show_band,
        head_dim=head_dim,
        d_model=d_model,
    )

    progress(0.75, desc="Saving PNG / PDF / SVG...")
    base = _safe_base_path(f"single_{name}_L{start_layer}-{end_layer}")
    paths = save_figure(fig, base)

    progress(0.90, desc="Generating Plotly preview...")
    plotly_fig = fig_to_plotly(fig)

    import matplotlib.pyplot as plt
    plt.close(fig)

    status = (
        f"✅ {model_id} | modality={modality} "
        f"| layers {start_layer}~{end_layer} "
        f"| {n_layers} layers {n_records} head-records\n"
        f" head_dim={head_dim} d_model={d_model}\n"
        f" Saved: {', '.join(os.path.basename(p) for p in paths)}"
    )
    png_path = paths[0]
    return status, png_path, paths, plotly_fig


def generate_compare(
    model_a: str,
    model_b: str,
    modality: str,
    start_layer: int,
    end_layer: int,
    show_band: bool,
    show_delta: bool,
    progress=gr.Progress()
) -> tuple:
    if not model_a or not model_b:
        return "❌ Please select both models.", None, None, None
    if model_a == model_b:
        return "❌ Please select two different models.", None, None, None

    progress(0.10, desc="Loading Model A from DB...")
    df_a = _load_df(model_a, modality, start_layer, end_layer)
    progress(0.25, desc="Loading Model B from DB...")
    df_b = _load_df(model_b, modality, start_layer, end_layer)

    if df_a.empty:
        return f"❌ No data for Model A ({model_a}).", None, None, None
    if df_b.empty:
        return f"❌ No data for Model B ({model_b}).", None, None, None

    head_dim_a, d_model_a = _infer_dims(df_a)
    head_dim_b, d_model_b = _infer_dims(df_b)
    head_dim = int((head_dim_a + head_dim_b) / 2)
    d_model = int((d_model_a + d_model_b) / 2)

    progress(0.50, desc="Generating comparison figure...")
    name_a = _short_name(model_a)
    name_b = _short_name(model_b)
    fig = plot_compare_models(
        df_a, df_b,
        name_a=name_a, name_b=name_b,
        show_band=show_band,
        show_delta=show_delta,
        head_dim=head_dim,
        d_model=d_model,
    )

    progress(0.80, desc="Saving PNG / PDF / SVG...")
    base = _safe_base_path(f"compare_{name_a}_vs_{name_b}_L{start_layer}-{end_layer}")
    paths = save_figure(fig, base)

    progress(0.92, desc="Generating Plotly preview...")
    plotly_fig = fig_to_plotly(fig)

    import matplotlib.pyplot as plt
    plt.close(fig)

    status = (
        f"✅ {name_a} vs {name_b}\n"
        f" modality={modality} layers {start_layer}~{end_layer}\n"
        f" Model A: {len(df_a)} records | Model B: {len(df_b)} records\n"
        f" head_dim≈{head_dim} d_model≈{d_model}\n"
        f" Saved: {', '.join(os.path.basename(p) for p in paths)}"
    )
    return status, paths[0], paths, plotly_fig


def make_zip(file_paths: list) -> str | None:
    """Bundle all exported files into a single ZIP for download."""
    if not file_paths:
        return None
    zip_path = os.path.join(_OUT_DIR, "wang_laws_figures.zip")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for p in file_paths:
            if p and os.path.exists(p):
                zf.write(p, os.path.basename(p))
    return zip_path


# ─────────────────────────────────────────────────────────────────────────────
# Tab5 UI
# ─────────────────────────────────────────────────────────────────────────────

def build_tab_plot():
    with gr.Tab("📈 Plot"):
        gr.Markdown("""
        ## Wang's Five Laws — Publication-Quality Figures
        Data is loaded directly from the SQLite database (Tab 2 must be run first).

        **4×3 grid layout** (12 subplots, one figure):
        | Row | Content | Laws |
        |-----|---------|------|
        | 1 | pearson_QK · SSR_QK · α_QK | Law 1 & 2 |
        | 2 | σ_max(Q) · σ_max(K) · κ(Q) & κ(K) | Law 3 |
        | 3 | cosU QK · QV · KV + random baseline | Law 4 |
        | 4 | cosV QK · QV · KV + random baseline | Law 5 |

        Export: **PNG 300 dpi** · **PDF (vector)** · **SVG (vector)**
        """)

        # ── Shared controls ───────────────────────────────────────────────────
        with gr.Row():
            modality_sel = gr.Dropdown(
                label="Modality",
                choices=["language", "vision", "audio", "all"],
                value="language",
                scale=1,
            )
            start_l = gr.Number(
                label="Start Layer", value=0, precision=0, scale=1
            )
            end_l = gr.Number(
                label="End Layer", value=47, precision=0, scale=1
            )
            show_band_chk = gr.Checkbox(
                label="Show 25%–75% band (head consistency)",
                value=True, scale=1
            )

        gr.Markdown("---")

        # ══ Mode 1: Single model ══════════════════════════════════════════════
        with gr.Accordion("📊 Single Model", open=True):
            with gr.Row():
                model_choices = _get_model_choices()
                single_model = gr.Dropdown(
                    label="Model",
                    choices=model_choices,
                    value=model_choices[0] if model_choices else None,
                    allow_custom_value=True,
                    scale=3,
                    info="Refresh the page after analyzing new models to update this list."
                )
                single_btn = gr.Button(
                    "🎨 Generate Figure", variant="primary", scale=1
                )

            single_status = gr.Textbox(
                label="Status", lines=3, interactive=False
            )

            with gr.Tabs():
                with gr.Tab("🖼️ Preview (PNG)"):
                    single_img = gr.Image(
                        label="Figure preview",
                        type="filepath",
                        height=600,
                    )
                with gr.Tab("📉 Interactive (Plotly)"):
                    single_plotly = gr.Plot(label="Plotly interactive")

            with gr.Row():
                dl_single_png = gr.File(label="⬇ PNG (300 dpi)")
                dl_single_pdf = gr.File(label="⬇ PDF (vector)")
                dl_single_svg = gr.File(label="⬇ SVG (vector)")
                dl_single_zip = gr.File(label="⬇ ZIP (all formats)")

        gr.Markdown("---")

        # ══ Mode 2: Two-model comparison ══════════════════════════════════════
        with gr.Accordion("📊 Two-Model Comparison", open=False):
            with gr.Row():
                model_a = gr.Dropdown(
                    label="Model A (solid line)",
                    choices=model_choices,
                    value=model_choices[0] if len(model_choices) > 0 else None,
                    allow_custom_value=True,
                    scale=2,
                )
                model_b = gr.Dropdown(
                    label="Model B (dashed line)",
                    choices=model_choices,
                    value=model_choices[1] if len(model_choices) > 1 else None,
                    allow_custom_value=True,
                    scale=2,
                )
                show_delta_chk = gr.Checkbox(
                    label="Show Δ (B − A) fill",
                    value=True, scale=1
                )
                compare_btn = gr.Button(
                    "🎨 Generate Comparison", variant="primary", scale=1
                )

            compare_status = gr.Textbox(
                label="Status", lines=3, interactive=False
            )

            with gr.Tabs():
                with gr.Tab("🖼️ Preview (PNG)"):
                    compare_img = gr.Image(
                        label="Comparison figure preview",
                        type="filepath",
                        height=600,
                    )
                with gr.Tab("📉 Interactive (Plotly)"):
                    compare_plotly = gr.Plot(label="Plotly interactive")

            with gr.Row():
                dl_cmp_png = gr.File(label="⬇ PNG (300 dpi)")
                dl_cmp_pdf = gr.File(label="⬇ PDF (vector)")
                dl_cmp_svg = gr.File(label="⬇ SVG (vector)")
                dl_cmp_zip = gr.File(label="⬇ ZIP (all formats)")

        gr.Markdown("""
        ---
        **Tips**
        - Band = 25%–75% quantile across attention heads per layer.
          Narrow band → heads behave consistently → model is "well-organized".
        - Vertical dotted lines mark **global layers** (K=V shared, e.g. Gemma-4).
        - Dashed horizontal lines = theoretical ideals or random baselines.
        - For Law 4 & 5 panels, Q–V and K–V cosU values **below** the random baseline
          indicate **super-orthogonality** — a key signature of pretraining convergence.
        """)

        # ── Wire up single model ──────────────────────────────────────────────
        _single_file_state = gr.State([])

        def _run_single(model_id, modality, start, end, band, progress=gr.Progress()):
            status, png, paths, plotly_fig = generate_single(
                model_id, modality, int(start), int(end), band, progress
            )
            if paths is None:
                return status, None, None, None, None, None, None, []
            zip_p = make_zip(paths)
            png_p = paths[0] if len(paths) > 0 else None
            pdf_p = paths[1] if len(paths) > 1 else None
            svg_p = paths[2] if len(paths) > 2 else None
            return (status, png, plotly_fig,
                    png_p, pdf_p, svg_p, zip_p, paths)

        single_btn.click(
            fn=_run_single,
            inputs=[single_model, modality_sel, start_l, end_l, show_band_chk],
            outputs=[
                single_status, single_img, single_plotly,
                dl_single_png, dl_single_pdf, dl_single_svg, dl_single_zip,
                _single_file_state,
            ]
        )

        # ── Wire up comparison ────────────────────────────────────────────────
        _compare_file_state = gr.State([])

        def _run_compare(ma, mb, modality, start, end, band, delta,
                         progress=gr.Progress()):
            status, png, paths, plotly_fig = generate_compare(
                ma, mb, modality, int(start), int(end), band, delta, progress
            )
            if paths is None:
                return status, None, None, None, None, None, None, []
            zip_p = make_zip(paths)
            png_p = paths[0] if len(paths) > 0 else None
            pdf_p = paths[1] if len(paths) > 1 else None
            svg_p = paths[2] if len(paths) > 2 else None
            return (status, png, plotly_fig,
                    png_p, pdf_p, svg_p, zip_p, paths)

        compare_btn.click(
            fn=_run_compare,
            inputs=[model_a, model_b, modality_sel,
                    start_l, end_l, show_band_chk, show_delta_chk],
            outputs=[
                compare_status, compare_img, compare_plotly,
                dl_cmp_png, dl_cmp_pdf, dl_cmp_svg, dl_cmp_zip,
                _compare_file_state,
            ]
        )
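
Review note on `make_zip()`: it always writes to the same `wang_laws_figures.zip` inside `_OUT_DIR`, so two sessions generating figures at the same time could overwrite each other's archive. The sketch below is one possible variant, not part of this commit. The name `make_unique_zip` and its `out_dir` / `prefix` parameters are hypothetical; it uses only the standard library (`tempfile.mkstemp`, `zipfile.ZipFile`) to give each archive a unique name.

```python
import os
import tempfile
import zipfile


def make_unique_zip(file_paths, out_dir, prefix="wang_laws_figures_"):
    """Bundle existing files into a uniquely named ZIP inside out_dir.

    Hypothetical variant of make_zip(): the archive name comes from
    tempfile.mkstemp, so separate calls never share an output path.
    Returns the archive path, or None when no input file exists.
    """
    # Keep only entries that are non-empty and present on disk.
    existing = [p for p in (file_paths or []) if p and os.path.exists(p)]
    if not existing:
        return None

    os.makedirs(out_dir, exist_ok=True)
    # mkstemp creates an empty file with a unique name and returns an open fd.
    fd, zip_path = tempfile.mkstemp(prefix=prefix, suffix=".zip", dir=out_dir)
    os.close(fd)  # ZipFile reopens the path itself in write mode

    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for p in existing:
            # Store each file flat, under its base name only.
            zf.write(p, arcname=os.path.basename(p))
    return zip_path
```

If adopted, the `_run_single` and `_run_compare` callbacks could call `make_unique_zip(paths, _OUT_DIR)` in place of `make_zip(paths)`. Note that this variant returns `None` when none of the listed files exist, whereas `make_zip()` would still produce an empty archive in that case.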