| --- |
| language: en |
| license: apache-2.0 |
| base_model: black-forest-labs/FLUX.2-klein-base-4B |
| library_name: diffusers |
| tags: |
| - interpretability |
| - per-head-attention |
| - paired-prompt-probe |
| - inverse-rendering |
| - flux2 |
| - vision-banana |
| - arxiv:2604.20329 |
| pipeline_tag: image-to-image |
| --- |
| |
| # ray-plantain |
|
|
| A per-head attention probe of FLUX.2 Klein 4B testing whether the base model has implicitly separated illumination from reflectance, i.e., whether intrinsic-image decomposition emerges as a representational axis without inverse-rendering supervision. |
|
|
| ## Thesis |
|
|
| Intrinsic-image decomposition (separating a rendered photograph into its physically meaningful albedo, normal, and lighting components) is a classical inverse-rendering problem that has historically required either explicit supervision or carefully designed self-supervision. ray-plantain tests whether that supervision is unnecessary: whether a 4B image-generation model trained on natural photographs has discovered the rendered-vs-albedo distinction on its own and exposes it as a per-head axis in the base model. |
|
|
| ## Method |
|
|
| The probe uses twenty-five paired prompts that hold the depicted scene constant. The A condition describes the final rendered photograph (full lighting, shadows, surface reflections). The B condition requests only the lighting-free albedo / surface-color component. The per-head capture protocol is identical to the rest of the plantain probe family. |
|
|
| Rigor add-ons: per-head Cohen's d effect size; split-half consistency via 100 random 50/50 stimulus splits. |
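The two rigor statistics can be illustrated with a minimal NumPy sketch. It assumes each head has already been reduced to one scalar per prompt (a hypothetical `(n_pairs, n_heads)` array per condition), uses the paired form of Cohen's d (mean difference over the standard deviation of the differences), and treats split-half consistency as the Pearson correlation of per-head d between two random halves of the stimulus pairs. None of these choices is confirmed by this card, and the data below are synthetic.

```python
import numpy as np

def paired_head_stats(act_a, act_b):
    """Per-head paired t statistic and paired Cohen's d.

    act_a, act_b: (n_pairs, n_heads) arrays, one scalar per head and prompt
    (hypothetical layout, not the card's actual capture format).
    """
    diff = act_a - act_b
    n_pairs = diff.shape[0]
    d = diff.mean(axis=0) / diff.std(axis=0, ddof=1)  # paired d (assumed form)
    t = d * np.sqrt(n_pairs)                          # paired t statistic
    return t, d

def split_half_r(act_a, act_b, n_splits=100, seed=0):
    """Pearson r of per-head d between two random halves of the pairs."""
    rng = np.random.default_rng(seed)
    n_pairs = act_a.shape[0]
    half = n_pairs // 2  # with 25 pairs the halves hold 12 and 13 pairs
    rs = np.empty(n_splits)
    for i in range(n_splits):
        perm = rng.permutation(n_pairs)
        first, second = perm[:half], perm[half:]
        _, d_first = paired_head_stats(act_a[first], act_b[first])
        _, d_second = paired_head_stats(act_a[second], act_b[second])
        rs[i] = np.corrcoef(d_first, d_second)[0, 1]
    return rs

# Synthetic stand-in data: 25 pairs, 200 heads, a different true shift per head.
rng = np.random.default_rng(0)
n_pairs, n_heads = 25, 200
true_shift = rng.normal(0.0, 1.0, size=n_heads)
act_b = rng.normal(size=(n_pairs, n_heads))
act_a = act_b + true_shift + rng.normal(size=(n_pairs, n_heads))

t, d = paired_head_stats(act_a, act_b)
rs = split_half_r(act_a, act_b)
print("heads with |t| > 3:", int((np.abs(t) > 3).sum()))
print("heads with |d| > 0.8:", int((np.abs(d) > 0.8).sum()))
print("median split-half r:", float(np.median(rs)))
```

Under this paired definition, t and d differ only by a factor of the square root of the pair count, so with 25 pairs a head at |d| = 0.8 sits at |t| = 4.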
|
|
| ## Results |
|
|
| | Metric                         | Value            | Null ratio / spread        | |
| |--------------------------------|------------------|----------------------------| |
| | Heads with \|t\| > 3           | 10,846 (66.5%)   | 7.6× empirical null p99    | |
| | Heads with \|t\| > 5           | 7,445 (45.6%)    | 1,489× empirical null p99  | |
| | Heads with \|d\| > 0.8 (large) | 9,136 (56.0%)    | n/a                        | |
| | Split-half r (median)          | 0.912            | IQR [0.91, 0.92]           | |
| | Max \|t\|                      | 34.48            | n/a                        | |
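This card does not say how the empirical null behind the "× empirical null p99" figures was built. One plausible construction, shown here purely as an assumption, is a sign-flip permutation null: randomly swap the A/B labels within each pair, recount the heads that exceed the threshold, and compare the observed count with the 99th percentile of those null counts. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def count_exceeding(diff, thresh):
    """Number of heads whose paired |t| exceeds thresh."""
    n_pairs = diff.shape[0]
    t = diff.mean(axis=0) / diff.std(axis=0, ddof=1) * np.sqrt(n_pairs)
    return int((np.abs(t) > thresh).sum())

def null_ratio(diff, thresh=3.0, n_perm=200, seed=0):
    """Observed head count divided by the p99 of a sign-flip null (assumed design)."""
    rng = np.random.default_rng(seed)
    observed = count_exceeding(diff, thresh)
    null_counts = np.empty(n_perm)
    for i in range(n_perm):
        # One sign per pair, shared by all heads, so between-head
        # correlations are preserved under the null.
        signs = rng.choice([-1.0, 1.0], size=(diff.shape[0], 1))
        null_counts[i] = count_exceeding(diff * signs, thresh)
    p99 = float(np.percentile(null_counts, 99))
    # Floor the denominator at one head to avoid dividing by zero.
    return observed, p99, observed / max(p99, 1.0)

# Synthetic A-minus-B differences: 25 pairs, 500 heads, 100 heads with a real shift.
rng = np.random.default_rng(1)
diff = rng.normal(size=(25, 500))
diff[:, :100] += 2.0

observed, p99, ratio = null_ratio(diff)
print("observed:", observed, "null p99:", p99, "ratio:", ratio)
```

Flipping whole pairs rather than individual heads is a deliberate choice in this sketch: heads within a model are correlated, and a null that ignored that would understate how many heads can cross the threshold together by chance.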
|
|
| **Top blocks by max \|t\|:** |
| - single[0]: max\|t\|=34.48, 510/768 heads at \|t\|>3, median \|d\|=1.00 |
| - joint[0]: max\|t\|=28.91, 169/192 heads at \|t\|>3, median \|d\|=2.54 |
| - joint[4]: max\|t\|=27.84, 149/192 heads at \|t\|>3, median \|d\|=1.32 |
| - joint[3]: max\|t\|=26.80, 151/192 heads at \|t\|>3, median \|d\|=1.22 |
| - joint[1]: max\|t\|=26.76, 135/192 heads at \|t\|>3, median \|d\|=1.10 |
|
|
| **Interpretation.** This is the strongest plantain probe finding to date by every rigor metric. **Over half of all 16,320 attention heads** in the model (9,136, or 56.0%) show large-effect-size selectivity (\|d\| > 0.8) for the rendered/albedo distinction. The median split-half consistency of r = 0.91 places the axis among the most reproducible representational features located so far. The 1,489× ratio over the empirical null at \|t\| > 5 is the highest reported across the plantain probe family. |
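As a quick consistency check, the percentages in the results table can be recomputed from the head counts and the stated total of 16,320 heads. The snippet uses only numbers that appear in this card.

```python
TOTAL_HEADS = 16_320

# Head counts copied from the results table.
counts = {"|t| > 3": 10_846, "|t| > 5": 7_445, "|d| > 0.8": 9_136}

# Share of all heads, in percent, rounded to one decimal as in the table.
shares = {name: round(100 * n / TOTAL_HEADS, 1) for name, n in counts.items()}

for name, pct in shares.items():
    print(f"{name}: {pct}%")
```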
|
|
| The result suggests that Klein implicitly learned an inverse-rendering decomposition during pretraining and made the components separately addressable at the per-head level. Among the top blocks, joint[0] shows the largest median effect: 169 of 192 heads at \|t\| > 3 with a median \|d\| of 2.54, which indicates that the rendered/albedo distinction is the dominant feature partition for that block. (The single largest \|t\|, 34.48, occurs in single[0].) This bounds how much of the inverse-rendering literature's machinery is strictly necessary to obtain the separation: the base model already separates illumination from reflectance internally, and the open question shifts to whether the separation is *correct* (matches physically computed albedo) rather than whether it exists. |
|
|
| ## Status |
|
|
| Probe complete. No LoRA training; this is a base-model interpretability finding. |
|
|
| ## Limitations |
|
|
| The probe establishes that Klein has a separable representation of "rendered photograph" vs. "albedo." It does not establish that the predicted albedo is *correct*, i.e., physically faithful to a held-out lighting decomposition. A natural follow-up would generate the predicted albedo at full inference depth and compare it to physically computed albedo on a synthetic-image test set with known intrinsics. |
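For the follow-up comparison, one candidate metric is a scale-invariant mean squared error, since albedo is generally recoverable only up to a global brightness factor. The sketch below is an assumption about how such a comparison could be scored, not part of the probe: `scale_invariant_mse` is a hypothetical helper, and the images are random arrays standing in for predicted and ground-truth albedo.

```python
import numpy as np

def scale_invariant_mse(pred, target, eps=1e-12):
    """MSE after fitting the single scalar that best maps pred onto target."""
    pred = np.asarray(pred, dtype=np.float64).ravel()
    target = np.asarray(target, dtype=np.float64).ravel()
    scale = (pred @ target) / (pred @ pred + eps)  # least-squares scale
    return float(np.mean((scale * pred - target) ** 2))

# Synthetic stand-ins for a ground-truth albedo map and two predictions.
rng = np.random.default_rng(0)
gt_albedo = rng.uniform(0.05, 0.95, size=(32, 32, 3))
dimmer = 0.5 * gt_albedo                                       # correct up to scale
shaded = gt_albedo * rng.uniform(0.2, 1.0, size=(32, 32, 1))   # lighting left in

err_dimmer = scale_invariant_mse(dimmer, gt_albedo)
err_shaded = scale_invariant_mse(shaded, gt_albedo)
print("globally dimmer prediction:", err_dimmer)
print("prediction with residual shading:", err_shaded)
```

A prediction that differs from the reference only by overall brightness scores near zero, while one that still carries spatially varying shading is penalized, which is the distinction the follow-up would need to measure.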
|
|
| The pair count is only 25. Per-head reproducibility (r = 0.91) is high, but a larger sweep would tighten the estimates and enable per-block confidence intervals. |
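Per-block confidence intervals could be estimated with a percentile bootstrap over stimulus pairs. The sketch below assumes a hypothetical `(n_pairs, n_heads)` array of A-minus-B differences for one block and the paired form of Cohen's d; it runs on synthetic data and does not reproduce any number in this card.

```python
import numpy as np

def bootstrap_block_ci(diff, n_boot=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI for a block's median |d|, resampling pairs."""
    rng = np.random.default_rng(seed)
    n_pairs = diff.shape[0]
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n_pairs, size=n_pairs)  # resample pairs with replacement
        sample = diff[idx]
        d = sample.mean(axis=0) / sample.std(axis=0, ddof=1)
        stats[b] = np.median(np.abs(d))
    lo, hi = np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return float(lo), float(hi)

# Synthetic block: 25 pairs, 192 heads, true shift of one standard deviation.
rng = np.random.default_rng(2)
block_diff = rng.normal(loc=1.0, scale=1.0, size=(25, 192))

ci_lo, ci_hi = bootstrap_block_ci(block_diff)
print(f"95% CI for median |d|: [{ci_lo:.2f}, {ci_hi:.2f}]")
```

Resampling whole pairs keeps each scene's A and B prompts together, so the interval reflects stimulus-level variability, which is what a larger sweep would reduce.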
|
|
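| The probe is correlational: it shows that per-head activations differ between the two prompt conditions, not that those heads causally produce the illumination/reflectance separation. |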
|
|
| ## License |
|
|
| Apache 2.0, matching the base FLUX.2 Klein 4B. |
|
|
| ## References |
|
|
| - Gabeur, V., Long, S., Peng, S., et al. *Image Generators are Generalist Vision Learners.* [arXiv:2604.20329](https://arxiv.org/abs/2604.20329) (2026). |
| - Black Forest Labs. *FLUX.2 Klein.* https://bfl.ai/models/flux-2-klein (2025). |
|
|