---
language: en
license: apache-2.0
base_model: black-forest-labs/FLUX.2-klein-base-4B
library_name: diffusers
tags:
  - interpretability
  - per-head-attention
  - paired-prompt-probe
  - inverse-rendering
  - flux2
  - vision-banana
  - arxiv:2604.20329
pipeline_tag: image-to-image
---

# ray-plantain

A per-head attention probe of FLUX.2 Klein 4B testing whether the base model has implicitly separated illumination from reflectance — i.e., whether intrinsic-image decomposition emerges as a representational axis without inverse-rendering supervision.

## Thesis

Intrinsic-image decomposition (separating a rendered photograph into its physically meaningful albedo, normal, and lighting components) is a classical inverse-rendering problem that has historically required either explicit supervision or carefully designed self-supervision. ray-plantain tests whether that recipe is unnecessary: whether a 4B-parameter image-generation model trained on natural photographs has discovered the rendered-vs-albedo distinction on its own and exposes it as a per-head axis in the base model.

## Method

The probe uses twenty-five paired prompts that hold the depicted scene constant. The A condition describes the final rendered photograph (full lighting, shadows, surface reflections). The B condition requests only the lighting-free albedo / surface-color component. The per-head capture protocol is identical to the rest of the plantain probe family.

Rigor add-ons: per-head Cohen's d effect size; split-half consistency via 100 random 50/50 stimulus splits.
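
The statistics named above can be sketched as follows. This is a minimal illustration, not the probe's actual code: it assumes each head has been reduced to one scalar per prompt and condition (arrays of shape `(n_pairs, n_heads)`), it assumes the paired `d_z` variant of Cohen's d (mean difference divided by the standard deviation of the differences), and the function names `paired_head_stats` and `split_half_r` are hypothetical.

```python
import numpy as np


def paired_head_stats(act_a, act_b):
    """Per-head paired t statistic and Cohen's d (d_z variant, an assumption).

    act_a, act_b: arrays of shape (n_pairs, n_heads) holding one scalar
    summary per head per prompt. This layout is hypothetical.
    """
    diff = np.asarray(act_a, dtype=float) - np.asarray(act_b, dtype=float)
    n_pairs = diff.shape[0]
    mean_diff = diff.mean(axis=0)
    sd_diff = diff.std(axis=0, ddof=1)  # sample SD of the paired differences
    d = mean_diff / sd_diff             # Cohen's d_z
    t = d * np.sqrt(n_pairs)            # paired t = mean / (sd / sqrt(n))
    return t, d


def split_half_r(act_a, act_b, n_splits=100, seed=0):
    """Pearson r across heads between per-head t on two random stimulus halves."""
    act_a = np.asarray(act_a, dtype=float)
    act_b = np.asarray(act_b, dtype=float)
    rng = np.random.default_rng(seed)
    n_pairs = act_a.shape[0]
    rs = np.empty(n_splits)
    for i in range(n_splits):
        perm = rng.permutation(n_pairs)
        half1, half2 = perm[: n_pairs // 2], perm[n_pairs // 2:]
        t1, _ = paired_head_stats(act_a[half1], act_b[half1])
        t2, _ = paired_head_stats(act_a[half2], act_b[half2])
        rs[i] = np.corrcoef(t1, t2)[0, 1]
    return rs
```

Under this `d_z` assumption, 25 pairs give `t = 5 * d` exactly, so a threshold of |d| > 0.8 corresponds to |t| > 4. With 25 stimuli, a 50/50 split yields halves of 12 and 13 pairs; `np.median(split_half_r(a, b))` would then be the summary value.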

## Results

| Metric                         | Value            | Significance               |
|--------------------------------|------------------|----------------------------|
| Heads with \|t\| > 3           | 10,846 (66.5%)   | 7.6× empirical null p99    |
| Heads with \|t\| > 5           | 7,445 (45.6%)    | 1,489× empirical null p99  |
| Heads with \|d\| > 0.8 (large) | 9,136 (56.0%)    | —                          |
| Split-half r (median)          | 0.912            | [0.91, 0.92] IQR           |
| Max \|t\|                      | 34.48            | —                          |
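
As a quick consistency check, the percentages in the table can be recomputed from the head counts and the 16,320-head total quoted in the interpretation. The sketch below also backs out the null counts implied by the two ratios, under the assumption (not stated in this card) that each ratio is the observed head count divided by the 99th-percentile head count under the empirical null. The helper names are hypothetical.

```python
TOTAL_HEADS = 16_320  # total attention heads, as quoted in the interpretation

# Observed head counts from the results table.
counts = {"|t| > 3": 10_846, "|t| > 5": 7_445, "|d| > 0.8": 9_136}

# Ratios over the empirical null p99 from the results table.
null_ratio = {"|t| > 3": 7.6, "|t| > 5": 1_489}


def percent(count, total=TOTAL_HEADS):
    """Share of heads as a percentage, rounded to one decimal place."""
    return round(100 * count / total, 1)


def implied_null_count(count, ratio):
    """Null p99 head count implied by a ratio, ASSUMING ratio = count / null_p99."""
    return count / ratio


for name, count in counts.items():
    print(f"{name}: {percent(count)}%")

for name, ratio in null_ratio.items():
    print(f"{name}: implied null p99 of about {implied_null_count(counts[name], ratio):.0f} heads")
```

The recomputed shares are 66.5%, 45.6%, and 56.0%, matching the table. Under the stated assumption, the implied null p99 counts are roughly 1,427 heads at |t| > 3 and 5 heads at |t| > 5.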

**Top blocks by max \|t\|:**
- single[0]: max\|t\|=34.48, 510/768 heads at \|t\|>3, median \|d\|=1.00
- joint[0]: max\|t\|=28.91, 169/192 heads at \|t\|>3, median \|d\|=2.54
- joint[4]: max\|t\|=27.84, 149/192 heads at \|t\|>3, median \|d\|=1.32
- joint[3]: max\|t\|=26.80, 151/192 heads at \|t\|>3, median \|d\|=1.22
- joint[1]: max\|t\|=26.76, 135/192 heads at \|t\|>3, median \|d\|=1.10
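
The per-block figures above (max |t|, heads over threshold, median |d|) are simple aggregations of per-head statistics. A minimal sketch of that aggregation follows; the function name `block_summary` and the flat `block_ids` labeling are assumptions for illustration, not the probe's actual data layout.

```python
import numpy as np


def block_summary(t, d, block_ids, t_thresh=3.0):
    """Summarize per-head statistics by block, sorted by max |t| (descending).

    t, d: per-head t statistics and effect sizes, shape (n_heads,).
    block_ids: per-head block label such as "joint[0]" (hypothetical layout).
    """
    abs_t = np.abs(np.asarray(t, dtype=float))
    abs_d = np.abs(np.asarray(d, dtype=float))
    block_ids = np.asarray(block_ids)
    rows = []
    for block in np.unique(block_ids):
        mask = block_ids == block
        rows.append({
            "block": str(block),
            "max_abs_t": float(abs_t[mask].max()),
            "n_over": int((abs_t[mask] > t_thresh).sum()),  # heads with |t| > threshold
            "n_heads": int(mask.sum()),
            "median_abs_d": float(np.median(abs_d[mask])),
        })
    rows.sort(key=lambda row: row["max_abs_t"], reverse=True)
    return rows
```

Sorting by max |t| and by median |d| can give different orderings, as the list above shows: single[0] has the largest max |t|, while joint[0] has the largest median |d|.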

**Interpretation.** This is the strongest plantain probe finding to date on every rigor metric reported. **Over half of all 16,320 attention heads** in the model (56.0%) show large-effect-size selectivity (|d| > 0.8) for the rendered/albedo distinction. The median split-half consistency of r=0.91 places the axis among the most reproducible representational features located so far. The 1,489× ratio over the empirical null at |t|>5 is the highest reported across the plantain probe family.

The result suggests that Klein implicitly learned an inverse-rendering decomposition during pretraining and made the components separately addressable at the per-head level. The block with the largest median effect size, joint[0], has 169 of 192 heads at |t|>3 and a median |d| above 2.5, so the rendered/albedo distinction is the dominant feature partition for that block (the largest single-head |t|, 34.48, occurs in single[0]). This places an upper bound on how much of the inverse-rendering literature's machinery is actually necessary: the base model already separates illumination from reflectance internally, and the open question shifts to whether the separation is *correct* (matches physically computed albedo) rather than whether it exists.

## Status

Probe complete. No LoRA training; this is a base-model interpretability finding.

## Limitations

The probe establishes that Klein has a separable representation of "rendered photograph" vs. "albedo." It does not establish that the predicted albedo is *correct* — i.e., physically faithful to a held-out lighting decomposition. A natural follow-up generates the predicted albedo at full inference depth and compares to physically computed albedo on a synthetic-image test set with known intrinsics.

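The pair count is 25; per-head reproducibility (r=0.91) is high, but a larger sweep would tighten estimates and enable per-block confidence intervals.

The probe is correlational: it shows that head activations differ between the two prompt conditions, not that those heads causally drive the rendered/albedo distinction in generated images.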

## License

Apache 2.0 — matches base FLUX.2 Klein 4B.

## References

- Gabeur, V., Long, S., Peng, S., et al. *Image Generators are Generalist Vision Learners.* [arXiv:2604.20329](https://arxiv.org/abs/2604.20329) (2026).
- Black Forest Labs. *FLUX.2 Klein.* https://bfl.ai/models/flux-2-klein (2025).