---
language: en
license: apache-2.0
base_model: black-forest-labs/FLUX.2-klein-base-4B
library_name: diffusers
tags:
- interpretability
- per-head-attention
- paired-prompt-probe
- inverse-rendering
- flux2
- vision-banana
- arxiv:2604.20329
pipeline_tag: image-to-image
---
# ray-plantain
A per-head attention probe of FLUX.2 Klein 4B testing whether the base model has implicitly separated illumination from reflectance, i.e., whether intrinsic-image decomposition emerges as a representational axis without inverse-rendering supervision.
## Thesis
Intrinsic-image decomposition (separating the rendered photograph into its physically meaningful albedo, normal, and lighting components) is a classical inverse-rendering problem that has historically required either explicit supervision or carefully designed self-supervision. ray-plantain tests whether that machinery is necessary at all: whether a 4B image-generation model trained on natural photographs has discovered the rendered-vs-albedo distinction on its own, surfacing it as a per-head axis in the base model.
## Method
Twenty-five paired prompts hold the depicted scene constant. The A condition describes the final rendered photograph (full lighting, shadows, surface reflections). The B condition requests the lighting-free albedo / surface-color component only. The per-head capture protocol is identical to that of the rest of the plantain probe family.
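The capture protocol itself is not reproduced here. As a minimal sketch of the general approach, per-head summaries can be recorded with PyTorch forward hooks; the toy attention module, head summary (mean activation norm over tokens), and random inputs below are illustrative assumptions, not the probe's actual protocol:

```python
import torch
import torch.nn as nn

# Toy stand-in for one attention block; the real probe would hook FLUX.2
# Klein's joint/single transformer blocks (module choice is illustrative).
n_heads, head_dim = 8, 16
attn = nn.MultiheadAttention(embed_dim=n_heads * head_dim,
                             num_heads=n_heads, batch_first=True)

captured = []  # one per-head summary per forward pass

def hook(module, inputs, output):
    out, _ = output                               # (batch, seq, n_heads * head_dim)
    b, s, _ = out.shape
    per_head = out.view(b, s, n_heads, head_dim)
    # Summarize each head as its mean activation norm over tokens.
    captured.append(per_head.norm(dim=-1).mean(dim=1))  # (batch, n_heads)

handle = attn.register_forward_hook(hook)

x_a = torch.randn(1, 10, n_heads * head_dim)  # stand-in for condition A ("rendered")
x_b = torch.randn(1, 10, n_heads * head_dim)  # stand-in for condition B ("albedo")
with torch.no_grad():
    attn(x_a, x_a, x_a)
    attn(x_b, x_b, x_b)
handle.remove()

a_stats, b_stats = captured                   # one (1, n_heads) tensor per condition
```

Running the real model over the 25 prompt pairs would yield one such summary per head per condition, which the statistics below then compare.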
Rigor add-ons: per-head Cohen's d effect size; split-half consistency via 100 random 50/50 stimulus splits.
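The two rigor add-ons can be sketched with NumPy on simulated data; the per-head responses below are hypothetical stand-ins for the captured summaries, and the effect-size distribution is an assumption for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n_pairs, n_heads = 25, 16320                   # 25 prompt pairs, every head

# Hypothetical per-head responses; real values come from the capture step.
mu = rng.normal(0.8, 0.5, size=n_heads)        # heterogeneous true effects
a = rng.normal(0.0, 1.0, size=(n_pairs, n_heads))
b = a + mu + rng.normal(0.0, 1.0, size=(n_pairs, n_heads))

diff = b - a                                   # paired design: B minus A per pair
d = diff.mean(axis=0) / diff.std(axis=0, ddof=1)  # per-head Cohen's d (paired)

# Split-half consistency: correlate per-head d across 100 random 50/50 splits.
rs = []
for _ in range(100):
    idx = rng.permutation(n_pairs)
    h1, h2 = idx[: n_pairs // 2], idx[n_pairs // 2:]
    d1 = diff[h1].mean(axis=0) / diff[h1].std(axis=0, ddof=1)
    d2 = diff[h2].mean(axis=0) / diff[h2].std(axis=0, ddof=1)
    rs.append(np.corrcoef(d1, d2)[0, 1])
split_half_r = float(np.median(rs))
```

High `split_half_r` means the per-head effect-size profile is stable under resampling of the stimuli, which is the sense in which the reported r=0.912 certifies reproducibility.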
## Results
| Metric | Value | Significance |
|--------------------------------|------------------|----------------------------|
| Heads with \|t\| > 3 | 10,846 (66.5%) | 7.6× empirical null p99 |
| Heads with \|t\| > 5 | 7,445 (45.6%) | 1,489× empirical null p99 |
| Heads with \|d\| > 0.8 (large) | 9,136 (56.0%) | — |
| Split-half r (median) | 0.912 | [0.91, 0.92] IQR |
| Max \|t\| | 34.48 | — |
**Top blocks by max \|t\|:**
- single[0]: max\|t\|=34.48, 510/768 heads at \|t\|>3, median \|d\|=1.00
- joint[0]: max\|t\|=28.91, 169/192 heads at \|t\|>3, median \|d\|=2.54
- joint[4]: max\|t\|=27.84, 149/192 heads at \|t\|>3, median \|d\|=1.32
- joint[3]: max\|t\|=26.80, 151/192 heads at \|t\|>3, median \|d\|=1.22
- joint[1]: max\|t\|=26.76, 135/192 heads at \|t\|>3, median \|d\|=1.10
**Interpretation.** The strongest plantain probe finding to date by every rigor metric. **Over half of all 16,320 attention heads** in the model show large-effect-size selectivity (Cohen's d > 0.8) for the rendered/albedo distinction. Split-half consistency r=0.91 places the axis among the most reproducible representational features located so far. The 1,489× ratio over the empirical null at |t|>5 is the highest reported across the plantain probe family.
The result implies that Klein has implicitly performed an inverse-rendering decomposition during pretraining and made the components separately addressable at the per-head level. The maximum-effect block (joint[0]) has 169 of 192 heads at |t|>3 with median Cohen's d above 2.5: the rendered/albedo distinction is the dominant feature partition for that block. This places a measurable upper bound on how much of the inverse-rendering literature's machinery is actually necessary; the base model already separates illumination from reflectance internally, and the open question shifts to whether the separation is *correct* (matches physically computed albedo) rather than whether it exists.
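The "× empirical null p99" comparisons can be reproduced in miniature. A common construction, sketched here on simulated paired differences (the head count, effect size, and sign-flip scheme are illustrative assumptions, not the probe's exact null), flips the sign of each pair's A/B difference to simulate exchangeable condition labels:

```python
import numpy as np

rng = np.random.default_rng(1)
n_pairs, n_heads = 25, 2000                    # smaller head count for the sketch
diff = rng.normal(0.6, 1.0, (n_pairs, n_heads))  # simulated B - A differences

def paired_t(x):
    """Per-head paired t-statistic over the pair axis."""
    return x.mean(axis=0) / (x.std(axis=0, ddof=1) / np.sqrt(len(x)))

observed = int((np.abs(paired_t(diff)) > 3).sum())

# Empirical null: randomly flip each pair's sign (exchangeable A/B labels)
# and record how many heads exceed |t| > 3 by chance.
null_counts = []
for _ in range(500):
    signs = rng.choice([-1.0, 1.0], size=(n_pairs, 1))
    null_counts.append(int((np.abs(paired_t(diff * signs)) > 3).sum()))
p99 = float(np.percentile(null_counts, 99))
ratio = observed / max(p99, 1.0)               # guard against a zero-count null
```

The reported 7.6× and 1,489× figures are this `ratio` at thresholds 3 and 5 respectively, computed on the real captures.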
## Status
Probe complete. No LoRA training; this is a base-model interpretability finding.
## Limitations
The probe establishes that Klein has a separable representation of "rendered photograph" vs. "albedo." It does not establish that the predicted albedo is *correct*, i.e., physically faithful to a held-out lighting decomposition. A natural follow-up generates the predicted albedo at full inference depth and compares to physically computed albedo on a synthetic-image test set with known intrinsics.
The pair count is 25; per-head reproducibility (r=0.91) is high, but a larger sweep would tighten estimates and enable per-block confidence intervals.
The probe is correlational: head selectivity for the prompt contrast does not by itself show that these heads causally implement the decomposition.
## License
Apache 2.0, matching the base FLUX.2 Klein 4B license.
## References
- Gabeur, V., Long, S., Peng, S., et al. *Image Generators are Generalist Vision Learners.* [arXiv:2604.20329](https://arxiv.org/abs/2604.20329) (2026).
- Black Forest Labs. *FLUX.2 Klein.* https://bfl.ai/models/flux-2-klein (2025).