Upload technical_report_p32.md with huggingface_hub
technical_report_p32.md CHANGED (+7 -5)
@@ -37,11 +37,13 @@ The patchification is applied as an outer wrapper:
 Running stats (mean/variance for whitening) are tracked in the patchified
 512-channel space.
 
-Semantic alignment
-
-
-
-
+Semantic alignment operates **before** patchification, in the 128-channel
+space. The VP posterior variance expansion loss and the latent scale penalty
+(log-variance L2) operate **after** patchification, in the 512-channel
+space. The variance expansion loss is per-element, so this is equivalent to
+operating before patchification (the 2x2 reshape is lossless and the mean
+over all elements is invariant to the rearrangement). Running stats for
+whitening are also tracked in the patchified space.
 
 ## Training
 
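
The equivalence claimed in the added paragraph (a per-element loss averaged over all elements gives the same value before and after the 2x2 patchification) can be checked numerically. The following is a minimal sketch, not the report's implementation: `patchify_2x2` is a hypothetical space-to-depth layout, the tensor shape is arbitrary, and `per_element_loss` is a placeholder squared log-variance penalty standing in for any element-wise loss.

```python
import numpy as np


def patchify_2x2(x):
    """Hypothetical space-to-depth: (C, H, W) -> (4C, H/2, W/2)."""
    c, h, w = x.shape
    # Split each spatial axis into (blocks, 2) so every 2x2 patch is explicit.
    x = x.reshape(c, h // 2, 2, w // 2, 2)
    # Move the two within-patch axes next to the channel axis.
    x = x.transpose(0, 2, 4, 1, 3)
    # Fold the 2x2 patch into the channel dimension.
    return x.reshape(c * 4, h // 2, w // 2)


def per_element_loss(logvar):
    # Placeholder element-wise penalty (assumption, not the report's loss).
    return logvar ** 2


rng = np.random.default_rng(0)
logvar = rng.normal(size=(128, 8, 8))   # 128-channel space
patched = patchify_2x2(logvar)          # 512-channel space

loss_before = per_element_loss(logvar).mean()
loss_after = per_element_loss(patched).mean()
```

Because the reshape only permutes elements, `loss_before` and `loss_after` agree up to floating-point summation order. The same does not hold for statistics computed per channel, which is why the running stats for whitening are specified separately as living in the patchified space.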