Add full NYU Eigen test table to pending/ README
pending/README.md (+16 −6)
@@ -8,16 +8,26 @@ Staged weights from a follow-up training run that's substantially better than th

- `text_encoder/` – full Qwen3-4B text encoder with the rank-32 text-encoder LoRA fused in (`merge_and_unload`). Loads as a drop-in `text_encoder` for `Flux2KleinPipeline`.
- `tokenizer/` – Qwen3 tokenizer (unchanged from base; included so the pending/ folder is self-contained).

## Full NYU Eigen test (490 frames, 28 inference steps at 768 × 768)

| metric | this checkpoint | rank-32 baseline (full set) | Vision Banana paper (full set) |
|---|---|---|---|
| δ₁ | **0.745** | 0.370 | 0.948 |
| δ₂ | 0.958 | – | – |
| δ₃ | 0.988 | – | – |
| AbsRel | **0.163** | 0.461 | 0.074 |
| RMSE | **0.596 m** | 1.566 m | – |

This checkpoint doubles δ₁ and more than halves RMSE versus the rank-32 LoRA baseline at the same architecture level. δ₁ is still ~0.20 below the paper's level, but the paper full-instruction-tunes a hundreds-of-billions-parameter Nano Banana Pro on its original training mixture; this is a rank-256 LoRA on a 4B open base at 23% of a 15 000-step schedule.
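For reference, the table's metrics are the standard monocular-depth evaluation quantities (δ thresholds at 1.25ⁿ, absolute relative error, and RMSE in metres). A minimal NumPy sketch of how they are conventionally computed; this helper is illustrative and not part of this repo's eval code:

```python
import numpy as np

def depth_metrics(pred, gt):
    """Standard monocular-depth metrics over valid (gt > 0) pixels."""
    pred, gt = np.asarray(pred, dtype=float), np.asarray(gt, dtype=float)
    mask = gt > 0                       # NYU eval masks out invalid depth
    pred, gt = pred[mask], gt[mask]
    ratio = np.maximum(pred / gt, gt / pred)  # per-pixel max(pred/gt, gt/pred)
    return {
        "delta1": float(np.mean(ratio < 1.25)),       # δ₁
        "delta2": float(np.mean(ratio < 1.25 ** 2)),  # δ₂
        "delta3": float(np.mean(ratio < 1.25 ** 3)),  # δ₃
        "absrel": float(np.mean(np.abs(pred - gt) / gt)),
        "rmse": float(np.sqrt(np.mean((pred - gt) ** 2))),  # metres if inputs are metres
    }
```

δ₂ and δ₃ are the same ratio test with looser thresholds (1.25² ≈ 1.56, 1.25³ ≈ 1.95), which is why they sit above δ₁ in the table.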
## Hardest 10 NYU frames (worst per-frame RMSE on the rank-32 baseline)

| metric | rank-32 baseline | this checkpoint |
|---|---|---|
| RMSE (m) | 3.0–4.7 | **0.436** |
| δ₁ | 0.00–0.38 | **0.819** |

Going from 3–4 m catastrophic failure to 0.44 m on those exact scenes, with δ₁ jumping from near-zero to 0.82, is what motivated the full eval.
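Selecting that subset is just a per-frame ranking by baseline RMSE. A small illustrative sketch (hypothetical helper, not this repo's tooling) of picking the worst-k frames:

```python
import numpy as np

def worst_k_frames(preds, gts, k=10):
    """Rank frames by per-frame RMSE on valid (gt > 0) pixels; return worst-k indices."""
    rmses = []
    for pred, gt in zip(preds, gts):
        pred, gt = np.asarray(pred, dtype=float), np.asarray(gt, dtype=float)
        mask = gt > 0
        rmses.append(float(np.sqrt(np.mean((pred[mask] - gt[mask]) ** 2))))
    order = np.argsort(rmses)      # ascending RMSE
    return list(order[-k:][::-1])  # worst frame first
```

Evaluating the new checkpoint on exactly these baseline-selected indices keeps the comparison scene-for-scene, rather than comparing each model on its own worst frames.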
## Training