rogermt committed on
Commit
e2ddfc5
·
verified ·
1 Parent(s): b0f731e

Update TODO.md — σ=0 achieved, all blockers resolved, document findings

Files changed (1)
  1. TODO.md +57 -26
TODO.md CHANGED
@@ -1,42 +1,73 @@
1
  # TODO (Prioritised)
2
 
3
- ## Immediate (blockers)
4
- - ~~**Add candidate snapshot to beam logs**~~ ✅ Done — `beam_logging.py` logs candidates; `fix_and_inspect_logs.py` attaches `candidate_array` post‑hoc.
5
- - ~~**Ensure gate values are booleans**~~ ✅ Done — `coerce_gates()` in `fix_and_inspect_logs.py` and `postprocess_logs.py`.
6
- - ~~**Make tile transform nontrivial**~~ ⚠️ Partial — `ShiftedTile` exists in `transforms.py` but is **not wired into `default_atomic_factory`**. The beam only sees vanilla `tile_to_target` (idempotent), so σ never decreases. **This is the critical blocker.**
7
- - ~~**Implement robust fill_enclosed**~~ ✅ Done — BFS implementation in `solver_core.py`.
8
 
9
- ### ➡️ Next: Fix σ=98 flatline (residue never decreases)
10
- The beam composes `tile_to_target ∘ tile_to_target` which is idempotent — tiling an already-tiled grid returns the same grid. The atomic library needs transforms that can actually express the input→target mapping.
11
 
12
- **Plan:**
13
- 1. **Wire `ShiftedTile` into `default_atomic_factory`** so the beam can explore shifted tilings.
14
- 2. **Add region‑aware placement transforms** — the example1 target is a 3×3 arrangement of sub‑blocks where each 3×3 quadrant is a conditional variant of the input (some reflected, some identity, some zero‑filled). The library needs transforms that place sub‑blocks at specific offsets.
15
- 3. **Add conditional fill / mask‑based composition** — transforms that zero‑out or overwrite specific quadrants based on input symmetry.
16
- 4. **Re‑run sweep** with enriched library and verify σ decreases below 98.
17
 
18
  ## Short term
19
  - ~~Add CLI entrypoint with `--use_wandb` flag~~ ✅ Done — `scripts/entrypoint.py`.
20
- - Add unit tests for `tile_transform`, `fill_enclosed`, and transform `.apply()` semantics.
21
- - `tests.py` has `run_atomic_effects` and `transform_effect_test` but no proper pytest suite yet.
22
  - Add small visualization notebook for `phi_best`, diff maps, and Layer‑1 masks.
23
 
24
  ## Medium term
25
- - ~~Improve Layer‑1 mask generation~~ ✅ Done — percentile + min_abs + dilation in `layer_minus_one.py`.
26
  - Add a toggle to include/exclude `candidate_array` in logs to control log size.
27
- - ~~Create a reproducible benchmark harness for parameter sweeps and CSV aggregation~~ ✅ Done — `experiment_driver.sweep()` + `results.csv`. Needs more tasks.
 
 
28
 
29
  ## Long term
30
- - ~~Integrate a safe external W&B uploader~~ ✅ Done — `wandb_runner.py` runs after experiments finish, decoupled from core.
31
- - Add more ARC tasks and automated evaluation harness.
 
 
32
  - Document reproducibility steps and expected outputs for each example task.
33
 
34
  ## Code hygiene (completed)
35
- - ~~Duplicate `Transform` class in `transforms.py`~~ ✅ Fixed — now imports from `solver_core`.
36
- - ~~Duplicate imports/paste blocks in `solver_core.py`~~ ✅ Fixed — single clean definition.
37
- - ~~Lambda closure bug in `default_atomic_factory`~~ ✅ Fixed — captures `target_shape` by value.
38
- - ~~`wandb_runner.py` `int(generate_id(), 36)` crash~~ ✅ Fixed — uses string ID directly.
39
- - ~~`minimal_runner.py` TARGET was all‑zeros~~ ✅ Fixed — uses real example1 target.
40
- - ~~README.md referenced non‑existent paths~~ ✅ Fixed — corrected to actual repo structure.
41
- - ~~Committed `.pyc` files~~ ✅ Fixed — deleted + `.gitignore` added.
42
- - ~~`itt_solver/README.md.md` double extension~~ ✅ Fixed — renamed to `README.md`.
1
  # TODO (Prioritised)
2
 
3
+ ## σ=0 ACHIEVED — Solver works!
4
 
5
+ The solver now finds `Id∘KroneckerSelfSimilar` at depth 1 and achieves **σ=0** on all 6 pairs of ARC task 007bbfb7.
 
6
 
7
+ ### What was wrong
8
+ 1. **Wrong target**: the repo's example1 target had 4 incorrect cells (row 7), discovered by cross-referencing against the real ARC dataset (`data/training/007bbfb7.json`). Fixed.
9
+ 2. **Limited transform library**: the beam only had vanilla tile, fill_enclosed, rotate90, and reflect_h. None could express the Kronecker self-similar pattern. Fixed: added 19 new transforms.
10
+ 3. **Beam only tried resized input**: shape-changing transforms (Kronecker: 3×3→9×9) need the original input, not the tiled 9×9 intermediate. The beam now uses a dual strategy: each transform is tried on both the resized field AND the original input.
11
+
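The dual strategy described in item 3 can be sketched roughly as follows. This is a minimal illustration, not the repo's beam code: `expand_dual`, `residue`, and the `(name, function)` transform pairs are hypothetical names, and the mismatch count used here is only a stand-in for the solver's σ, whose exact definition is not given in this diff.

```python
import numpy as np
from typing import Callable, List, Tuple

# Hypothetical transform signature: grid in, grid out.
Transform = Callable[[np.ndarray], np.ndarray]


def residue(candidate: np.ndarray, target: np.ndarray) -> int:
    """Stand-in score (NOT the repo's sigma): number of mismatching cells.

    A candidate with the wrong shape gets the worst possible score.
    """
    if candidate.shape != target.shape:
        return int(target.size)
    return int(np.sum(candidate != target))


def expand_dual(
    field: np.ndarray,
    original: np.ndarray,
    target: np.ndarray,
    transforms: List[Tuple[str, Transform]],
) -> List[Tuple[int, str, np.ndarray]]:
    """Try every transform on both the resized field and the original input."""
    scored = []
    for name, fn in transforms:
        for label, source in (("resized", field), ("original", original)):
            candidate = fn(source)
            scored.append((residue(candidate, target), f"{name}@{label}", candidate))
    # Lowest residue first; sort on the score only so arrays are never compared.
    scored.sort(key=lambda item: item[0])
    return scored


# Usage: a 3x3 input whose target is the 9x9 Kronecker self-similar grid.
original = np.array([[0, 7, 7], [7, 7, 7], [0, 7, 7]])
field = np.tile(original, (3, 3))  # the resized (tiled) 9x9 intermediate
target = np.kron((original != 0).astype(original.dtype), original)
transforms = [
    ("kron", lambda g: np.kron((g != 0).astype(g.dtype), g)),
    ("identity", lambda g: g),
]
ranked = expand_dual(field, original, target, transforms)
print(ranked[0][0], ranked[0][1])
```

In this toy setup the shape-changing transform only matches the target when it is applied to the original 3×3 input; applied to the tiled field it yields an 81×81 grid and is scored as a miss.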
12
+ ### The actual transformation (ARC task 007bbfb7)
13
+ `output = np.kron((input != 0).astype(int), input)` — a Kronecker product where the input's own nonzero mask determines the meta-layout for placing copies of itself.
14
+
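As a small worked illustration of the formula above, the snippet below applies it to a 3×3 grid. The function name `kronecker_self_similar` and the example grid are assumptions made for illustration; the repo's `KroneckerSelfSimilar` class may be structured differently.

```python
import numpy as np


def kronecker_self_similar(grid: np.ndarray) -> np.ndarray:
    """Place a copy of `grid` wherever `grid` itself is nonzero."""
    mask = (grid != 0).astype(grid.dtype)  # 1 where a copy goes, 0 elsewhere
    return np.kron(mask, grid)


# Illustrative 3x3 input (not taken from the repo's example files).
example = np.array([[0, 7, 7],
                    [7, 7, 7],
                    [0, 7, 7]])
out = kronecker_self_similar(example)
print(out.shape)
print(out)
```

Each 3×3 block of the 9×9 output is either a copy of the input (where the corresponding input cell is nonzero) or all zeros (where it is zero).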
15
+ ---
16
+
17
+ ## Immediate (blockers) — ALL RESOLVED
18
+ - ~~**Add candidate snapshot to beam logs**~~ ✅ Done.
19
+ - ~~**Ensure gate values are booleans**~~ ✅ Done.
20
+ - ~~**Make tile transform nontrivial**~~ ✅ Done — `ShiftedTile` wired in + 19 new transforms including `KroneckerSelfSimilar`.
21
+ - ~~**Implement robust fill_enclosed**~~ ✅ Done — BFS in `solver_core.py`.
22
+ - ~~**Fix σ=98 flatline**~~ ✅ Done — σ=0 on all 6 pairs.
23
 
24
  ## Short term
25
  - ~~Add CLI entrypoint with `--use_wandb` flag~~ ✅ Done — `scripts/entrypoint.py`.
26
+ - ~~Add unit tests for transforms~~ ✅ Done: `tests/test_transforms.py` (40 tests, all pass).
 
27
  - Add small visualization notebook for `phi_best`, diff maps, and Layer‑1 masks.
28
 
29
  ## Medium term
30
+ - ~~Improve Layer‑1 mask generation~~ ✅ Done.
31
  - Add a toggle to include/exclude `candidate_array` in logs to control log size.
32
+ - ~~Create a reproducible benchmark harness~~ ✅ Done — `experiment_driver.sweep()` + `results.csv`.
33
+ - **Expand to more ARC tasks** — test on other 3×3→9×9 tasks and different task families.
34
+ - **Benchmark the enriched library** — run sweep across multiple ARC tasks, measure solve rate.
35
 
36
  ## Long term
37
+ - ~~Integrate a safe external W&B uploader~~ ✅ Done.
38
+ - **Build task loader for full ARC dataset**: load any task from `fchollet/ARC-AGI` by ID.
39
+ - **Add more transform families** — connected components, object extraction, voronoi fill (see Icecuber DSL: arxiv:2402.03507).
40
+ - **Automated evaluation harness** — run solver on all 400 ARC training tasks, report solve rate.
41
  - Document reproducibility steps and expected outputs for each example task.
42
 
43
  ## Code hygiene (completed)
44
+ - ~~Duplicate `Transform` class in `transforms.py`~~ ✅ Fixed.
45
+ - ~~Duplicate imports/paste blocks in `solver_core.py`~~ ✅ Fixed.
46
+ - ~~Lambda closure bug in `default_atomic_factory`~~ ✅ Fixed.
47
+ - ~~`wandb_runner.py` `int(generate_id(), 36)` crash~~ ✅ Fixed.
48
+ - ~~`minimal_runner.py` TARGET was all‑zeros~~ ✅ Fixed.
49
+ - ~~README.md referenced non‑existent paths~~ ✅ Fixed.
50
+ - ~~Committed `.pyc` files~~ ✅ Fixed.
51
+ - ~~`itt_solver/README.md.md` double extension~~ ✅ Fixed.
52
+ - ~~Wrong target in example1 (4 cells off)~~ ✅ Fixed — corrected in entrypoint.py, experiments_analysis.py, fix_and_inspect_logs.py, minimal_runner.py.
53
+ - ~~Beam search only applied transforms to resized field~~ ✅ Fixed — dual-strategy (resized + original).
54
+
55
+ ## New transforms added (19 total)
56
+ | Transform | Description |
57
+ |---|---|
58
+ | `KroneckerSelfSimilar` | `kron((I≠0), I)` — self-similar meta-layout |
59
+ | `KroneckerSelfSimilarInv` | `kron(I, (I≠0))` — mirror variant |
60
+ | `MirrorTileH` | `[abc\|cba]` horizontal mirror |
61
+ | `MirrorTileV` | Vertical mirror stack |
62
+ | `MirrorTile4Way` | Full kaleidoscope (D4) |
63
+ | `Upscale(2)` / `Upscale(3)` | Pixel-repeat zoom |
64
+ | `Downscale(2)` | Subsample (inverse of upscale) |
65
+ | `StackH(3)` / `StackV(3)` | Tile horizontally/vertically |
66
+ | `RetainColor(c)` | Keep only color c |
67
+ | `RemoveColor(c)` | Zero out color c |
68
+ | `InvertColors` | Swap black ↔ top color |
69
+ | `GravityDown` / `GravityUp` | Pixels fall/rise in columns |
70
+ | `OverlayTransparent(bg)` | Transparent overlay on background |
71
+ | `CropToContent` | Crop to non-zero bounding box |
72
+ | `Transpose` | Matrix transpose |
73
+ | `ShiftedTile` | Tile with roll offset |
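To make a few of the table entries concrete, here is a hedged NumPy sketch of what `MirrorTileH`, `Upscale(k)`, and `GravityDown` might compute, based only on the one-line descriptions in the table. The function names and exact semantics (for example, that gravity keeps the top-to-bottom order of nonzero cells and treats 0 as empty) are assumptions, not the repo's implementations.

```python
import numpy as np


def mirror_tile_h(grid: np.ndarray) -> np.ndarray:
    """[abc|cba]: the grid followed by its left-right mirror image."""
    return np.hstack([grid, np.fliplr(grid)])


def upscale(grid: np.ndarray, k: int) -> np.ndarray:
    """Pixel-repeat zoom: every cell becomes a k x k block."""
    return np.repeat(np.repeat(grid, k, axis=0), k, axis=1)


def gravity_down(grid: np.ndarray) -> np.ndarray:
    """Nonzero cells fall to the bottom of their column (assumes 0 is empty)."""
    out = np.zeros_like(grid)
    for col in range(grid.shape[1]):
        column = grid[:, col]
        values = column[column != 0]  # keeps top-to-bottom order
        if values.size:  # guard: out[-0:] would select the whole column
            out[-values.size:, col] = values
    return out


mirrored = mirror_tile_h(np.array([[1, 2, 3]]))
zoomed = upscale(np.array([[1, 2]]), 2)
fallen = gravity_down(np.array([[1, 0],
                                [0, 2],
                                [3, 0]]))
print(mirrored)
print(zoomed)
print(fallen)
```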