Update TODO.md — σ=0 achieved, all blockers resolved, document findings
TODO.md
|
@@ -1,42 +1,73 @@
|
|
| 1 |
# TODO (Prioritised)
|
| 2 |
|
| 3 |
-
##
|
| 4 |
-
- ~~**Add candidate snapshot to beam logs**~~ ✅ Done — `beam_logging.py` logs candidates; `fix_and_inspect_logs.py` attaches `candidate_array` post‑hoc.
|
| 5 |
-
- ~~**Ensure gate values are booleans**~~ ✅ Done — `coerce_gates()` in `fix_and_inspect_logs.py` and `postprocess_logs.py`.
|
| 6 |
-
- ~~**Make tile transform nontrivial**~~ ⚠️ Partial — `ShiftedTile` exists in `transforms.py` but is **not wired into `default_atomic_factory`**. The beam only sees vanilla `tile_to_target` (idempotent), so σ never decreases. **This is the critical blocker.**
|
| 7 |
-
- ~~**Implement robust fill_enclosed**~~ ✅ Done — BFS implementation in `solver_core.py`.
|
| 8 |
|
| 9 |
-
|
| 10 |
-
The beam composes `tile_to_target ∘ tile_to_target` which is idempotent — tiling an already-tiled grid returns the same grid. The atomic library needs transforms that can actually express the input→target mapping.
|
| 11 |
|
| 12 |
-
|
| 13 |
-
1. **
|
| 14 |
-
2. **
|
| 15 |
-
3. **
|
| 16 |
-
|
| 17 |
|
| 18 |
## Short term
|
| 19 |
- ~~Add CLI entrypoint with `--use_wandb` flag~~ ✅ Done — `scripts/entrypoint.py`.
|
| 20 |
-
- Add unit tests for
|
| 21 |
-
- `tests.py` has `run_atomic_effects` and `transform_effect_test` but no proper pytest suite yet.
|
| 22 |
- Add small visualization notebook for `phi_best`, diff maps, and Layer‑1 masks.
|
| 23 |
|
| 24 |
## Medium term
|
| 25 |
-
- ~~Improve Layer‑1 mask generation~~ ✅ Done
|
| 26 |
- Add a toggle to include/exclude `candidate_array` in logs to control log size.
|
| 27 |
-
- ~~Create a reproducible benchmark harness
|
| 28 |
|
| 29 |
## Long term
|
| 30 |
-
- ~~Integrate a safe external W&B uploader~~ ✅ Done
|
| 31 |
-
-
|
| 32 |
- Document reproducibility steps and expected outputs for each example task.
|
| 33 |
|
| 34 |
## Code hygiene (completed)
|
| 35 |
-
- ~~Duplicate `Transform` class in `transforms.py`~~ ✅ Fixed
|
| 36 |
-
- ~~Duplicate imports/paste blocks in `solver_core.py`~~ ✅ Fixed
|
| 37 |
-
- ~~Lambda closure bug in `default_atomic_factory`~~ ✅ Fixed
|
| 38 |
-
- ~~`wandb_runner.py` `int(generate_id(), 36)` crash~~ ✅ Fixed
|
| 39 |
-
- ~~`minimal_runner.py` TARGET was all‑zeros~~ ✅ Fixed
|
| 40 |
-
- ~~README.md referenced non‑existent paths~~ ✅ Fixed
|
| 41 |
-
- ~~Committed `.pyc` files~~ ✅ Fixed
|
| 42 |
-
- ~~`itt_solver/README.md.md` double extension~~ ✅ Fixed
|
|
| 1 |
# TODO (Prioritised)
|
| 2 |
|
| 3 |
+
## ✅ σ=0 ACHIEVED — Solver works!
|
| 4 |
|
| 5 |
+
The solver now finds `Id∘KroneckerSelfSimilar` at depth 1 and achieves **σ=0** on all 6 pairs of ARC task 007bbfb7.
|
| 6 |
|
| 7 |
+
### What was wrong
|
| 8 |
+
1. **Wrong target** — the repo's example1 target had 4 incorrect cells (row 7). Discovered by cross-referencing against the real ARC dataset (`data/training/007bbfb7.json`). Fixed.
|
| 9 |
+
2. **Limited transform library** — the beam only had vanilla tile, fill_enclosed, rotate90, reflect_h. None could express the Kronecker self-similar pattern. Fixed: added 19 new transforms.
|
| 10 |
+
3. **Beam only tried resized input** — shape-changing transforms (Kronecker: 3×3→9×9) need the original input, not the tiled 9×9 intermediate. The beam now uses a dual strategy: each transform is tried on both the resized field AND the original input.
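
The dual strategy in item 3 can be sketched roughly as follows. This is a minimal illustration, not the repo's beam code: `expand_dual`, `mismatch`, and the transform names are hypothetical, and the mismatch count is only an assumed stand-in for σ, which this TODO does not define.

```python
from typing import Callable, Dict, List, Tuple

import numpy as np

Grid = np.ndarray
Transform = Callable[[Grid], Grid]


def mismatch(candidate: Grid, target: Grid) -> int:
    """Count differing cells (assumed stand-in for sigma)."""
    return int(np.sum(candidate != target))


def expand_dual(
    field: Grid, original: Grid, target: Grid, transforms: Dict[str, Transform]
) -> List[Tuple[int, str, str, Grid]]:
    """Try every transform on both the resized field and the original input."""
    scored = []
    for name, fn in transforms.items():
        for source, grid in (("resized", field), ("original", original)):
            candidate = fn(grid)
            # Only candidates with the target's shape can be compared cell by cell.
            if candidate.shape != target.shape:
                continue
            scored.append((mismatch(candidate, target), name, source, candidate))
    scored.sort(key=lambda item: item[0])  # lowest mismatch first
    return scored


# Illustrative 3x3 input; the resized field is the plain 3x3 tiling of it.
original = np.array([[0, 7, 7], [7, 7, 7], [0, 7, 7]])
field = np.tile(original, (3, 3))
target = np.kron((original != 0).astype(int), original)

transforms = {
    "rotate90": np.rot90,
    "kron_self_similar": lambda g: np.kron((g != 0).astype(int), g),
}
results = expand_dual(field, original, target, transforms)
best_score, best_name, best_source, _ = results[0]
print(best_score, best_name, best_source)  # 0 kron_self_similar original
```

Applied to the tiled 9×9 field, the Kronecker transform yields an 81×81 grid and is discarded by the shape check; only the application to the original 3×3 input can match the 9×9 target.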
|
| 11 |
+
|
| 12 |
+
### The actual transformation (ARC task 007bbfb7)
|
| 13 |
+
`output = np.kron((input != 0).astype(int), input)` — a Kronecker product where the input's own nonzero mask determines the meta-layout for placing copies of itself.
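
A short runnable check of this formula, as a sketch: the function name `kronecker_self_similar` and the sample grid are illustrative choices, not taken from the repo.

```python
import numpy as np


def kronecker_self_similar(grid: np.ndarray) -> np.ndarray:
    """Place a copy of `grid` wherever `grid` itself is nonzero."""
    mask = (grid != 0).astype(grid.dtype)
    # Block (i, j) of the result is mask[i, j] * grid.
    return np.kron(mask, grid)


example = np.array([[0, 7, 7], [7, 7, 7], [0, 7, 7]])
result = kronecker_self_similar(example)

print(result.shape)   # (9, 9)
print(result[0:3, 0:3])  # all zeros, because example[0, 0] == 0
print(result[0:3, 3:6])  # a copy of example, because example[0, 1] != 0
```

Each 3×3 block of the output is either a copy of the input or all zeros, depending on whether the corresponding input cell is nonzero.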
|
| 14 |
+
|
| 15 |
+
---
|
| 16 |
+
|
| 17 |
+
## Immediate (blockers) — ALL RESOLVED
|
| 18 |
+
- ~~**Add candidate snapshot to beam logs**~~ ✅ Done.
|
| 19 |
+
- ~~**Ensure gate values are booleans**~~ ✅ Done.
|
| 20 |
+
- ~~**Make tile transform nontrivial**~~ ✅ Done — `ShiftedTile` wired in + 19 new transforms including `KroneckerSelfSimilar`.
|
| 21 |
+
- ~~**Implement robust fill_enclosed**~~ ✅ Done — BFS in `solver_core.py`.
|
| 22 |
+
- ~~**Fix σ=98 flatline**~~ ✅ Done — σ=0 on all 6 pairs.
|
| 23 |
|
| 24 |
## Short term
|
| 25 |
- ~~Add CLI entrypoint with `--use_wandb` flag~~ ✅ Done — `scripts/entrypoint.py`.
|
| 26 |
+
- ~~Add unit tests for transforms~~ ✅ Done — `tests/test_transforms.py` (40 tests, all pass).
|
| 27 |
- Add small visualization notebook for `phi_best`, diff maps, and Layer‑1 masks.
|
| 28 |
|
| 29 |
## Medium term
|
| 30 |
+
- ~~Improve Layer‑1 mask generation~~ ✅ Done.
|
| 31 |
- Add a toggle to include/exclude `candidate_array` in logs to control log size.
|
| 32 |
+
- ~~Create a reproducible benchmark harness~~ ✅ Done — `experiment_driver.sweep()` + `results.csv`.
|
| 33 |
+
- **Expand to more ARC tasks** — test on other 3×3→9×9 tasks and different task families.
|
| 34 |
+
- **Benchmark the enriched library** — run sweep across multiple ARC tasks, measure solve rate.
|
| 35 |
|
| 36 |
## Long term
|
| 37 |
+
- ~~Integrate a safe external W&B uploader~~ ✅ Done.
|
| 38 |
+
- **Build task loader for full ARC dataset** — load any task from `fchollet/ARC-AGI` by ID.
|
| 39 |
+
- **Add more transform families** — connected components, object extraction, Voronoi fill (see Icecuber DSL: arXiv:2402.03507).
|
| 40 |
+
- **Automated evaluation harness** — run solver on all 400 ARC training tasks, report solve rate.
|
| 41 |
- Document reproducibility steps and expected outputs for each example task.
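
For the task-loader item above, one possible starting point is sketched below. It assumes the standard ARC JSON layout (top-level `train` and `test` lists of `{"input": ..., "output": ...}` grids) and a local checkout under `data/training/`; `load_arc_task` is a hypothetical name, not an existing function in this repo.

```python
import json
from pathlib import Path
from typing import Dict, List, Optional, Tuple

import numpy as np

Pair = Tuple[np.ndarray, Optional[np.ndarray]]


def load_arc_task(task_id: str, root: str = "data/training") -> Dict[str, List[Pair]]:
    """Load one ARC task by ID and convert its grids to NumPy arrays."""
    path = Path(root) / f"{task_id}.json"
    with path.open() as fh:
        raw = json.load(fh)

    def to_pairs(items: list) -> List[Pair]:
        pairs = []
        for item in items:
            # Some test pairs may ship without an output grid; keep None then.
            output = np.array(item["output"]) if "output" in item else None
            pairs.append((np.array(item["input"]), output))
        return pairs

    return {"train": to_pairs(raw["train"]), "test": to_pairs(raw["test"])}
```

Usage would look like `task = load_arc_task("007bbfb7")`, after which `task["train"]` is a list of `(input, output)` array pairs.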
|
| 42 |
|
| 43 |
## Code hygiene (completed)
|
| 44 |
+
- ~~Duplicate `Transform` class in `transforms.py`~~ ✅ Fixed.
|
| 45 |
+
- ~~Duplicate imports/paste blocks in `solver_core.py`~~ ✅ Fixed.
|
| 46 |
+
- ~~Lambda closure bug in `default_atomic_factory`~~ ✅ Fixed.
|
| 47 |
+
- ~~`wandb_runner.py` `int(generate_id(), 36)` crash~~ ✅ Fixed.
|
| 48 |
+
- ~~`minimal_runner.py` TARGET was all‑zeros~~ ✅ Fixed.
|
| 49 |
+
- ~~README.md referenced non‑existent paths~~ ✅ Fixed.
|
| 50 |
+
- ~~Committed `.pyc` files~~ ✅ Fixed.
|
| 51 |
+
- ~~`itt_solver/README.md.md` double extension~~ ✅ Fixed.
|
| 52 |
+
- ~~Wrong target in example1 (4 cells off)~~ ✅ Fixed — corrected in `entrypoint.py`, `experiments_analysis.py`, `fix_and_inspect_logs.py`, `minimal_runner.py`.
|
| 53 |
+
- ~~Beam search only applied transforms to resized field~~ ✅ Fixed — dual strategy (resized + original).
|
| 54 |
+
|
| 55 |
+
## New transforms added (19 total)
|
| 56 |
+
| Transform | Description |
|
| 57 |
+
|---|---|
|
| 58 |
+
| `KroneckerSelfSimilar` | `kron((I≠0), I)` — self-similar meta-layout |
|
| 59 |
+
| `KroneckerSelfSimilarInv` | `kron(I, (I≠0))` — mirror variant |
|
| 60 |
+
| `MirrorTileH` | `[abc\|cba]` horizontal mirror |
|
| 61 |
+
| `MirrorTileV` | Vertical mirror stack |
|
| 62 |
+
| `MirrorTile4Way` | Full kaleidoscope (D4) |
|
| 63 |
+
| `Upscale(2)` / `Upscale(3)` | Pixel-repeat zoom |
|
| 64 |
+
| `Downscale(2)` | Subsample (inverse of upscale) |
|
| 65 |
+
| `StackH(3)` / `StackV(3)` | Tile horizontally/vertically |
|
| 66 |
+
| `RetainColor(c)` | Keep only color c |
|
| 67 |
+
| `RemoveColor(c)` | Zero out color c |
|
| 68 |
+
| `InvertColors` | Swap black ↔ top color |
|
| 69 |
+
| `GravityDown` / `GravityUp` | Pixels fall/rise in columns |
|
| 70 |
+
| `OverlayTransparent(bg)` | Transparent overlay on background |
|
| 71 |
+
| `CropToContent` | Crop to non-zero bounding box |
|
| 72 |
+
| `Transpose` | Matrix transpose |
|
| 73 |
+
| `ShiftedTile` | Tile with roll offset |
|