Update results: 40/400 (10.0%) solved — +9 new tasks, 0 regressions, greedy stacker working
arc_results/RESULTS.md CHANGED (+56 -51)
|
@@ -1,17 +1,33 @@
|
|
| 1 |
# PEMF Solver — ARC-AGI Training Set Evaluation
|
| 2 |
|
| 3 |
-
## Results
|
| 4 |
|
| 5 |
-
| Metric |
|
| 6 |
-
|---|---|
|
| 7 |
-
| **
|
| 8 |
-
|
|
| 9 |
-
|
|
| 10 |
-
| **
|
| 11 |
-
| **
|
| 12 |
-
| **
|
| 13 |
|
| 14 |
-
## Solved Tasks
|
| 15 |
|
| 16 |
| Task ID | Transform | Family |
|
| 17 |
|---|---|---|
|
|
@@ -19,13 +35,18 @@
|
|
| 19 |
| 1190e5a7 | KroneckerSelfSimilarInv | Self-similar |
|
| 20 |
| 1cf80156 | CropToContent | Crop |
|
| 21 |
| 1e0a9b12 | GravityDown | Gravity |
|
|
|
|
| 22 |
| 2013d3e2 | CropToContent | Crop |
|
|
|
|
|
|
|
| 23 |
| 239be575 | tile_to_target | Tiling |
|
|
|
|
| 24 |
| 28bf18c6 | CropToContent | Crop |
|
| 25 |
| 2dee498d | tile_to_target | Tiling |
|
| 26 |
| 3906de3d | GravityUp | Gravity |
|
| 27 |
| 3af2c5a8 | MirrorTileH | Mirror |
|
| 28 |
| 3c9b0459 | Rotate_180 | Rotation |
|
|
|
|
| 29 |
| 6150a2bd | Rotate_180 | Rotation |
|
| 30 |
| 62c24649 | MirrorTile4Way | Mirror |
|
| 31 |
| 67a3c6ac | Reflect_v | Reflection |
|
|
@@ -33,62 +54,46 @@
|
|
| 33 |
| 68b16354 | Reflect_h | Reflection |
|
| 34 |
| 6d0aefbc | MirrorTileH | Mirror |
|
| 35 |
| 6fa7a44f | MirrorTileV | Mirror |
|
|
|
|
| 36 |
| 74dd1130 | Transpose | Transpose |
|
| 37 |
| 7b7f7511 | tile_to_target | Tiling |
|
| 38 |
| 8be77c9e | MirrorTileV | Mirror |
|
| 39 |
| 9172f3a0 | Upscale_3x | Upscale |
|
| 40 |
| 9dfd6313 | Transpose | Transpose |
|
| 41 |
| a416b8f3 | tile_to_target | Tiling |
|
|
|
|
| 42 |
| c59eb873 | Upscale_2x | Upscale |
|
| 43 |
| c9e6f938 | MirrorTileH | Mirror |
|
| 44 |
| d10ecb37 | tile_to_target | Tiling |
|
| 45 |
-
| d631b094 |
|
| 46 |
| d9fac9be | tile_to_target | Tiling |
|
| 47 |
-
| de1cd16c |
|
|
|
|
|
|
|
| 48 |
| ed36ccf7 | Rotate_90 | Rotation |
|
| 49 |
|
| 50 |
-
##
|
| 51 |
-
|
| 52 |
-
| Transform | Pairs |
|
| 53 |
-
|---|---|
|
| 54 |
-
| tile_to_target | 34 |
|
| 55 |
-
| CropToContent | 19 |
|
| 56 |
-
| MirrorTile4Way | 11 |
|
| 57 |
-
| Rotate_90 | 9 |
|
| 58 |
-
| MirrorTileH | 8 |
|
| 59 |
-
| Rotate_180 | 7 |
|
| 60 |
-
| MirrorTileV | 7 |
|
| 61 |
-
| Transpose | 7 |
|
| 62 |
-
| Upscale_2x | 6 |
|
| 63 |
-
| ShiftedTile | 6 |
|
| 64 |
-
| Upscale_3x | 6 |
|
| 65 |
-
| KroneckerSelfSimilar | 5 |
|
| 66 |
-
| GravityUp | 5 |
|
| 67 |
-
| GravityDown | 4 |
|
| 68 |
-
| Reflect_h | 4 |
|
| 69 |
-
| KroneckerSelfSimilarInv | 3 |
|
| 70 |
-
| Reflect_v | 3 |
|
| 71 |
-
| InvertColors | 1 |
|
| 72 |
-
| Rotate_270 | 1 |
|
| 73 |
-
|
| 74 |
-
**Every single new transform contributed to at least one solve.**
|
| 75 |
-
|
| 76 |
-
## Unsolved σ Distribution
|
| 77 |
|
| 78 |
-
|
| 79 |
-
|
| 80 |
-
|
| 81 |
-
|
| 82 |
-
|
| 83 |
-
|
| 84 |
-
|
| 85 |
-
|
| 86 |
-
|
|
| 87 |
|
| 88 |
-
|
|
|
|
| 89 |
|
| 90 |
-
##
|
| 91 |
-
|
| 92 |
|
| 1 |
# PEMF Solver — ARC-AGI Training Set Evaluation
|
| 2 |
|
| 3 |
+
## Results (v2 — object layer + greedy stacker)
|
| 4 |
|
| 5 |
+
| Metric | v1 | **v2** | Change |
|
| 6 |
+
|---|---|---|---|
|
| 7 |
+
| **Tasks solved** | 31 (7.8%) | **40 (10.0%)** | **+9 (+29%)** |
|
| 8 |
+
| ≥1 pair solved | 59 (14.8%) | — | — |
|
| 9 |
+
| Pairs solved | 146 (11.2%) | — | — |
|
| 10 |
+
| Transforms | 19 | **33** | +14 |
|
| 11 |
+
| Total time | 17.1s | **50.8s** | +33.7s (greedy stacker) |
|
| 12 |
+
| Regressions | — | **0** | — |
|
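The headline percentages in the table above can be rechecked with a few lines of arithmetic. This is only a sanity check on the reported counts; the variable names are illustrative and not part of the solver.

```python
# Recompute the headline rates from the counts reported in the table.
TOTAL_TASKS = 400            # size of the ARC-AGI training set used here
v1_solved, v2_solved = 31, 40

v1_rate = 100 * v1_solved / TOTAL_TASKS        # percent of tasks solved in v1
v2_rate = 100 * v2_solved / TOTAL_TASKS        # percent of tasks solved in v2
gain = v2_solved - v1_solved                   # absolute number of new solves
relative_gain = 100 * gain / v1_solved         # improvement relative to v1

print(f"v1: {v1_rate:.2f}%  v2: {v2_rate:.2f}%  gain: +{gain} (+{relative_gain:.0f}%)")
```

The table rounds the v1 rate of 7.75% to 7.8%, and the relative gain of about 29.03% to +29%.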
| 13 |
|
| 14 |
+
## 9 Newly Solved Tasks
|
| 15 |
+
|
| 16 |
+
| Task ID | Transform | Category |
|
| 17 |
+
|---|---|---|
|
| 18 |
+
| 1f85a75f | ExtractLargestObject | **Object extraction** |
|
| 19 |
+
| 22168020 | ConnectSameColorH | **Connect** |
|
| 20 |
+
| 22eb0ac0 | ConnectSameColorH | **Connect** |
|
| 21 |
+
| 23b5c85d | ExtractSmallestObject | **Object extraction** |
|
| 22 |
+
| 4347f46a | DrawBorder | **Border** |
|
| 23 |
+
| 746b3537 | tile_to_target | Tiling (new pair coverage) |
|
| 24 |
+
| be94b721 | CropToContent | Crop (new pair coverage) |
|
| 25 |
+
| ded97339 | overlay(ConnectSameColorV, ConnectSameColorH) | **Greedy stacker** |
|
| 26 |
+
| eb5a1d5d | CompressGrid | **Compress** |
|
| 27 |
+
|
| 28 |
+
**6 of the 9 new solves use new transforms.** One more uses the greedy stacker's overlay composition, and the remaining two come from existing transforms (new pair coverage).
|
| 29 |
+
|
| 30 |
+
## All 40 Solved Tasks
|
| 31 |
|
| 32 |
| Task ID | Transform | Family |
|
| 33 |
|---|---|---|
|
|
|
|
| 35 |
| 1190e5a7 | KroneckerSelfSimilarInv | Self-similar |
|
| 36 |
| 1cf80156 | CropToContent | Crop |
|
| 37 |
| 1e0a9b12 | GravityDown | Gravity |
|
| 38 |
+
| 1f85a75f | ExtractLargestObject | Object |
|
| 39 |
| 2013d3e2 | CropToContent | Crop |
|
| 40 |
+
| 22168020 | ConnectSameColorH | Connect |
|
| 41 |
+
| 22eb0ac0 | ConnectSameColorH | Connect |
|
| 42 |
| 239be575 | tile_to_target | Tiling |
|
| 43 |
+
| 23b5c85d | ExtractSmallestObject | Object |
|
| 44 |
| 28bf18c6 | CropToContent | Crop |
|
| 45 |
| 2dee498d | tile_to_target | Tiling |
|
| 46 |
| 3906de3d | GravityUp | Gravity |
|
| 47 |
| 3af2c5a8 | MirrorTileH | Mirror |
|
| 48 |
| 3c9b0459 | Rotate_180 | Rotation |
|
| 49 |
+
| 4347f46a | DrawBorder | Border |
|
| 50 |
| 6150a2bd | Rotate_180 | Rotation |
|
| 51 |
| 62c24649 | MirrorTile4Way | Mirror |
|
| 52 |
| 67a3c6ac | Reflect_v | Reflection |
|
|
|
|
| 54 |
| 68b16354 | Reflect_h | Reflection |
|
| 55 |
| 6d0aefbc | MirrorTileH | Mirror |
|
| 56 |
| 6fa7a44f | MirrorTileV | Mirror |
|
| 57 |
+
| 746b3537 | tile_to_target | Tiling |
|
| 58 |
| 74dd1130 | Transpose | Transpose |
|
| 59 |
| 7b7f7511 | tile_to_target | Tiling |
|
| 60 |
| 8be77c9e | MirrorTileV | Mirror |
|
| 61 |
| 9172f3a0 | Upscale_3x | Upscale |
|
| 62 |
| 9dfd6313 | Transpose | Transpose |
|
| 63 |
| a416b8f3 | tile_to_target | Tiling |
|
| 64 |
+
| be94b721 | CropToContent | Crop |
|
| 65 |
| c59eb873 | Upscale_2x | Upscale |
|
| 66 |
| c9e6f938 | MirrorTileH | Mirror |
|
| 67 |
| d10ecb37 | tile_to_target | Tiling |
|
| 68 |
+
| d631b094 | ExtractLargestObject | Object |
|
| 69 |
| d9fac9be | tile_to_target | Tiling |
|
| 70 |
+
| de1cd16c | KeepSmallestObject | Object |
|
| 71 |
+
| ded97339 | overlay(ConnectSameColorV, ConnectSameColorH) | Stacker |
|
| 72 |
+
| eb5a1d5d | CompressGrid | Compress |
|
| 73 |
| ed36ccf7 | Rotate_90 | Rotation |
|
| 74 |
|
| 75 |
+
## Architecture
|
| 76 |
|
| 77 |
+
### 33 Atomic Transforms
|
| 78 |
+
- **Tiling**: tile_to_target, ShiftedTile, FillEnclosedHarmonic
|
| 79 |
+
- **Self-similar**: KroneckerSelfSimilar, KroneckerSelfSimilarInv
|
| 80 |
+
- **Mirror**: MirrorTileH, MirrorTileV, MirrorTile4Way
|
| 81 |
+
- **Upscale**: Upscale_2x, Upscale_3x
|
| 82 |
+
- **Stack**: StackH_3, StackV_3
|
| 83 |
+
- **Structural**: Transpose, CropToContent
|
| 84 |
+
- **Object**: ExtractLargest/Smallest/Unique/MostCommon, KeepLargest/Smallest, SortBySize
|
| 85 |
+
- **Fill/Connect**: FillInterior, ConnectSameColorH/V
|
| 86 |
+
- **Compress**: CompressGrid, RemoveBlackLines
|
| 87 |
+
- **Spatial**: ColorByProximity, DrawBorder
|
| 88 |
+
- **Symmetry**: Rotate_90/180/270, Reflect_h/v
|
| 89 |
+
- **Gravity**: GravityDown, GravityUp (optional)
|
| 90 |
+
- **Color**: InvertColors (optional)
|
| 91 |
|
| 92 |
+
### Greedy Stacker
|
| 93 |
+
After beam search, the solver tries `overlay(T1(input), T2(input))` for the top-N depth-1 pieces. This composes two independent transforms, which is critical for tasks where the output combines two views of the input (for example, ded97339 is solved by `overlay(ConnectSameColorV, ConnectSameColorH)`).
|
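The overlay step described above can be sketched as follows. This is a minimal illustration, not the solver's actual code: `overlay`, `greedy_stack`, and the simplified `connect_same_color_h` are hypothetical stand-ins, and the sketch assumes that color 0 is the background and that the second argument of `overlay` is painted on top of the first.

```python
from itertools import permutations
from typing import Callable, List, Optional, Tuple

import numpy as np

Grid = np.ndarray
Transform = Callable[[Grid], Grid]


def overlay(bottom: Grid, top: Grid, background: int = 0) -> Grid:
    """Paint the non-background cells of `top` over `bottom` (assumed semantics)."""
    return np.where(top != background, top, bottom)


def connect_same_color_h(grid: Grid, background: int = 0) -> Grid:
    """Simplified stand-in: fill background cells between same-colored cells in a row."""
    out = grid.copy()
    for r in range(grid.shape[0]):
        for color in np.unique(grid[r]):
            if color == background:
                continue
            cols = np.flatnonzero(grid[r] == color)
            if cols.size >= 2:
                segment = out[r, cols[0]:cols[-1] + 1]  # view into `out`
                segment[segment == background] = color
    return out


def connect_same_color_v(grid: Grid, background: int = 0) -> Grid:
    """Vertical variant, implemented by transposing the horizontal one."""
    return connect_same_color_h(grid.T, background).T


def greedy_stack(
    pairs: List[Tuple[Grid, Grid]],
    candidates: List[Tuple[str, Transform]],
) -> Optional[str]:
    """Return the first ordered pair whose overlay reproduces every training output."""
    for (name_a, t_a), (name_b, t_b) in permutations(candidates, 2):
        solved = True
        for inp, expected_out in pairs:
            a, b = t_a(inp), t_b(inp)
            if not (a.shape == b.shape == expected_out.shape):
                solved = False
                break
            if not np.array_equal(overlay(a, b), expected_out):
                solved = False
                break
        if solved:
            return f"overlay({name_a}, {name_b})"
    return None


# Toy task: one horizontal pair of 3s and one vertical pair of 2s.
inp = np.zeros((5, 5), dtype=int)
inp[1, 0] = inp[1, 4] = 3
inp[0, 2] = inp[4, 2] = 2

expected = np.zeros((5, 5), dtype=int)
expected[:, 2] = 2   # vertical connection
expected[1, :] = 3   # horizontal connection, drawn on top at the crossing

result = greedy_stack(
    [(inp, expected)],
    [("ConnectSameColorH", connect_same_color_h),
     ("ConnectSameColorV", connect_same_color_v)],
)
print(result)
```

In this toy task neither transform alone reproduces the output, and only one of the two overlay orders matches at the crossing cell, which is why the sketch enumerates ordered pairs rather than unordered ones.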
| 94 |
|
| 95 |
+
### Object Layer (`object_layer.py`)
|
| 96 |
+
Provides connected-component extraction (4- and 8-connectivity), color splitting, list reducers (largest/smallest/most_common/unique), spatial queries (bbox, center, height, width), and composition (paint/underpaint/overlay/cover/canvas).
|
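As a rough illustration of the first pieces listed above, the following sketch extracts same-colored connected components under 4- or 8-connectivity and applies a size reducer and a bounding-box crop. It does not reproduce `object_layer.py`: the function names, the choice to group only cells of the same color, and the background color 0 are all assumptions made for this example.

```python
from collections import deque
from typing import List, Set, Tuple

import numpy as np

Cell = Tuple[int, int]


def connected_components(grid: np.ndarray, connectivity: int = 4,
                         background: int = 0) -> List[Set[Cell]]:
    """Group non-background cells of the same color into components (BFS flood fill)."""
    if connectivity == 4:
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    elif connectivity == 8:
        offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                   if (dr, dc) != (0, 0)]
    else:
        raise ValueError("connectivity must be 4 or 8")

    height, width = grid.shape
    seen = np.zeros_like(grid, dtype=bool)
    components: List[Set[Cell]] = []
    for r in range(height):
        for c in range(width):
            if seen[r, c] or grid[r, c] == background:
                continue
            color = grid[r, c]
            seen[r, c] = True
            queue = deque([(r, c)])
            cells: Set[Cell] = set()
            while queue:
                cr, cc = queue.popleft()
                cells.add((cr, cc))
                for dr, dc in offsets:
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < height and 0 <= nc < width
                            and not seen[nr, nc] and grid[nr, nc] == color):
                        seen[nr, nc] = True
                        queue.append((nr, nc))
            components.append(cells)
    return components


def bbox(cells: Set[Cell]) -> Tuple[int, int, int, int]:
    """Return (top, left, bottom, right), all inclusive."""
    rows = [r for r, _ in cells]
    cols = [c for _, c in cells]
    return min(rows), min(cols), max(rows), max(cols)


def extract(grid: np.ndarray, cells: Set[Cell], background: int = 0) -> np.ndarray:
    """Crop one component to its bounding box, blanking everything else."""
    top, left, bottom, right = bbox(cells)
    out = np.full((bottom - top + 1, right - left + 1), background, dtype=grid.dtype)
    for r, c in cells:
        out[r - top, c - left] = grid[r, c]
    return out


grid = np.array([
    [1, 1, 0, 0],
    [1, 0, 0, 2],
    [0, 1, 0, 0],
])

comps4 = connected_components(grid, connectivity=4)
comps8 = connected_components(grid, connectivity=8)
largest4 = extract(grid, max(comps4, key=len))   # "largest" reducer
smallest8 = extract(grid, min(comps8, key=len))  # "smallest" reducer
print(len(comps4), len(comps8))
```

The cell at row 2, column 1 touches the upper-left shape only diagonally, so it forms its own component under 4-connectivity and merges with that shape under 8-connectivity.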
| 97 |
|
| 98 |
## Parameters
|
| 99 |
```json
|