Update results: 47/400 (11.8%); ITT solves 16, DSL solves 31, +7 new from ITT physics
arc_results/RESULTS.md CHANGED (+47 -98)
|
@@ -1,117 +1,66 @@
|
|
| 1 |
# PEMF Solver — ARC-AGI Training Set Evaluation
|
| 2 |
|
| 3 |
-
## Results (
|
| 4 |
|
| 5 |
-
| Metric | v1 | **
|
| 6 |
|---|---|---|---|
|
| 7 |
-
| **Tasks solved** | 31 (7.8%) |
|
| 8 |
-
|
|
| 9 |
-
|
|
| 10 |
-
| Transforms | 19 |
|
| 11 |
-
|
|
| 12 |
-
|
| 13 |
|
| 14 |
-
##
|
| 15 |
|
| 16 |
-
| Task ID |
|
| 17 |
|---|---|---|
|
| 18 |
-
|
|
| 19 |
-
|
|
| 20 |
-
|
|
| 21 |
-
|
|
| 22 |
-
|
|
| 23 |
-
|
|
| 24 |
-
|
|
| 25 |
-
| ded97339 | overlay(ConnectSameColorV, ConnectSameColorH) | **Greedy stacker** |
|
| 26 |
-
| eb5a1d5d | CompressGrid | **Compress** |
|
| 27 |
|
| 28 |
-
**
|
| 29 |
-
|
| 30 |
-
## All 40 Solved Tasks
|
| 31 |
-
|
| 32 |
-
| Task ID | Transform | Family |
|
| 33 |
-
|---|---|---|
|
| 34 |
-
| 007bbfb7 | KroneckerSelfSimilar | Self-similar |
|
| 35 |
-
| 1190e5a7 | KroneckerSelfSimilarInv | Self-similar |
|
| 36 |
-
| 1cf80156 | CropToContent | Crop |
|
| 37 |
-
| 1e0a9b12 | GravityDown | Gravity |
|
| 38 |
-
| 1f85a75f | ExtractLargestObject | Object |
|
| 39 |
-
| 2013d3e2 | CropToContent | Crop |
|
| 40 |
-
| 22168020 | ConnectSameColorH | Connect |
|
| 41 |
-
| 22eb0ac0 | ConnectSameColorH | Connect |
|
| 42 |
-
| 239be575 | tile_to_target | Tiling |
|
| 43 |
-
| 23b5c85d | ExtractSmallestObject | Object |
|
| 44 |
-
| 28bf18c6 | CropToContent | Crop |
|
| 45 |
-
| 2dee498d | tile_to_target | Tiling |
|
| 46 |
-
| 3906de3d | GravityUp | Gravity |
|
| 47 |
-
| 3af2c5a8 | MirrorTileH | Mirror |
|
| 48 |
-
| 3c9b0459 | Rotate_180 | Rotation |
|
| 49 |
-
| 4347f46a | DrawBorder | Border |
|
| 50 |
-
| 6150a2bd | Rotate_180 | Rotation |
|
| 51 |
-
| 62c24649 | MirrorTile4Way | Mirror |
|
| 52 |
-
| 67a3c6ac | Reflect_v | Reflection |
|
| 53 |
-
| 67e8384a | MirrorTile4Way | Mirror |
|
| 54 |
-
| 68b16354 | Reflect_h | Reflection |
|
| 55 |
-
| 6d0aefbc | MirrorTileH | Mirror |
|
| 56 |
-
| 6fa7a44f | MirrorTileV | Mirror |
|
| 57 |
-
| 746b3537 | tile_to_target | Tiling |
|
| 58 |
-
| 74dd1130 | Transpose | Transpose |
|
| 59 |
-
| 7b7f7511 | tile_to_target | Tiling |
|
| 60 |
-
| 8be77c9e | MirrorTileV | Mirror |
|
| 61 |
-
| 9172f3a0 | Upscale_3x | Upscale |
|
| 62 |
-
| 9dfd6313 | Transpose | Transpose |
|
| 63 |
-
| a416b8f3 | tile_to_target | Tiling |
|
| 64 |
-
| be94b721 | CropToContent | Crop |
|
| 65 |
-
| c59eb873 | Upscale_2x | Upscale |
|
| 66 |
-
| c9e6f938 | MirrorTileH | Mirror |
|
| 67 |
-
| d10ecb37 | tile_to_target | Tiling |
|
| 68 |
-
| d631b094 | ExtractLargestObject | Object |
|
| 69 |
-
| d9fac9be | tile_to_target | Tiling |
|
| 70 |
-
| de1cd16c | KeepSmallestObject | Object |
|
| 71 |
-
| ded97339 | overlay(ConnectSameColorV, ConnectSameColorH) | Stacker |
|
| 72 |
-
| eb5a1d5d | CompressGrid | Compress |
|
| 73 |
-
| ed36ccf7 | Rotate_90 | Rotation |
|
| 74 |
|
| 75 |
## Architecture
|
| 76 |
|
| 77 |
-
###
|
| 78 |
-
|
| 79 |
-
|
| 80 |
-
|
| 81 |
-
- **Upscale**: Upscale_2x, Upscale_3x
|
| 82 |
-
- **Stack**: StackH_3, StackV_3
|
| 83 |
-
- **Structural**: Transpose, CropToContent
|
| 84 |
-
- **Object**: ExtractLargest/Smallest/Unique/MostCommon, KeepLargest/Smallest, SortBySize
|
| 85 |
-
- **Fill/Connect**: FillInterior, ConnectSameColorH/V
|
| 86 |
-
- **Compress**: CompressGrid, RemoveBlackLines
|
| 87 |
-
- **Spatial**: ColorByProximity, DrawBorder
|
| 88 |
-
- **Symmetry**: Rotate_90/180/270, Reflect_h/v
|
| 89 |
-
- **Gravity**: GravityDown, GravityUp (optional)
|
| 90 |
-
- **Color**: InvertColors (optional)
|
| 91 |
|
| 92 |
-
###
|
| 93 |
-
|
| 94 |
|
| 95 |
-
###
|
| 96 |
-
|
| 97 |
-
|
| 98 |
-
|
| 99 |
-
|
| 100 |
-
|
| 101 |
-
|
| 102 |
-
|
| 103 |
-
|
| 104 |
-
|
| 105 |
-
"use_symmetry": true,
|
| 106 |
-
"use_gravity": true,
|
| 107 |
-
"use_color_ops": true,
|
| 108 |
-
"boundary_source": "target"
|
| 109 |
-
}
|
| 110 |
-
```
|
| 111 |
|
| 112 |
## How to reproduce
|
| 113 |
```bash
|
| 114 |
git clone https://github.com/fchollet/ARC-AGI.git /tmp/arc
|
| 115 |
cp -r /tmp/arc/data/training arc_data/training
|
| 116 |
-
python scripts/run_all_arc.py
|
| 117 |
```
|
| 1 |
# PEMF Solver — ARC-AGI Training Set Evaluation
|
| 2 |
|
| 3 |
+
## Results (v3 — ITT physics + DSL beam)
|
| 4 |
|
| 5 |
+
| Metric | v1 | v2 | **v3** |
|
| 6 |
|---|---|---|---|
|
| 7 |
+
| **Tasks solved** | 31 (7.8%) | 40 (10.0%) | **47 (11.8%)** |
|
| 8 |
+
| via ITT | — | — | **16** |
|
| 9 |
+
| via DSL | 31 | 40 | **31** |
|
| 10 |
+
| Transforms (DSL) | 19 | 33 | 33 |
|
| 11 |
+
| ITT rule types | — | — | 7 |
|
| 12 |
+
| Total time | 17s | 51s | **36s** |
|
| 13 |
+
| Regressions | — | 0 | **0** |
|
| 14 |
|
| 15 |
+
## 7 Newly Solved Tasks (ITT-exclusive)
|
| 16 |
|
| 17 |
+
| Task ID | ITT Rule Type | What it does |
|
| 18 |
|---|---|---|
|
| 19 |
+
| 0d3d703e | recolor | Direct color substitution |
|
| 20 |
+
| 868de0fa | multi_region_fill | Frame size → fill color mapping |
|
| 21 |
+
| 8d5021e8 | tile | Tile with learned reflection pattern |
|
| 22 |
+
| b1948b0a | recolor | Color remap |
|
| 23 |
+
| c0f76784 | multi_region_fill | Frame size → fill color with fallback |
|
| 24 |
+
| c8f0f002 | recolor | Color substitution |
|
| 25 |
+
| d511f180 | recolor | Color remap |
|
| 26 |
|
| 27 |
+
**None of these 7 tasks was solvable by the DSL beam search.** Each requires analyzing all training pairs *together* to learn the rule, which the ITT engine does via `TransformationRule.learn()`.
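
To illustrate what learning a rule from the pairs *together* can mean for the `recolor` rule type, here is a minimal sketch. It is not the code behind `TransformationRule.learn()`; the helper names `learn_color_map` and `apply_color_map` are hypothetical, and leaving unseen colors unchanged is an assumption. The map is accumulated across every training pair and rejected as soon as two pairs disagree.

```python
from typing import Dict, List, Optional, Tuple

Grid = List[List[int]]


def learn_color_map(pairs: List[Tuple[Grid, Grid]]) -> Optional[Dict[int, int]]:
    """Return one color map consistent with every pair, or None if none exists."""
    mapping: Dict[int, int] = {}
    for inp, out in pairs:
        # A pure recolor never changes the grid shape.
        if len(inp) != len(out) or any(len(a) != len(b) for a, b in zip(inp, out)):
            return None
        for row_in, row_out in zip(inp, out):
            for a, b in zip(row_in, row_out):
                # Record the first sighting; any later conflict rejects the rule.
                if mapping.setdefault(a, b) != b:
                    return None
    return mapping


def apply_color_map(grid: Grid, mapping: Dict[int, int]) -> Grid:
    # Assumption: colors never seen in training are left unchanged.
    return [[mapping.get(c, c) for c in row] for row in grid]


# Color 8 appears only in the second pair, so no single pair suffices.
pairs = [
    ([[1, 2], [3, 1]], [[5, 6], [4, 5]]),
    ([[2, 2], [8, 3]], [[6, 6], [9, 4]]),
]
color_map = learn_color_map(pairs)
prediction = apply_color_map([[8, 1], [2, 3]], color_map)
```

In this toy data the second pair contributes the mapping for color 8, which is why the pairs have to be read jointly rather than one at a time.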
|
| 28 |
|
| 29 |
## Architecture
|
| 30 |
|
| 31 |
+
### ITT-first, DSL-fallback
|
| 32 |
+
1. **ITT solver** runs first: it learns a rule from the training pairs using σ-analysis, the ρ_q boundary charge, and field invariants
|
| 33 |
+
2. If ITT achieves σ=0 on ALL training pairs → use it (16 tasks)
|
| 34 |
+
3. Otherwise → **DSL beam search** with 33 transforms + greedy stacker (31 tasks)
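
The three steps above can be sketched as a small dispatcher. This is a hedged illustration, not the contents of `scripts/run_all_arc_v3.py`: `solve`, `sigma`, `toy_itt_learn`, and `toy_dsl_search` are hypothetical names, and σ is approximated here as a plain count of mismatched cells.

```python
from typing import Callable, List, Optional, Tuple

Grid = List[List[int]]
Pair = Tuple[Grid, Grid]
Rule = Callable[[Grid], Grid]
Learner = Callable[[List[Pair]], Optional[Rule]]


def sigma(predicted: Grid, target: Grid) -> int:
    """Stand-in residue: number of mismatched cells (assumption, not the ITT definition)."""
    if len(predicted) != len(target) or any(
        len(p) != len(t) for p, t in zip(predicted, target)
    ):
        return sum(len(t) for t in target) or 1  # shape mismatch is never zero
    return sum(p != t for rp, rt in zip(predicted, target) for p, t in zip(rp, rt))


def solve(pairs: List[Pair], test_input: Grid, itt_learn: Learner, dsl_search: Learner):
    # Steps 1-2: accept the ITT rule only if it reproduces ALL training pairs.
    rule = itt_learn(pairs)
    if rule is not None and all(sigma(rule(i), o) == 0 for i, o in pairs):
        return "itt", rule(test_input)
    # Step 3: otherwise fall back to the DSL search.
    program = dsl_search(pairs)
    if program is not None:
        return "dsl", program(test_input)
    return "unsolved", None


def toy_itt_learn(pairs: List[Pair]) -> Optional[Rule]:
    # Hypothetical learner that always proposes a horizontal mirror.
    return lambda g: [row[::-1] for row in g]


def toy_dsl_search(pairs: List[Pair]) -> Optional[Rule]:
    # Hypothetical one-transform "DSL": transpose.
    candidates = [lambda g: [list(r) for r in zip(*g)]]
    for prog in candidates:
        if all(prog(i) == o for i, o in pairs):
            return prog
    return None


mirror_task = [([[1, 2]], [[2, 1]])]
transpose_task = [([[1, 2], [3, 4]], [[1, 3], [2, 4]])]
r1 = solve(mirror_task, [[7, 8, 9]], toy_itt_learn, toy_dsl_search)
r2 = solve(transpose_task, [[5, 6]], toy_itt_learn, toy_dsl_search)
```

Gating on a zero residue over every training pair is what keeps the first stage from overriding the second with a rule that only partially fits.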
|
| 35 |
|
| 36 |
+
### ITT Physics Engine (`itt_engine.py`)
|
| 37 |
+
- **PhiField**: dual-field Φ_q (int semantics) + Φ̃ (smooth float operators)
|
| 38 |
+
- **ρ_q = |∇(∇²Φ̃)|**: boundary charge with physics-derived threshold (μ+1.5σ)
|
| 39 |
+
- **SigmaResidue**: classifies transformations as fill/expansion/compression/recolor/erase
|
| 40 |
+
- **Fan Signature**: 6-bit [Δ₁..Δ₆] task routing
|
| 41 |
+
- **TransformationRule.learn()**: learns tile_pattern, size_to_color, frame_to_fill, color_map, shape_to_color from training pairs
|
| 42 |
+
- **FieldInvariants**: enclosed mask, frame components, shape eigenspectrum, Fourier period detection
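
The boundary-charge bullet above, ρ_q = |∇(∇²Φ̃)| thresholded at μ+1.5σ, can be sketched in plain Python. This is an illustration under stated assumptions and not the code in `itt_engine.py`: the 5-point Laplacian, central differences, replicated edges, and population standard deviation are all choices made here, the function names are hypothetical, and the smoothing that produces Φ̃ is omitted.

```python
from statistics import mean, pstdev
from typing import List

Field = List[List[float]]


def _at(f: Field, r: int, c: int) -> float:
    # Clamp indices so border cells reuse their nearest neighbour (replicated edges).
    r = min(max(r, 0), len(f) - 1)
    c = min(max(c, 0), len(f[0]) - 1)
    return f[r][c]


def laplacian(f: Field) -> Field:
    # 5-point stencil: sum of the four neighbours minus four times the centre.
    return [
        [
            _at(f, r - 1, c) + _at(f, r + 1, c) + _at(f, r, c - 1) + _at(f, r, c + 1)
            - 4 * f[r][c]
            for c in range(len(f[0]))
        ]
        for r in range(len(f))
    ]


def grad_magnitude(f: Field) -> Field:
    out = []
    for r in range(len(f)):
        row = []
        for c in range(len(f[0])):
            dy = (_at(f, r + 1, c) - _at(f, r - 1, c)) / 2  # central difference, rows
            dx = (_at(f, r, c + 1) - _at(f, r, c - 1)) / 2  # central difference, cols
            row.append((dx * dx + dy * dy) ** 0.5)
        out.append(row)
    return out


def boundary_charge(phi: Field) -> Field:
    """rho_q = |grad(laplacian(phi))| on the grid."""
    return grad_magnitude(laplacian(phi))


def boundary_mask(phi: Field, k: float = 1.5) -> List[List[bool]]:
    rho = boundary_charge(phi)
    flat = [v for row in rho for v in row]
    threshold = mean(flat) + k * pstdev(flat)  # mu + 1.5 sigma by default
    return [[v > threshold for v in row] for row in rho]


# A single bright cell in an otherwise empty 7x7 field.
spike = [[0.0] * 7 for _ in range(7)]
spike[3][3] = 1.0
mask = boundary_mask(spike)
```

Because the threshold is derived from the field's own statistics, a constant field yields no boundary cells at all in this sketch.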
|
| 43 |
|
| 44 |
+
### ITT Rule Types
|
| 45 |
+
| Rule | Description |
|
| 46 |
+
|---|---|
|
| 47 |
+
| tile | Tile with learned per-block reflection/rotation pattern |
|
| 48 |
+
| self_tile | Kronecker: input nonzero mask as meta-layout |
|
| 49 |
+
| fill_enclosed | Fill enclosed regions via ρ_q boundary detection |
|
| 50 |
+
| multi_region_fill | Frame size → fill color with fallback chain |
|
| 51 |
+
| periodic_extension | Fourier period detection + color remap + tile |
|
| 52 |
+
| shape_indicator | Laplacian eigenspectrum → output color |
|
| 53 |
+
| recolor | Direct color substitution learned from pairs |
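
As an example of one row in the table above, the `self_tile` rule ("input nonzero mask as meta-layout") can be sketched as follows. The function below is a hypothetical illustration of the Kronecker idea and not the engine's implementation: each nonzero cell of the input selects a block of the output that receives a full copy of the input, while zero cells leave their block empty.

```python
from typing import List

Grid = List[List[int]]


def self_tile(grid: Grid) -> Grid:
    """Place a copy of `grid` wherever `grid` itself is nonzero (Kronecker-style)."""
    h, w = len(grid), len(grid[0])
    out = [[0] * (w * w) for _ in range(h * h)]
    for block_r in range(h):
        for block_c in range(w):
            if grid[block_r][block_c] == 0:
                continue  # zero cell in the mask: this block stays empty
            for r in range(h):
                for c in range(w):
                    out[block_r * h + r][block_c * w + c] = grid[r][c]
    return out


tiled = self_tile([[1, 0], [1, 1]])
# tiled is a 4x4 grid: three blocks hold a copy of the input, the top-right block is empty.
```

An h×w input therefore produces an h²×w² output, which matches the size change seen in self-similar tasks such as 007bbfb7 (3×3 to 9×9).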
|
| 54 |
|
| 55 |
## How to reproduce
|
| 56 |
```bash
|
| 57 |
git clone https://github.com/fchollet/ARC-AGI.git /tmp/arc
|
| 58 |
cp -r /tmp/arc/data/training arc_data/training
|
| 59 |
+
python scripts/run_all_arc.py # DSL only (40 tasks)
|
| 60 |
+
python scripts/run_all_arc_v3.py # ITT + DSL (47 tasks)
|
| 61 |
```
|
| 62 |
+
|
| 63 |
+
## References
|
| 64 |
+
- Original ITT solver: [Sensei-Intent-Tensor/0.0_ARC_AGI](https://github.com/Sensei-Intent-Tensor/0.0_ARC_AGI)
|
| 65 |
+
- ITT physics: [Sensei-Intent-Tensor/0.0._Executable_Physics](https://github.com/Sensei-Intent-Tensor/0.0._Executable_Physics)
|
| 66 |
+
- Zenodo: [https://zenodo.org/records/18077258](https://zenodo.org/records/18077258)
|