Update SKILL.md: v5.2 structure, new solvers, no excluded tasks, current scores
SKILL.md
CHANGED
|
@@ -32,8 +32,8 @@ Research → Design → Experiment → Analyze → Research → ...
|
|
| 32 |
## Quick Reference
|
| 33 |
|
| 34 |
- **Repo**: `rogermt/neurogolf-solver`
|
| 35 |
-
- **Current version**: v5 β
|
| 36 |
-
- **Previous best**: v4.3 (50 arc-gen-validated tasks, est LB ~670)
|
| 37 |
- **Kaggle runtime**: 12 hours for submission
|
| 38 |
- **Target**: 3000+ LB (our own solver, no blending)
|
| 39 |
- **Detailed history, mistakes, analysis**: see `LEARNING.md`
|
|
@@ -48,7 +48,7 @@ Research → Design → Experiment → Analyze → Research → ...
|
|
| 48 |
| Max file size | 1.44 MB per model |
|
| 49 |
| Banned ops | Loop, Scan, NonZero, Unique, Script, Function |
|
| 50 |
| Scoring | `max(1.0, 25.0 - ln(MACs + memory + params))` per task |
|
| 51 |
-
|
|
| 52 |
| Validation | Models checked against **train + test + arc-gen** (ALL splits) |
|
| 53 |
| Submission | `submission.zip` with `task001.onnx` to `task400.onnx` + optional `submission.csv` |
|
| 54 |
|
|
@@ -64,10 +64,10 @@ Research → Design → Experiment → Analyze → Research → ...
|
|
| 64 |
|
| 65 |
## 3. Architecture
|
| 66 |
|
| 67 |
-
### Package Structure (v5)
|
| 68 |
```
|
| 69 |
neurogolf_solver/
|
| 70 |
-
├── constants.py # Grid dims, opset, excluded tasks
|
| 71 |
├── config.py # Runtime providers, opset factory
|
| 72 |
├── data_loader.py # Task loading, one-hot, example extraction
|
| 73 |
├── validators.py # Model validation against all splits
|
|
@@ -78,9 +78,12 @@ neurogolf_solver/
|
|
| 78 |
├── main.py # Entry point with argparse
|
| 79 |
├── solvers/
|
| 80 |
├── analytical.py # identity, constant, color_map, transpose
|
| 81 |
-
├── geometric.py # flip, rotate, shift, crop, gravity
|
| 82 |
├── tiling.py # tile, upscale, mirror, concat, spatial_gather
|
| 83 |
-
├── conv.py # lstsq conv (fixed, variable, diffshape, var_diff)
|
|
|
|
|
|
|
|
|
|
| 84 |
└── solver_registry.py # ANALYTICAL_SOLVERS list + solve_task()
|
| 85 |
```
|
| 86 |
|
|
@@ -92,13 +95,14 @@ Run with: `python -m neurogolf_solver.main [args]`
|
|
| 92 |
identity → constant → color_map → transpose → flip → rotate →
|
| 93 |
shift → tile → upscale → kronecker → nonuniform_scale →
|
| 94 |
mirror_h → mirror_v → quad_mirror → concat → concat_enhanced →
|
| 95 |
-
diagonal_tile → fixed_crop → spatial_gather → varshape_spatial_gather
|
| 96 |
-
|
| 97 |
-
|
| 98 |
-
|
| 99 |
-
|
| 100 |
-
|
| 101 |
-
|
|
|
|
| 102 |
```
|
| 103 |
|
| 104 |
### ONNX Building Rules (opset 17)
|
|
@@ -111,59 +115,69 @@ Run with: `python -m neurogolf_solver.main [args]`
|
|
| 111 |
- **ReduceSum** with axes as **tensor input** (opset 13+ requirement)
|
| 112 |
- **Pad** with tensor-based `pads` input (opset 11+ requirement)
|
| 113 |
- **lstsq calls** must be wrapped in `try/except (LinAlgError, ValueError)`, since SVD can fail to converge
|
|
|
|
| 114 |
|
| 115 |
-
### Conv Fitting
|
| 116 |
-
|
| 117 |
-
**We solve 307 locally but only ~50 survive arc-gen. This is CATASTROPHIC overfitting.**
|
| 118 |
|
| 119 |
-
|
| 120 |
-
|
| 121 |
-
- **Double descent**: ks=5,7,9 are at/near interpolation threshold where test error PEAKS
|
| 122 |
|
| 123 |
-
**Current fitting strategy (v5):**
|
| 124 |
-
-
|
|
|
|
| 125 |
- Kernel sizes: [1,3,5,7,9,11,13,15,17,19,21,23,25,27,29]
|
| 126 |
- Try no-bias first, then bias
|
| 127 |
- lstsq wrapped in try/except for SVD non-convergence
|
| 128 |
- **Validate against arc-gen BEFORE accepting**; reject the model if validation fails
|
| 129 |
|
| 130 |
-
|
| 131 |
-
|
| 132 |
-
|
| 133 |
-
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 134 |
|
| 135 |
-
**
|
| 136 |
-
-
|
| 137 |
-
-
|
| 138 |
-
-
|
| 139 |
-
- 🔲 Gradient descent with early stopping: implicit regularization, don't interpolate
|
| 140 |
|
| 141 |
## 4. Performance
|
| 142 |
|
| 143 |
-
**The lstsq conv solver is the speed bottleneck.** Use `--conv_budget` to cap time per task (
|
| 144 |
|
| 145 |
**Do NOT** try to GPU-accelerate lstsq. The bottleneck is algorithmic (O(n³) SVD), not the device.
|
| 146 |
|
| 147 |
-
## 5. Score Accounting
|
| 148 |
|
| 149 |
-
| Category | Tasks
|
| 150 |
-
|----------|-------
|
| 151 |
-
| Analytical
|
| 152 |
-
| Conv (
|
| 153 |
-
|
|
| 154 |
-
|
|
| 155 |
-
|
|
|
|
|
|
|
|
| 156 |
|
| 157 |
### Path to 3000+
|
| 158 |
-
1. ✅ ARC-GEN validation (v4
|
| 159 |
-
2. ✅ New analytical solvers
|
| 160 |
-
3. ✅
|
| 161 |
-
4. ✅
|
| 162 |
-
5. ✅
|
| 163 |
-
6. ✅
|
| 164 |
-
7.
|
| 165 |
-
8. 🔲 **
|
| 166 |
-
9. 🔲 **
|
|
|
|
| 167 |
|
| 168 |
**Blending is EXPLICITLY excluded**, per the user's competitive philosophy.
|
| 169 |
|
|
@@ -171,7 +185,7 @@ Run with: `python -m neurogolf_solver.main [args]`
|
|
| 171 |
|
| 172 |
Before submitting to Kaggle:
|
| 173 |
- [ ] All models validated against train + test + arc-gen (locally)
|
| 174 |
-
- [ ]
|
| 175 |
- [ ] No GatherElements in any model
|
| 176 |
- [ ] No banned ops
|
| 177 |
- [ ] Each .onnx < 1.44 MB
|
|
@@ -184,7 +198,7 @@ Before submitting to Kaggle:
|
|
| 184 |
| Location | Path | Notes |
|
| 185 |
|----------|------|-------|
|
| 186 |
| HF Repo | `rogermt/neurogolf-solver` | All code + data |
|
| 187 |
-
| **Solver package** | `neurogolf_solver/` | **v5 β
|
| 188 |
| Legacy monolith | `neurogolf_solver.py` | v4, kept for reference; do not edit |
|
| 189 |
| Official utils | `neurogolf_utils.py` | Kaggle scoring lib (needs onnx_tool) |
|
| 190 |
| ARC-GEN data | `ARC-GEN-100K.zip` | 400 files, 100K examples |
|
|
|
|
| 32 |
## Quick Reference
|
| 33 |
|
| 34 |
- **Repo**: `rogermt/neurogolf-solver`
|
| 35 |
+
- **Current version**: v5.2 (52 solved, ~710 score, est LB ~1058)
|
| 36 |
+
- **Previous best on Kaggle**: v4.3 (50 arc-gen-validated tasks, est LB ~670)
|
| 37 |
- **Kaggle runtime**: 12 hours for submission
|
| 38 |
- **Target**: 3000+ LB (our own solver, no blending)
|
| 39 |
- **Detailed history, mistakes, analysis**: see `LEARNING.md`
|
|
|
|
| 48 |
| Max file size | 1.44 MB per model |
|
| 49 |
| Banned ops | Loop, Scan, NonZero, Unique, Script, Function |
|
| 50 |
| Scoring | `max(1.0, 25.0 - ln(MACs + memory + params))` per task |
|
| 51 |
+
| Tasks | **All 400 count. There are NO excluded tasks.** |
|
| 52 |
| Validation | Models checked against **train + test + arc-gen** (ALL splits) |
|
| 53 |
| Submission | `submission.zip` with `task001.onnx` to `task400.onnx` + optional `submission.csv` |
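
The scoring rule in the table can be sanity-checked with a few lines of Python. This is a minimal sketch: the function name `task_score` and the choice to pass the three cost terms as plain numbers are assumptions for illustration, and the official computation remains the one in `neurogolf_utils.py`.

```python
import math

def task_score(macs: float, memory: float, params: float) -> float:
    """Per-task score: max(1.0, 25.0 - ln(MACs + memory + params))."""
    cost = macs + memory + params
    if cost <= 0:
        raise ValueError("cost must be positive")
    # The floor of 1.0 is also what an unsolved task earns.
    return max(1.0, 25.0 - math.log(cost))

# A total cost of about 16 million lands at roughly 8.4 points.
print(round(task_score(16_000_000, 0, 0), 2))
```

Because the cost enters through a natural logarithm, halving a model's cost gains only about 0.69 points, so cheap analytical solvers matter most when they replace a far more expensive conv model.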
|
| 54 |
|
|
|
|
| 64 |
|
| 65 |
## 3. Architecture
|
| 66 |
|
| 67 |
+
### Package Structure (v5.2)
|
| 68 |
```
|
| 69 |
neurogolf_solver/
|
| 70 |
+
├── constants.py # Grid dims, opset, limits (NO excluded tasks)
|
| 71 |
├── config.py # Runtime providers, opset factory
|
| 72 |
├── data_loader.py # Task loading, one-hot, example extraction
|
| 73 |
├── validators.py # Model validation against all splits
|
|
|
|
| 78 |
├── main.py # Entry point with argparse
|
| 79 |
├── solvers/
|
| 80 |
├── analytical.py # identity, constant, color_map, transpose
|
| 81 |
+
├── geometric.py # flip, rotate, shift, crop, gravity (detect only)
|
| 82 |
├── tiling.py # tile, upscale, mirror, concat, spatial_gather
|
| 83 |
+
├── conv.py # lstsq conv (fixed, variable, diffshape, var_diff) + PCR fallback
|
| 84 |
+
├── gravity.py # Unrolled bubble-sort gravity (Conv+Where, 4 dirs), Task 78
|
| 85 |
+
├── edge.py # Laplacian edge detection (0 matches currently)
|
| 86 |
+
├── mode.py # Mode fill (ReduceSum→ArgMax→Expand), Task 129
|
| 87 |
└── solver_registry.py # ANALYTICAL_SOLVERS list + solve_task()
|
| 88 |
```
|
| 89 |
|
|
|
|
| 95 |
identity → constant → color_map → transpose → flip → rotate →
|
| 96 |
shift → tile → upscale → kronecker → nonuniform_scale →
|
| 97 |
mirror_h → mirror_v → quad_mirror → concat → concat_enhanced →
|
| 98 |
+
diagonal_tile → fixed_crop → spatial_gather → varshape_spatial_gather →
|
| 99 |
+
gravity_unrolled → edge_detect → mode_fill
|
| 100 |
+
|
| 101 |
+
2. Conv solvers (lstsq fitted, validated against arc-gen, PCR fallback):
|
| 102 |
+
conv_fixed: Slice→Conv→ArgMax→Equal+Cast→Pad
|
| 103 |
+
conv_variable: Conv(30×30)→ArgMax→Equal+Cast→Mul(mask)
|
| 104 |
+
conv_diffshape: Slice→Conv→Slice(crop)→ArgMax→Equal+Cast→Pad
|
| 105 |
+
conv_var_diff: Conv(30×30)→ArgMax→Equal+Cast→Mul(input_mask)
|
| 106 |
```
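
The ordering above suggests a simple dispatch loop: try each solver in turn and keep the first candidate that passes validation. The sketch below illustrates that idea only. The names `solve_in_order`, `Solver`, and `Validator` are hypothetical, and it assumes the first validated candidate wins, which the source does not state explicitly.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical types: a solver returns a candidate model (any object) or None.
Solver = Callable[[dict], Optional[object]]
Validator = Callable[[object, dict], bool]

def solve_in_order(task: dict,
                   solvers: Sequence[Tuple[str, Solver]],
                   validate: Validator) -> Optional[Tuple[str, object]]:
    """Try solvers in list order; return the first candidate that validates."""
    for name, solver in solvers:
        candidate = solver(task)
        if candidate is None:
            continue  # this solver does not apply to the task
        if validate(candidate, task):
            return name, candidate
        # The candidate failed validation: reject it and keep trying.
    return None

# Toy usage with stand-in solvers (not the real ones).
toy_solvers = [
    ("identity", lambda task: None),
    ("constant", lambda task: "bad-model"),
    ("color_map", lambda task: "good-model"),
]
result = solve_in_order({}, toy_solvers, lambda model, task: model == "good-model")
print(result)
```

Putting the cheapest solvers first means a task that several solvers can handle is matched by the lowest-cost model.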
|
| 107 |
|
| 108 |
### ONNX Building Rules (opset 17)
|
|
|
|
| 115 |
- **ReduceSum** with axes as **tensor input** (opset 13+ requirement)
|
| 116 |
- **Pad** with tensor-based `pads` input (opset 11+ requirement)
|
| 117 |
- **lstsq calls** must be wrapped in `try/except (LinAlgError, ValueError)`, since SVD can fail to converge
|
| 118 |
+
- **ArgMax + Equal+Cast** before Pad to ensure clean one-hot in padded region (gravity solver lesson)
|
| 119 |
|
| 120 |
+
### Conv Fitting
|
|
|
|
|
|
|
| 121 |
|
| 122 |
+
**Conv ceiling: ~25 tasks.** Regularization approaches (Ridge, PCA/SVD, skip-ks) were all tested and rejected.
|
| 123 |
+
Root cause: architecture mismatch. Most unsolved tasks need non-local ops, not local conv patches.
|
|
|
|
| 124 |
|
| 125 |
+
**Current fitting strategy (v5.1+):**
|
| 126 |
+
- Composable primitives: `_build_patch_matrix` + `_solve_weights` + `_extract_weights`
|
| 127 |
+
- PCR fallback via `_solve_weights_pcr` (deferred 2nd pass, 0 new solves but no regressions)
|
| 128 |
- Kernel sizes: [1,3,5,7,9,11,13,15,17,19,21,23,25,27,29]
|
| 129 |
- Try no-bias first, then bias
|
| 130 |
- lstsq wrapped in try/except for SVD non-convergence
|
| 131 |
- **Validate against arc-gen BEFORE accepting**; reject the model if validation fails
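
The guarded fit and the no-bias-then-bias order can be sketched with NumPy as follows. This is an illustration under assumptions, not the package's code: `safe_lstsq` and `fit_no_bias_then_bias` are hypothetical names, the patch matrix is assumed to be already built, and the acceptance test (exact match after argmax) is inferred from the Conv→ArgMax graphs rather than confirmed by the source.

```python
import numpy as np

def safe_lstsq(design: np.ndarray, targets: np.ndarray):
    """Least-squares fit that returns None instead of raising on failure."""
    try:
        weights, _res, _rank, _sv = np.linalg.lstsq(design, targets, rcond=None)
    except (np.linalg.LinAlgError, ValueError):
        return None  # e.g. the SVD did not converge
    return weights

def fit_no_bias_then_bias(patches: np.ndarray, targets: np.ndarray):
    """Try a fit without a bias column first, then with one."""
    for use_bias in (False, True):
        design = patches
        if use_bias:
            ones = np.ones((patches.shape[0], 1), dtype=patches.dtype)
            design = np.hstack([patches, ones])
        weights = safe_lstsq(design, targets)
        if weights is None:
            continue
        # Assumed acceptance rule: predictions must match exactly after argmax.
        predicted = (design @ weights).argmax(axis=1)
        if np.array_equal(predicted, targets.argmax(axis=1)):
            return weights, use_bias
    return None

# Toy data: three one-hot "patches" mapped to a permutation of colors.
toy_patches = np.eye(3)
toy_targets = np.eye(3)[[1, 2, 0]]
fit = fit_no_bias_then_bias(toy_patches, toy_targets)
```

A fit accepted here would still need to pass arc-gen validation before being kept, as the last bullet requires.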
|
| 132 |
|
| 133 |
+
### New Solver Architectures (v5.2)
|
| 134 |
+
|
| 135 |
+
**gravity.py**: Unrolled bubble-sort via Conv+Where
|
| 136 |
+
- 4 directions × 10 bg colors, max(IH,IW) steps
|
| 137 |
+
- Per step: 2× Conv(3×3 shift), 3× ReduceSum, 3× Greater, 2× And, 2× Where
|
| 138 |
+
- Final: ArgMax + Equal+Cast + Pad (clean one-hot)
|
| 139 |
+
- Cost: ~16M (10×10 grid), score ~8.4
|
| 140 |
+
- **Validated: Task 78 (direction=up, bg=0)**
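
For intuition, here is a plain-Python reference of what an unrolled bubble-sort gravity pass computes for direction=up. It is a sketch of the intended result only, not the ONNX graph: the function name `gravity_up` is hypothetical, and the sequential swaps stand in for the Conv+Where steps that the real model applies to the whole grid at once.

```python
def gravity_up(grid, bg=0):
    """Move non-background cells upward using a fixed number of swap passes."""
    height = len(grid)
    width = len(grid[0])
    out = [row[:] for row in grid]
    # A fixed pass count (no data-dependent loop) mirrors an unrolled graph.
    for _ in range(max(height, width)):
        for r in range(height - 1):
            for c in range(width):
                # Swap when a background cell sits directly above a colored one.
                if out[r][c] == bg and out[r + 1][c] != bg:
                    out[r][c], out[r + 1][c] = out[r + 1][c], out[r][c]
    return out

example = gravity_up([[0, 0],
                      [1, 0],
                      [0, 2]])
print(example)
```

Each pass moves a cell at most one row, which is why the step count has to scale with the grid dimension and why the cost grows quickly with grid size.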
|
| 141 |
+
|
| 142 |
+
**edge.py**: Laplacian conv boundary detection
|
| 143 |
+
- Conv 1×1 (channel collapse) → Conv 3×3 (Laplacian) → Abs → Greater → And → Where
|
| 144 |
+
- Cost: ~16K MACs, score ~15
|
| 145 |
+
- **0 matches currently**; the edge definition may be too strict
|
| 146 |
|
| 147 |
+
**mode.py**: Global majority color fill
|
| 148 |
+
- Slice → ReduceSum(axes=[2,3]) → ArgMax → Equal+Cast → Expand → Pad
|
| 149 |
+
- Cost: ~2K, score ~19.5
|
| 150 |
+
- **Validated: Task 129**
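
The mode-fill chain can be mirrored in plain Python to check expected outputs by hand. This is a reference sketch under assumptions: `mode_fill` is a hypothetical name, the output is assumed to have the same shape as the input, and ties are broken toward the lowest color index, which is what ONNX ArgMax does by default.

```python
from collections import Counter

NUM_COLORS = 10  # ARC grids use color indices 0 through 9

def mode_fill(grid):
    """Fill a grid of the same shape with the most frequent input color."""
    # Counting cells per color plays the role of ReduceSum over the spatial axes.
    counts = Counter(cell for row in grid for cell in row)
    # ArgMax: highest count wins, lowest color index on ties.
    mode = max(range(NUM_COLORS), key=lambda color: (counts.get(color, 0), -color))
    # Expand: broadcast the chosen color to every cell.
    return [[mode for _ in row] for row in grid]

filled = mode_fill([[1, 2],
                    [2, 3]])
print(filled)
```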
|
|
|
|
| 151 |
|
| 152 |
## 4. Performance
|
| 153 |
|
| 154 |
+
**The lstsq conv solver is the speed bottleneck.** Use `--conv_budget` to cap time per task (5s locally, 60s on Kaggle).
|
| 155 |
|
| 156 |
**Do NOT** try to GPU-accelerate lstsq. The bottleneck is algorithmic (O(n³) SVD), not the device.
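
One way a per-task time cap like `--conv_budget` can be enforced is a deadline checked between kernel sizes. The sketch below is an assumption about the mechanism, not the package's implementation: `parse_args`, `fit_with_budget`, and the `try_fit` callback are hypothetical, and the 5-second default simply echoes the local setting mentioned above.

```python
import argparse
import time

def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    # Flag name from the text; type and default are assumptions.
    parser.add_argument("--conv_budget", type=float, default=5.0,
                        help="seconds allowed for conv fitting per task")
    return parser.parse_args(argv)

def fit_with_budget(kernel_sizes, try_fit, budget_s):
    """Try kernel sizes in order until one fits or the time budget runs out."""
    deadline = time.monotonic() + budget_s
    for ks in kernel_sizes:
        if time.monotonic() >= deadline:
            break  # out of time: give up on this task
        model = try_fit(ks)
        if model is not None:
            return model
    return None

args = parse_args(["--conv_budget", "60"])
found = fit_with_budget([1, 3, 5], lambda ks: ks if ks == 3 else None,
                        args.conv_budget)
```

The deadline is only checked between fits, so a single slow fit at a large kernel size can still overrun the budget.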
|
| 157 |
|
| 158 |
+
## 5. Score Accounting (v5.2)
|
| 159 |
|
| 160 |
+
| Category | Tasks | Avg Score | Notes |
|
| 161 |
+
|----------|-------|-----------|-------|
|
| 162 |
+
| Analytical | 24 | ~16 | identity, constant, color_map, transpose, flip, rotate, shift, tile, mirrors, etc. |
|
| 163 |
+
| Conv (lstsq) | 25 | ~10.5 | conv_fixed, conv_var, conv_diff, conv_var_diff |
|
| 164 |
+
| Gravity | 1 | 8.4 | Task 78 |
|
| 165 |
+
| Mode fill | 1 | 19.5 | Task 129 |
|
| 166 |
+
| Timing artifact | 1 | 8.2 | Task 61 (conv_var, only on slow hardware) |
|
| 167 |
+
| **Unsolved** | **348** | **1.0** | Minimum score |
|
| 168 |
+
| **Total** | **52/400** | | **~710 solved + 348 = ~1058 est LB** |
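
The totals in the table follow from simple arithmetic, reproduced here as a check. The dictionary keys are illustrative labels, and the ~710 figure is taken from the table as given rather than recomputed from the rounded averages.

```python
TOTAL_TASKS = 400
MIN_SCORE = 1.0  # score awarded to every unsolved task

solved_by_category = {
    "analytical": 24,
    "conv_lstsq": 25,
    "gravity": 1,
    "mode_fill": 1,
    "timing_artifact": 1,
}

solved = sum(solved_by_category.values())
unsolved = TOTAL_TASKS - solved
solved_points = 710  # approximate total for solved tasks, from the table
estimated_lb = solved_points + unsolved * MIN_SCORE

print(solved, unsolved, estimated_lb)
```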
|
| 169 |
|
| 170 |
### Path to 3000+
|
| 171 |
+
1. ✅ ARC-GEN validation (v4)
|
| 172 |
+
2. ✅ New analytical solvers (v4)
|
| 173 |
+
3. ✅ Opset 17 Slice-based transforms (v5)
|
| 174 |
+
4. ✅ lstsq crash fix + modular package (v5)
|
| 175 |
+
5. ✅ PCR fallback in conv (v5.1: 0 new solves but clean code)
|
| 176 |
+
6. ✅ Gravity solver (v5.2, Task 78)
|
| 177 |
+
7. ✅ Mode fill solver (v5.2, Task 129)
|
| 178 |
+
8. 🔲 **Phase 3 solvers**: flood fill, composition, color LUT, CumSum; see TODO.md
|
| 179 |
+
9. 🔲 **Phase 1a**: Opset 17 conversions for existing analytical tasks (score optimization)
|
| 180 |
+
10. 🔲 **Phase 4**: ONNX optimizer, best-of-N selection
|
| 181 |
|
| 182 |
**Blending is EXPLICITLY excluded**, per the user's competitive philosophy.
|
| 183 |
|
|
|
|
| 185 |
|
| 186 |
Before submitting to Kaggle:
|
| 187 |
- [ ] All models validated against train + test + arc-gen (locally)
|
| 188 |
+
- [ ] **All 400 tasks attempted** (no exclusions)
|
| 189 |
- [ ] No GatherElements in any model
|
| 190 |
- [ ] No banned ops
|
| 191 |
- [ ] Each .onnx < 1.44 MB
|
|
|
|
| 198 |
| Location | Path | Notes |
|
| 199 |
|----------|------|-------|
|
| 200 |
| HF Repo | `rogermt/neurogolf-solver` | All code + data |
|
| 201 |
+
| **Solver package** | `neurogolf_solver/` | **v5.2: 19 files, modular** |
|
| 202 |
| Legacy monolith | `neurogolf_solver.py` | v4, kept for reference; do not edit |
|
| 203 |
| Official utils | `neurogolf_utils.py` | Kaggle scoring lib (needs onnx_tool) |
|
| 204 |
| ARC-GEN data | `ARC-GEN-100K.zip` | 400 files, 100K examples |
|