rogermt/neurogolf-solver


rogermt committed 16 days ago

Commit d189f4f (verified) · 1 parent: 99c34bc

Update LEARNING.md: PyTorch conv results, lstsq overfitting analysis

Files changed (1)

LEARNING.md +13 -0


@@ -6,6 +6,7 @@
 | Version | Date | Tasks (arc-gen validated) | Est LB | Key Changes |
 |---------|------|--------------------------|--------|-------------|
 | v4.1 | 2026-04-24 | 50 | ~670 | Color map Gather for permutations (+15 pts) |
 | v4.0 | 2026-04-24 | 50 | ~656 | ARC-GEN validation, new analytical solvers, s_flip fix, static profiler, submission.csv |
 | v3 | 2026-04-24 | 307 (local) / ~40 (LB) | 501 | Added concat_enhanced, varshape_spatial_gather, conv_var_diff |
@@ -14,6 +15,18 @@
 ## Mistakes Log (DO NOT REPEAT)
 ### 2026-04-24: CuPy/GPU for lstsq — DOES NOT HELP
 - **What**: Swapped numpy→cupy to GPU-accelerate lstsq conv fitting
 - **Result**: GPU hit 90%, crashed on task 4 (OOM), fell back to CPU, same speed

 | Version | Date | Tasks (arc-gen validated) | Est LB | Key Changes |
 |---------|------|--------------------------|--------|-------------|
+| v4.2 | 2026-04-24 | 50 | ~670 | Added PyTorch learned conv (single+two-layer, multi-seed, ternary snap). Needs GPU. |
 | v4.1 | 2026-04-24 | 50 | ~670 | Color map Gather for permutations (+15 pts) |
 | v4.0 | 2026-04-24 | 50 | ~656 | ARC-GEN validation, new analytical solvers, s_flip fix, static profiler, submission.csv |
 | v3 | 2026-04-24 | 307 (local) / ~40 (LB) | 501 | Added concat_enhanced, varshape_spatial_gather, conv_var_diff |
 ## Mistakes Log (DO NOT REPEAT)
+### 2026-04-24: PyTorch 2-layer conv — fits training but doesn't generalize to arc-gen
+- **What**: Trained Conv→ReLU→Conv (hidden=32, ks=5,1) on train+test for task 12 (3 examples, 12×12)
+- **Result**: Train loss 8.65e-8 (perfect), train+test 3/3 pass, arc-gen 0/30 pass
+- **Root cause**: With only 3 training examples and 32×10×5×5 + 10×32×1×1 = 8320 weight parameters, the network memorizes the training examples without learning the underlying rule. This is the same overfitting failure seen with lstsq.
+- **Fix attempted**: Include arc-gen examples in training data. Too slow on CPU (23 examples × 12×12 × 5000 steps). Needs GPU.
+- **Rule**: PyTorch conv is only useful if (a) trained on arc-gen data too, AND (b) run on GPU for speed. On CPU it is impractical; stick to lstsq, which is at least fast.
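
To make the parameter-versus-data imbalance in this entry concrete, here is a minimal sketch of the arithmetic. It assumes 10 color channels (inferred from the `32×10×5×5` term), ignores biases as the log entry does, and uses a hypothetical helper `conv_weight_count` that is not part of the repository.

```python
# Sketch only: reproduces the weight count quoted in the log entry.
# COLORS = 10 is inferred from the 32x10x5x5 term; biases are ignored.

def conv_weight_count(in_ch: int, out_ch: int, kernel: int) -> int:
    """Number of weights in a square 2D convolution (no bias)."""
    return out_ch * in_ch * kernel * kernel

COLORS = 10   # one-hot color channels (assumption)
HIDDEN = 32   # hidden channels, as stated in the entry

layer1 = conv_weight_count(COLORS, HIDDEN, 5)   # Conv 5x5: 10 -> 32
layer2 = conv_weight_count(HIDDEN, COLORS, 1)   # Conv 1x1: 32 -> 10
total_weights = layer1 + layer2

# Task 12: 3 training examples on a 12x12 grid
train_cells = 3 * 12 * 12
weights_per_cell = total_weights / train_cells

print(f"weights={total_weights}, cells={train_cells}, "
      f"weights per output cell={weights_per_cell:.1f}")
```

With roughly 19 weights for every supervised output cell, a perfect training loss says little about whether the rule was learned, which is consistent with the 0/30 arc-gen result.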
+### 2026-04-24: Arc-gen in lstsq fitting exposes overfitting
+- **What**: Task 7 (7×7 grid) solved by lstsq at ks=7 with 4 base examples (P=[196×490], underdetermined). Adding 2 arc-gen examples (P=[294×490]) causes lstsq to FAIL.
+- **Root cause**: When rows < features, lstsq returns the minimum-norm solution among infinitely many perfect fits. That solution happened to work on the 4 training examples + 30 arc-gen examples by luck. Adding more constraints reveals that the pattern can't be captured by a ks=7 linear conv.
+- **Rule**: An lstsq fit that only works when underdetermined (rows < features) is likely overfitting. The arc-gen validation catches this correctly. Don't try to bypass it.
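
The rule above can be illustrated with a small synthetic experiment. This is a sketch, not the solver's code: `P` is a random Gaussian matrix with the same 196×490 shape as the task 7 patch matrix rather than real one-hot patches, and the targets are random bits. It only shows that an underdetermined `np.linalg.lstsq` fit reproduces arbitrary targets exactly, so a perfect training fit carries no evidence of generalization.

```python
import numpy as np

rng = np.random.default_rng(0)

ROWS, FEATURES = 196, 490  # same shape as P in the entry (rows < features)

# Synthetic stand-ins for the patch matrix and targets (assumption:
# random data, chosen so the targets contain no learnable rule at all).
P = rng.standard_normal((ROWS, FEATURES))
y = rng.integers(0, 2, size=ROWS).astype(float)

# For an underdetermined system, lstsq returns the minimum-norm solution.
w, _, rank, _ = np.linalg.lstsq(P, y, rcond=None)

# The fit is exact even though the targets are pure noise.
train_fit = np.allclose(P @ w, y, atol=1e-8)

# Fresh rows drawn the same way: accuracy should sit near chance.
P_new = rng.standard_normal((98, FEATURES))
y_new = rng.integers(0, 2, size=98).astype(float)
new_acc = float(np.mean((P_new @ w > 0.5) == (y_new > 0.5)))

print(f"rank={rank}, exact train fit={train_fit}, held-out accuracy={new_acc:.2f}")
```

Held-out checks such as the arc-gen validation are what separate a lucky minimum-norm solution from a rule that was actually captured.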
 ### 2026-04-24: CuPy/GPU for lstsq — DOES NOT HELP
 - **What**: Swapped numpy→cupy to GPU-accelerate lstsq conv fitting
 - **Result**: GPU hit 90%, crashed on task 4 (OOM), fell back to CPU, same speed