rogermt committed on
Commit d189f4f · verified · 1 Parent(s): 99c34bc

Update LEARNING.md: PyTorch conv results, lstsq overfitting analysis

Files changed (1)
  1. LEARNING.md +13 -0
LEARNING.md CHANGED
@@ -6,6 +6,7 @@
 
  | Version | Date | Tasks (arc-gen validated) | Est LB | Key Changes |
  |---------|------|--------------------------|--------|-------------|
  | v4.1 | 2026-04-24 | 50 | ~670 | Color map Gather for permutations (+15 pts) |
  | v4.0 | 2026-04-24 | 50 | ~656 | ARC-GEN validation, new analytical solvers, s_flip fix, static profiler, submission.csv |
  | v3 | 2026-04-24 | 307 (local) / ~40 (LB) | 501 | Added concat_enhanced, varshape_spatial_gather, conv_var_diff |
@@ -14,6 +15,18 @@
 
  ## Mistakes Log (DO NOT REPEAT)
 
  ### 2026-04-24: CuPy/GPU for lstsq — DOES NOT HELP
  - **What**: Swapped numpy→cupy to GPU-accelerate lstsq conv fitting
  - **Result**: GPU hit 90%, crashed on task 4 (OOM), fell back to CPU, same speed
 
 
  | Version | Date | Tasks (arc-gen validated) | Est LB | Key Changes |
  |---------|------|--------------------------|--------|-------------|
+ | v4.2 | 2026-04-24 | 50 | ~670 | Added PyTorch learned conv (single+two-layer, multi-seed, ternary snap). Needs GPU. |
  | v4.1 | 2026-04-24 | 50 | ~670 | Color map Gather for permutations (+15 pts) |
  | v4.0 | 2026-04-24 | 50 | ~656 | ARC-GEN validation, new analytical solvers, s_flip fix, static profiler, submission.csv |
  | v3 | 2026-04-24 | 307 (local) / ~40 (LB) | 501 | Added concat_enhanced, varshape_spatial_gather, conv_var_diff |
 
 
  ## Mistakes Log (DO NOT REPEAT)
 
+ ### 2026-04-24: PyTorch 2-layer conv fits training but doesn't generalize to arc-gen
+ - **What**: Trained Conv→ReLU→Conv (hidden=32, kernel sizes 5 and 1) on train+test for task 12 (3 examples, 12×12)
+ - **Result**: Train loss 8.65e-8 (effectively zero), train+test 3/3 pass, arc-gen 0/30 pass
+ - **Root cause**: With only 3 training examples and 32×10×5×5 + 10×32×1×1 = 8320 weights, the network memorizes the training examples without learning the underlying rule. This is the same overfitting failure seen with lstsq.
+ - **Fix attempted**: Include arc-gen examples in the training data. Too slow on CPU (23 examples × 12×12 × 5000 steps). Needs GPU.
+ - **Rule**: PyTorch conv is only useful if (a) it is trained on arc-gen data too, AND (b) it runs on GPU for speed. On CPU it is impractical; stick to lstsq, which is at least fast.
+
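The weight count in the root-cause bullet above can be checked with a few lines of arithmetic. This is only a sketch: the helper `conv2d_weight_count` is a hypothetical name, the 10 input/output channels are assumed to be a one-hot colour encoding, and biases are left out to match the figure in the log.

```python
def conv2d_weight_count(in_ch, out_ch, kernel_size):
    """Number of weights in a square 2D conv layer (biases not counted)."""
    return out_ch * in_ch * kernel_size * kernel_size

# Layer sizes taken from the log entry; the meaning of the 10 channels is an assumption.
IN_CH, HIDDEN, OUT_CH = 10, 32, 10
layer1 = conv2d_weight_count(IN_CH, HIDDEN, 5)   # 32 x 10 x 5 x 5
layer2 = conv2d_weight_count(HIDDEN, OUT_CH, 1)  # 10 x 32 x 1 x 1
total_weights = layer1 + layer2

# Supervision available for task 12: 3 examples of 12 x 12 cells each.
n_examples, height, width = 3, 12, 12
labelled_cells = n_examples * height * width

weights_per_cell = total_weights / labelled_cells
print(total_weights, labelled_cells, round(weights_per_cell, 1))  # 8320 432 19.3
```

Roughly 19 weights per labelled output cell leaves the network plenty of freedom to memorize, which is consistent with the 3/3 train+test versus 0/30 arc-gen result. The cells are not independent constraints, so the ratio is a rough indicator rather than a precise measure.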
+ ### 2026-04-24: Arc-gen in lstsq fitting exposes overfitting
+ - **What**: Task 7 (7×7 grid) was solved by lstsq at ks=7 with 4 base examples (P=[196×490], underdetermined). Adding 2 arc-gen examples (P=[294×490]) causes lstsq to FAIL.
+ - **Root cause**: When rows < features, lstsq returns the min-norm solution among infinitely many perfect fits. That solution happened to work on the 4 training examples + 30 arc-gen examples by luck. Adding more constraints reveals that the pattern can't be captured by a ks=7 linear conv.
+ - **Rule**: An lstsq fit that only works when underdetermined (rows < features) is likely overfitting. The arc-gen validation catches this correctly. Don't try to bypass it.
+
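The rule above can be illustrated with synthetic data. The sketch below is not the project's fitting code: it uses random stand-in matrices with the same shape as P=[196×490] (the 490 features are presumably 10 channels × 7×7) and random targets, so there is no rule to learn at all. An underdetermined `np.linalg.lstsq` call still fits the training rows to machine precision, and the same weights fail on unseen rows.

```python
import numpy as np

rng = np.random.default_rng(0)

n_rows, n_features = 196, 490   # shape quoted in the log: 4 examples x 49 cells, 490 features

# Hypothetical stand-in data: random patches and random targets (pure noise).
P_train = rng.standard_normal((n_rows, n_features))
y_train = rng.standard_normal(n_rows)

# With rows < features, lstsq returns the minimum-norm exact solution.
w, _, rank, _ = np.linalg.lstsq(P_train, y_train, rcond=None)
train_err = np.max(np.abs(P_train @ w - y_train))

# Unseen rows play the role of arc-gen examples here.
P_new = rng.standard_normal((98, n_features))
y_new = rng.standard_normal(98)
new_err = np.max(np.abs(P_new @ w - y_new))

print(f"rank={rank}, max train error={train_err:.2e}, max held-out error={new_err:.2f}")
```

A zero training residual therefore says nothing about whether the rule was captured when rows < features; only held-out examples such as the arc-gen set can show that.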
  ### 2026-04-24: CuPy/GPU for lstsq — DOES NOT HELP
  - **What**: Swapped numpy→cupy to GPU-accelerate lstsq conv fitting
  - **Result**: GPU hit 90%, crashed on task 4 (OOM), fell back to CPU, same speed