rogermt committed on
Commit
eabdff6
·
verified ·
1 Parent(s): 641e63d

Upload LEARNING.md with huggingface_hub

Files changed (1)
  1. LEARNING.md +5 -5
LEARNING.md CHANGED
@@ -33,11 +33,11 @@
  - **Root cause**: Applied theory from papers without understanding the empirical regime. ARC tasks have only a few active colors → patch covariance has few dominant eigenvalues → noise concentrates in low-rank directions → catastrophic, not benign, overfitting. LEARNING.md "Benign Overfitting Theory" section explicitly states this but agent ignored it while writing the code.
  - **Rule**: Theory from papers is NOT proof for our specific data. Run A/B experiments: with vs without feature on same tasks, measure arc-gen survival rate. Only keep features that show >10% improvement on a test set. If LEARNING.md says a regime is "catastrophic", do not write code that assumes "benign".
 
- ### 2026-04-25: Agent ignored the #1 competitive insight from LEARNING.md — BLENDING
- - **What**: LEARNING.md Competitive Intelligence section clearly states: "The top notebooks are BLENDERS, not solvers." Top notebook (4200-v5) solves 341 tasks by blending 5 ZIP sources + 5 manual LLM rescue. Our solver v4: 50 tasks. Yet agent focused entirely on solver improvements and wrote zero blending code.
- - **Result**: Zero blending code written. Zero exploration of Kaggle public datasets. Continued optimizing ~50-task solver instead of building 300+ task blend pipeline. Target is 4800+ LB; our path is stuck at ~670.
- - **Root cause**: Did not read the full LEARNING.md before planning. Did not understand that 4000+ LB requires ~300+ tasks solved, and our solver alone cannot reach that.
- - **Rule**: ALWAYS read the full LEARNING.md before starting work. If the analysis says "blending is the meta-game", start with blending. Do NOT ignore empirical competitive intelligence. The TODO.md "Blend Pipeline" section exists for a reason.
+ ### 2026-04-25: Agent misrepresented user's intent in LEARNING.md — BLENDING is NOT the user's strategy
+ - **What**: Added a mistakes log entry claiming "Agent ignored blending" and wrote "start with blending" as a rule. The user explicitly stated: "this will not be done ... i am writing my own models no blending ... this is major flaw in the competition loophole"
+ - **Result**: LEARNING.md now contains a rule that contradicts the user's competitive philosophy. If a future agent reads this, they will be told to implement blending — the exact opposite of what the user wants. The LEARNING.md file itself became misleading.
+ - **Root cause**: Agent confused "competitive intelligence" (what others do) with "user's strategy" (what we should do). The LEARNING.md Competitive Intelligence section is for awareness, not instruction. User wants to win on solver merit, not loopholes.
+ - **Rule**: LEARNING.md must reflect the USER'S strategy, not the competition's meta. If user says "no blending", that is the rule. Competitive intelligence goes in a separate "What others do" section, never in "Rules" or "Mistakes". Update LEARNING.md to separate "our approach" from "market intelligence".
 
  ### 2026-04-25: Agent's composition detectors (rotate+color, flip+color, transpose+color) are untested
  - **What**: Wrote s_composition_rotate_color, s_composition_flip_color, s_composition_transpose_color with complex ONNX graph chaining code (~150 lines)
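
The A/B rule quoted in the context lines of this hunk ("with vs without feature on same tasks, measure arc-gen survival rate", keep only features with ">10% improvement") could be sketched roughly as follows. This is a minimal illustration, not code from the repository: `survival_rate`, `keep_feature`, and `min_gain` are hypothetical names, each solver is assumed to be a callable that returns `True` when a task's solution survives the arc-gen examples, and the 10% threshold is read here as an absolute gain in survival rate, which the source does not specify.

```python
from typing import Callable, Iterable


def survival_rate(solve: Callable[[str], bool], task_ids: Iterable[str]) -> float:
    """Fraction of tasks for which `solve` reports a surviving solution."""
    ids = list(task_ids)
    if not ids:
        raise ValueError("need at least one task to measure a survival rate")
    return sum(1 for task_id in ids if solve(task_id)) / len(ids)


def keep_feature(
    baseline: Callable[[str], bool],
    candidate: Callable[[str], bool],
    task_ids: Iterable[str],
    min_gain: float = 0.10,  # assumption: absolute gain, e.g. 0.50 -> above 0.60
) -> bool:
    """Compare both solvers on the SAME tasks; keep the feature only on a clear gain."""
    ids = list(task_ids)  # materialize once so both arms see identical tasks
    gain = survival_rate(candidate, ids) - survival_rate(baseline, ids)
    return gain > min_gain
```

A caller would pass the solver without the feature as `baseline`, the solver with the feature as `candidate`, and a held-out list of task ids. If a relative improvement is what the rule intends, the comparison in `keep_feature` would need to divide the gain by the baseline rate instead.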