# ARC-AGI TRM Solver — Roadmap
> Focus: TRM (Tiny Recursive Model) + LLM agent routing
> Updated: 2026-05-03
## Current Status
- neurogolf-solver: 52/400 tasks, LB 594.84 (separate repo)
- TRM solver: research complete, implementation starting
- LLM classifier: code written (classify_tasks.py)
## Files
| File | Purpose |
|------|---------|
| TRM_RESEARCH.md | Paper findings, architecture, NeuroGolf constraints |
| classify_tasks.py | DeepSeek API classifier, runs on Kaggle |
| composition.py | Composition solvers (transform+recolor, etc) |
| SKILLS/kilo-agent/ | Kilo CLI reference docs (skill files, not project code) |
## Phase 1: LLM Routing (current)
- [x] Write classify_tasks.py (DeepSeek API classifier)
- [x] Write composition.py (C1/C2/C3 composition solvers)
- [ ] Test classifier on Kaggle — does DeepSeek pick correct solvers?
- [ ] Integrate routing into neurogolf-solver solve_task()
- [ ] Measure: how many new tasks does routing unlock?
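
The routing step could look roughly like the sketch below. This is a minimal illustration, not the code in `classify_tasks.py` or `solve_task()`: the function name `route_task`, the label strings, and the idea that a solver returns `None` on failure are all assumptions made for the example.

```python
from typing import Any, Callable, Dict, Optional

# Hypothetical solver signature: takes a task dict, returns a result
# (e.g. a built model) or None when the solver does not apply.
Solver = Callable[[dict], Optional[Any]]


def route_task(task: dict, label: str, solvers: Dict[str, Solver],
               fallback: Solver) -> Optional[Any]:
    """Try the solver named by the classifier label, then the fallback."""
    solver = solvers.get(label)
    if solver is not None:
        result = solver(task)
        if result is not None:
            return result
    # Unknown label, or the chosen solver failed: use the existing pipeline.
    return fallback(task)
```

Keeping the fallback means a wrong classifier label can only cost time, never a task that the existing solvers already handle, which makes the "how many new tasks does routing unlock?" measurement a clean comparison.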
## Phase 2: Tiny TRM
- [ ] Adapt official TRM code (wtfmahe/Samsung-TRM) for hidden=64
- [ ] Change encoding: flat tokens [1,916] -> one-hot [1,10,30,30]
- [ ] Unroll recursion (replace ACT with fixed step count)
- [ ] Remove banned ops (Loop, Scan)
- [ ] Train on ARC-AGI + augmentations (single A10G)
- [ ] Validate against arc-gen
- [ ] Export to ONNX within 1.44MB limit
- [ ] Evaluate: how many of the 348 unsolved tasks does it crack?
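
For the encoding change in this phase, a grid-to-tensor conversion might be sketched as follows. The helper names `grid_to_onehot` and `onehot_to_grid` are hypothetical, and leaving cells outside the grid as all-zero across the 10 channels is an assumption about padding, not something this roadmap specifies.

```python
import numpy as np

NUM_COLORS = 10  # ARC color indices 0..9
MAX_SIZE = 30    # maximum grid height and width


def grid_to_onehot(grid):
    """Encode a 2-D list of color indices as float32 [1, 10, 30, 30]."""
    arr = np.asarray(grid, dtype=np.int64)
    if arr.ndim != 2:
        raise ValueError("grid must be a non-empty 2-D list")
    h, w = arr.shape
    if h > MAX_SIZE or w > MAX_SIZE:
        raise ValueError("grid larger than 30x30")
    if arr.min() < 0 or arr.max() >= NUM_COLORS:
        raise ValueError("color index out of range 0..9")
    out = np.zeros((1, NUM_COLORS, MAX_SIZE, MAX_SIZE), dtype=np.float32)
    rows, cols = np.indices((h, w))
    # Set channel arr[i, j] to 1 at position (i, j); padding stays zero (assumption).
    out[0, arr, rows, cols] = 1.0
    return out


def onehot_to_grid(tensor, h, w):
    """Decode the top-left h x w region back to color indices."""
    return np.argmax(tensor[0, :, :h, :w], axis=0)
```

A round trip such as `onehot_to_grid(grid_to_onehot(g), h, w)` should reproduce `g`, which gives a cheap sanity check before training.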
## Phase 3: Integration
- [ ] Combine LLM routing + tiny TRM + analytical solvers
- [ ] Full 400-task arc-gen validation
- [ ] Kaggle submission
## Key Constraints (NeuroGolf)
- Input/Output: float32 [1,10,30,30] one-hot
- Max 1.44 MB per ONNX file
- Banned ops: Loop, Scan, NonZero, Unique, Script, Function
- All 400 tasks count, none excluded
- Scoring: max(1.0, 25.0 - ln(MACs + memory + params))
- LLM agent cost does NOT count (offline during model generation)
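
The scoring rule above can be written as a small helper to compare candidate models. This is a sketch that takes the formula as stated here; the function name `neurogolf_score` is hypothetical, and the units of `memory` and `params` are whatever the competition uses, which this roadmap does not spell out.

```python
import math


def neurogolf_score(macs: float, memory: float, params: float) -> float:
    """Per-task score: max(1.0, 25.0 - ln(MACs + memory + params))."""
    cost = macs + memory + params
    if cost <= 0:
        raise ValueError("total cost must be positive")
    return max(1.0, 25.0 - math.log(cost))
```

Because the cost enters through a natural log, cutting total cost by 10x adds about 2.3 points per task, and the floor of 1.0 is reached once the cost exceeds e^24 (about 2.6e10).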