Add TRM research findings and architecture notes
trm_solver/TRM_RESEARCH.md (ADDED) +103 -0
@@ -0,0 +1,103 @@
# TRM Research Notes (from arxiv:2510.04871 + official code wtfmahe/Samsung-TRM)

## What TRM Is

Tiny Recursive Model (TRM) = a 2-layer transformer that recurses on itself.
A single network does both the reasoning (z_L) and answer (z_H) updates.

Full TRM specs:
- hidden=512, 8 heads, SwiGLU expansion=4
- 2 layers only, recursed T=3 outer cycles x n=4 inner steps = ~15 forward passes
- 7M params total
- ACT (Adaptive Compute Time) for dynamic halting
- EMA (Exponential Moving Average) of the weights, decay 0.999
- RoPE position encoding
- Puzzle embedding: 16 learned tokens per task

Training:
- 100K epochs, lr=1e-4, batch=768
- 1000 augmentations per task (color permute + dihedral + translate)
- Data: ARC-AGI training + evaluation + ConceptARC
- 3 days on 4xH100
- Loss: stablemax cross-entropy

Results:
- 45% on ARC-AGI-1 (beats DeepSeek R1 at 15.8% and o3-mini at 34.5%)
- 8% on ARC-AGI-2
- 7M params vs 671B for DeepSeek R1

## Encoding

ARC grids are encoded as flat token sequences (see the sketch below):
- vocab_size = 12: 0=PAD, 1=EOS, 2-11=colors 0-9
- Grid flattened to 900 tokens (30x30)
- EOS marks the grid boundary
- Translational augmentation: random padding offsets
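
A minimal sketch of this encoding, reconstructed from the bullets above (the exact EOS placement and the helper name `encode_grid` are my assumptions, not lifted from the official `dataset/build_arc_dataset.py`):

```python
import numpy as np

PAD, EOS = 0, 1
COLOR_OFFSET = 2   # colors 0-9 map to tokens 2-11
GRID_SIDE = 30     # fixed 30x30 canvas -> 900 tokens

def encode_grid(grid, row_off=0, col_off=0):
    """Flatten an ARC grid (list of rows, values 0-9) into a 900-token sequence.

    Non-zero row_off/col_off implement the random-offset translational
    augmentation; EOS is placed just past the grid's bottom-right corner
    (assumption) so the true grid size is recoverable from the sequence.
    """
    h, w = len(grid), len(grid[0])
    assert row_off + h <= GRID_SIDE and col_off + w <= GRID_SIDE
    canvas = np.full((GRID_SIDE, GRID_SIDE), PAD, dtype=np.int64)
    for r in range(h):
        for c in range(w):
            canvas[row_off + r, col_off + c] = grid[r][c] + COLOR_OFFSET
    if row_off + h < GRID_SIDE and col_off + w < GRID_SIDE:
        canvas[row_off + h, col_off + w] = EOS  # grid boundary marker
    return canvas.reshape(-1)  # shape (900,)
```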

## Algorithm (simplified)

```python
import torch

def trm_forward(x, z_L, z_H, net, n=4, T=3):
    x = embed(x) + puzzle_emb  # token embedding + per-task puzzle embedding
    # T-1 outer cycles without grad, to cheaply improve the latent initialization
    with torch.no_grad():
        for _ in range(T - 1):
            for _ in range(n):
                z_L = net(z_L, z_H + x)  # update reasoning state
            z_H = net(z_H, z_L)          # update answer state
    # final outer cycle with grad: only this cycle contributes gradients,
    # so activation memory stays constant in T
    for _ in range(n):
        z_L = net(z_L, z_H + x)
    z_H = net(z_H, z_L)
    output = lm_head(z_H)
    return output
```

## NeuroGolf Constraints

| Constraint | Value |
|------------|-------|
| Input/Output | float32 [1,10,30,30] one-hot |
| Max file size | 1.44 MB per ONNX |
| Banned ops | Loop, Scan, NonZero, Unique, Script, Function |
| Scoring | max(1.0, 25.0 - ln(MACs + memory + params)) |
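
A tiny helper to sanity-check that scoring formula against candidate configs (my reading of the table; whatever units the competition uses inside the MACs + memory + params sum are an assumption here):

```python
import math

def neurogolf_score(macs: float, memory: float, params: float) -> float:
    # Per-task score as given in the constraints table:
    # max(1.0, 25.0 - ln(MACs + memory + params))
    return max(1.0, 25.0 - math.log(macs + memory + params))

# A total cost of ~2.7e5 scores about 12.5; each e-fold (~2.72x) increase
# in the cost sum costs exactly one point, with a floor of 1.0.
```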

## Full TRM Cannot Fit

7M params at float32 (4 bytes each) = ~28MB of ONNX. The limit is 1.44MB, so full TRM is ~20x over.

Loops are BANNED in NeuroGolf, so ACT and the recursion must be unrolled.
Unrolling 15 passes of a 2-layer transformer = 30 effective layers (see the export sketch below).
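
A minimal sketch of the unrolled export, assuming the standard torch.onnx tracing path: run the recursion as plain Python loops inside forward, so the traced graph is fixed and loop-free (the `UnrolledTRM` wrapper and its defaults are illustrative, not from the official code):

```python
import torch
import torch.nn as nn

class UnrolledTRM(nn.Module):
    """Runs the TRM recursion as plain Python loops so that ONNX tracing
    unrolls it into a fixed graph with no Loop/Scan nodes."""

    def __init__(self, net: nn.Module, lm_head: nn.Module, n: int = 4, T: int = 3):
        super().__init__()
        self.net, self.lm_head = net, lm_head
        self.n, self.T = n, T

    def forward(self, x, z_L, z_H):
        for _ in range(self.T):              # unrolled at trace time
            for _ in range(self.n):
                z_L = self.net(z_L, z_H + x)
            z_H = self.net(z_H, z_L)
        return self.lm_head(z_H)

# torch.onnx.export(model, (x, z_L, z_H), "trm_unrolled.onnx", opset_version=17)
# traces the loops away: more recursions mean more graph nodes (MACs), and
# whether per-pass weights are tied or untied decides how file size grows.
```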

## Tiny TRM Configs That Fit

| Config | Params | ONNX Size | Recursions | Est Score/Task |
|--------|--------|-----------|------------|----------------|
| hidden=64, 4 heads, 2 layers | ~42K | ~170KB | 4 | ~12.5 |
| hidden=64, 4 heads, 2 layers | ~85K | ~340KB | 8 | ~11.6 |
| hidden=128, 4 heads, 2 layers | ~170K | ~680KB | 4 | ~11.0 |
| hidden=128, 4 heads, 2 layers | ~340K | ~1.4MB | 8 | barely fits |

## LLM Agent Integration

The LLM (DeepSeek) is used OFFLINE, during model generation only; it does NOT
go into the ONNX file. It classifies each task and routes it to the correct
solver, so it has zero cost impact on the submitted ONNX models.

Architecture (sketch below):
ARC task -> DeepSeek API (classify) -> route to solver -> build ONNX -> validate -> submit
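
A hedged sketch of the classification step, assuming DeepSeek's OpenAI-compatible endpoint; the solver label set and the fallback policy are hypothetical, not from this repo:

```python
from openai import OpenAI

# DeepSeek exposes an OpenAI-compatible API (base_url/model per their docs)
client = OpenAI(api_key="sk-...", base_url="https://api.deepseek.com")

SOLVERS = ["identity", "symmetry", "color_map", "tiling", "trm"]  # hypothetical labels

def classify_task(task_json: str) -> str:
    """Ask DeepSeek which solver family an ARC task belongs to."""
    resp = client.chat.completions.create(
        model="deepseek-chat",
        messages=[
            {"role": "system",
             "content": "Classify the ARC task. Reply with exactly one of: "
                        + ", ".join(SOLVERS) + "."},
            {"role": "user", "content": task_json},
        ],
        temperature=0.0,
    )
    label = resp.choices[0].message.content.strip()
    return label if label in SOLVERS else "trm"  # fall back to the learned solver
```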

## Official Code

- Repo: wtfmahe/Samsung-TRM on HuggingFace
- GitHub: Kilo-Org/kilocode (for the Kilo CLI)
- Key files: models/recursive_reasoning/trm.py, dataset/build_arc_dataset.py, config/arch/trm.yaml
- Dataset builder handles: ARC-AGI + ConceptARC, 1000 augmentations, color/dihedral/translation

## Next Steps

1. Build tiny TRM (hidden=64): implement from the official code, adapt the encoding
2. Train on ARC data (single A10G, a few hours)
3. Evaluate on the unsolved tasks (the 348 that the analytical solvers can't handle)
4. Export to ONNX within the NeuroGolf constraints
5. Integrate with the LLM classifier for routing