Yes, absolutely viable — and it's probably the right move for the next batch of tasks. Here's the key tradeoff:

## Why it works

PyTorch → ONNX export is straightforward for the ops we need:

```python
torch.onnx.export(model, dummy_input, "task.onnx", opset_version=10)
```

This lets us use PyTorch's optimizers (Adam, SGD) to train small networks per task via gradient descent instead of lstsq. That's strictly more powerful:

|  | lstsq (current) | PyTorch + gradient descent |
| --- | --- | --- |
| Architecture | Single conv layer only | Multi-layer, nonlinear (ReLU, etc.) |
| Fitting | Closed-form, one-shot | Iterative, can fit complex patterns |
| Failure mode | Underfits if the pattern isn't linear | Can overfit (which is GOOD here: we WANT to memorize the training examples) |
| Speed | Fast for small kernels | Slower, but GPU helps |

The 94 unsolved tasks fail because a single conv layer can't express the transformation. A 2-3 layer network with ReLU activations could.

## Why it's not trivial

Opset 10 restrictions. Not every PyTorch op exports cleanly to opset 10. You need to verify the exported model actually runs. The main gotchas:

- torch.nn.Conv2d → Conv ✅ (opset 1)
- torch.relu → Relu ✅ (opset 1)
- torch.argmax → ArgMax ✅ (opset 1)
- torch.gather → Gather ✅ (opset 1)
- BUT: F.one_hot → OneHot ❌ (opset 9, but it has CUDA issues; we already work around this with Equal+Cast)
- Dynamic shapes / control flow → ❌ forbidden

So the pattern would be: build a small static PyTorch network, train it to perfectly fit the task's train+test examples, then export.
The network must have fixed tensor shapes throughout (no data-dependent branching).
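
As a concrete sketch of that pattern, the export-and-verify step could look like this (a minimal sketch, assuming onnxruntime is installed; it uses the TinyARC net defined in the next section):

```python
import numpy as np
import onnxruntime as ort
import torch
import torch.nn.functional as F

model = TinyARC().eval()  # the trained per-task net (defined below)

# Random one-hot dummy input so the valid-pixel mask is nonzero.
# F.one_hot is fine here: it builds test data on the PyTorch side
# and never enters the exported graph.
idx = torch.randint(0, 10, (1, 30, 30))
dummy = F.one_hot(idx, 10).permute(0, 3, 1, 2).float()  # [1,10,30,30]

torch.onnx.export(model, dummy, "task.onnx", opset_version=10)

# Verify the opset-10 graph actually runs and matches PyTorch.
sess = ort.InferenceSession("task.onnx")
feed = {sess.get_inputs()[0].name: dummy.numpy()}
onnx_out = sess.run(None, feed)[0]
torch_out = model(dummy).detach().numpy()
assert np.allclose(onnx_out, torch_out, atol=1e-5)
```
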

## What architecture to try

For the 94 unsolved tasks, a 2-layer conv net would be the first thing to try:

```python
import torch
import torch.nn as nn

class TinyARC(nn.Module):
    def __init__(self, hidden=32, ks=5):
        super().__init__()
        self.conv1 = nn.Conv2d(10, hidden, ks, padding=ks // 2)
        self.conv2 = nn.Conv2d(hidden, 10, ks, padding=ks // 2)

    def forward(self, x):  # x: [1,10,30,30] one-hot color planes
        mask = x.sum(dim=1, keepdim=True)  # [1,1,30,30] valid-pixel mask
        h = torch.relu(self.conv1(x))
        logits = self.conv2(h)  # [1,10,30,30]
        # argmax + equal + cast: a one-hot output without OneHot, so it
        # exports to opset 10. argmax has no gradient; train on `logits`.
        out = (logits.argmax(dim=1, keepdim=True)
               == torch.arange(10).reshape(1, 10, 1, 1).to(x.device)).float()
        return out * mask
```

Train with cross-entropy on the one-hot targets, export to ONNX. The argmax+equal+cast trick we already use would work in the export.
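
A minimal per-task fitting sketch (the helper name fit_task is an assumption, not existing code; inputs are the [1,10,30,30] one-hot grids, targets are [1,30,30] int64 color labels):

```python
import torch
import torch.nn as nn

def fit_task(model, inputs, targets, steps=2000, lr=1e-2):
    """Overfit one task: inputs/targets are lists of per-example tensors."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.tensor(0.0)
        for x, y in zip(inputs, targets):
            # Take the loss on raw logits: forward()'s argmax head has no
            # gradient, so it is export/inference-only.
            logits = model.conv2(torch.relu(model.conv1(x)))  # [1,10,30,30]
            loss = loss + loss_fn(logits, y)  # y: [1,30,30] class indices
        loss.backward()
        opt.step()
    return model
```

Since the goal is memorization, run until the loss is effectively zero and every example is reproduced exactly, not until some validation metric plateaus.
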

## The real blocker isn't PyTorch → ONNX

It's the variable output shapes. 60+ of the 94 unsolved tasks have outputs that are a different size than the input, and that size varies across examples.
No static ONNX graph can produce a 5×3 output for one input and a 7×9 output for another — the graph is fixed.

For those, the only hope is that the conv learns to put valid content in the right region and zeros elsewhere (which our mask = ReduceSum(input) trick handles when the output region fits inside the input bounds). But when the output is LARGER than the input, we're stuck without a way to derive the output mask.

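Which tasks the solver should even attempt can be checked up front. A rough triage sketch (the function name shape_class is an assumption; tasks are assumed to follow the standard ARC JSON layout of "train" pairs with "input"/"output" grids):

```python
def shape_class(task) -> str:
    """Classify a task by how its output shape relates to its input shape."""
    diffs = set()
    for pair in task["train"]:
        ih, iw = len(pair["input"]), len(pair["input"][0])
        oh, ow = len(pair["output"]), len(pair["output"][0])
        diffs.add((oh - ih, ow - iw))
    if diffs == {(0, 0)}:
        return "same-shape"        # directly expressible as a static graph
    if len(diffs) == 1:
        return "fixed-diff-shape"  # constant offset: pad/crop around the net
    return "variable-diff-shape"   # no single static graph fits these
```
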
Bottom line: Yes, build in PyTorch, export to ONNX. Focus the PyTorch solver on the ~10 same-shape tasks and ~7 fixed-diff-shape tasks where lstsq conv failed but a deeper network might succeed. The 77 variable-diff-shape tasks are a harder structural problem regardless of framework.