Chess model submitted to the LLM Course Chess Challenge.

Submission Info

  • Submitted by: janisaiad
  • Parameters: 485,040
  • Organization: LLM-course

Model Details

  • Architecture: Tiny Recursive Model (TRM) - looping recurrent transformer (cycle-shared weights)
  • Vocab size: 148
  • Embedding dim: 120
  • Layers: 3
  • Heads: 4
  • Cycles: 8

TRM note: this is a looping TRM model — at both training and inference time, the same transformer stack is run for 8 recurrent refinement cycles (weights are shared across cycles), which increases effective compute and reasoning depth without increasing the parameter count.
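To make the weight-sharing idea concrete, here is a minimal sketch (not the submitted code; the class and function names are illustrative) of the looping pattern: one block object is reused for every cycle, so adding cycles adds depth but no parameters.

```python
class ToyBlock:
    """Stand-in for the 3-layer transformer stack; here just an affine map."""

    def __init__(self, scale, shift):
        # These are the block's only "weights"; the SAME object (and hence
        # the same weights) is applied on every cycle.
        self.scale = scale
        self.shift = shift

    def __call__(self, x):
        return [self.scale * v + self.shift for v in x]


def trm_forward(x, block, cycles=8):
    # Run the same weight-shared block for `cycles` refinement passes,
    # feeding each cycle's output back in as the next cycle's input.
    for _ in range(cycles):
        x = block(x)
    return x


block = ToyBlock(scale=0.5, shift=1.0)
out = trm_forward([4.0], block, cycles=8)  # 8 cycles, one set of weights
```

In a real TRM, `ToyBlock` would be a small transformer (here: 3 layers, 4 heads, embedding dim 120), and each extra cycle refines the hidden state at zero parameter cost.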

Training Information

Training Metrics:

  • Best Eval Loss: 0.64109
  • Final Train Loss: 0.65280
  • Total Epochs: 0.58
  • Total Steps: 83,013

Training Loss Curves:

Note: install matplotlib to generate the loss-curve plots.

Training Loss History (Summary):

Step    Epoch  Train Loss  Eval Loss  Learning Rate
100     0.00   1.2521      -          1.19e-06
2,100   0.08   1.0120      -          2.53e-05
4,100   0.15   0.8552      -          4.94e-05
6,100   0.22   0.7592      -          4.88e-05
8,100   0.29   0.7181      -          4.75e-05
10,100  0.37   0.6869      -          4.62e-05
12,100  0.44   0.6819      -          4.50e-05
14,100  0.51   0.6705      -          4.37e-05
16,000  0.58   -           0.64109    -
Checkpoint

  • Format: Safetensors
  • Tensor type: F32
  • Model size: 485k params