Chess model submitted to the LLM Course Chess Challenge.

Submission Info

  • Submitted by: janisaiad
  • Parameters: 485,040
  • Organization: LLM-course

Model Details

  • Architecture: Tiny Recursive Model (TRM) - looping recurrent transformer (cycle-shared weights)
  • Vocab size: 148
  • Embedding dim: 120
  • Layers: 3
  • Heads: 4
  • Cycles: 8

TRM note: this is a looping TRM model — at both training and inference time, the same transformer stack is run for 8 recurrent refinement cycles (weights are shared across cycles), which increases effective compute and reasoning depth without increasing the parameter count.
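To make the weight-sharing idea concrete, here is a minimal sketch (not the submitted code; the class and function names are illustrative) of the looping pattern: one block object is reused for every cycle, so adding cycles adds depth but no parameters.

```python
class ToyBlock:
    """Stand-in for the 3-layer transformer stack; here just an affine map."""

    def __init__(self, scale, shift):
        # These are the block's only "weights"; the SAME object (and hence
        # the same weights) is applied on every cycle.
        self.scale = scale
        self.shift = shift

    def __call__(self, x):
        return [self.scale * v + self.shift for v in x]


def trm_forward(x, block, cycles=8):
    # Run the same weight-shared block for `cycles` refinement passes,
    # feeding each cycle's output back in as the next cycle's input.
    for _ in range(cycles):
        x = block(x)
    return x


block = ToyBlock(scale=0.5, shift=1.0)
out = trm_forward([4.0], block, cycles=8)  # 8 cycles, one set of weights
```

In a real TRM, `ToyBlock` would be a small transformer (here: 3 layers, 4 heads, embedding dim 120), and each extra cycle refines the hidden state at zero parameter cost.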

Training Information

Training Metrics:

  • Best Eval Loss: 0.64109
  • Final Train Loss: 0.65280
  • Total Epochs: 0.58
  • Total Steps: 83,013

Training Loss Curves:

Note: install matplotlib to generate the loss-curve plots.

Training Loss History (Summary):

Step    Epoch  Train Loss  Eval Loss  Learning Rate
100     0.00   1.2521      -          1.19e-06
2,100   0.08   1.0120      -          2.53e-05
4,100   0.15   0.8552      -          4.94e-05
6,100   0.22   0.7592      -          4.88e-05
8,100   0.29   0.7181      -          4.75e-05
10,100  0.37   0.6869      -          4.62e-05
12,100  0.44   0.6819      -          4.50e-05
14,100  0.51   0.6705      -          4.37e-05
16,000  0.58   -           0.64109    -
Checkpoint

  • Format: Safetensors
  • Tensor type: F32
  • Model size: 485k params