Chess model submitted to the LLM Course Chess Challenge.
Submission Info
- Submitted by: janisaiad
- Parameters: 485,040
- Organization: LLM-course
Model Details
- Architecture: Tiny Recursive Model (TRM) - looping recurrent transformer (cycle-shared weights)
- Vocab size: 148
- Embedding dim: 120
- Layers: 3
- Heads: 4
- Cycles: 8
TRM note: this is a looping TRM model: at both training and inference time, the same transformer stack is run for 8 recurrent refinement cycles, with weights shared across cycles. This increases compute and effective reasoning depth without increasing the parameter count.
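The cycle-shared-weights idea above can be sketched in pure Python. This is a toy stand-in, not the submitted model's code: the tiny dimension, the tanh "layer" (replacing the real attention+MLP blocks), and all names are illustrative. It shows the key property that running more cycles adds compute depth but zero parameters.

```python
import math
import random

random.seed(0)

DIM = 4  # toy hidden size (the real model uses embedding dim 120)

def make_layer(dim):
    # one "layer": a dim x dim weight matrix plus a bias vector
    return {
        "w": [[random.gauss(0, 0.1) for _ in range(dim)] for _ in range(dim)],
        "b": [0.0] * dim,
    }

def apply_layer(layer, x):
    # y = tanh(W x + b), a stand-in for a real transformer block
    return [
        math.tanh(sum(layer["w"][i][j] * x[j] for j in range(len(x))) + layer["b"][i])
        for i in range(len(layer["w"]))
    ]

def trm_forward(layers, x, cycles):
    # run the SAME layer stack `cycles` times: weights are shared
    # across cycles, so depth grows while parameters do not
    for _ in range(cycles):
        for layer in layers:
            x = apply_layer(layer, x)
    return x

def num_params(layers):
    return sum(len(l["b"]) + sum(len(row) for row in l["w"]) for l in layers)

layers = [make_layer(DIM) for _ in range(3)]  # 3 layers, as in this card
x = [1.0, 0.0, -1.0, 0.5]
out = trm_forward(layers, x, cycles=8)  # 8 cycles, as in this card
print(num_params(layers))  # 3 * (4*4 + 4) = 60, regardless of cycle count
```

The same `num_params` result is returned whether the model is run for 1 cycle or 8, which is the point of the looping-TRM design.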
Training Information
Training Metrics:
- Best Eval Loss: 0.64109
- Final Train Loss: 0.65280
- Total Epochs: 0.58
- Total Steps: 83,013
Training Loss Curves:
Note: loss curve plots were not generated for this run (matplotlib was not installed); the table below summarizes the loss history instead.
Training Loss History (Summary):
| Step | Epoch | Train Loss | Eval Loss | Learning Rate |
|---|---|---|---|---|
| 100 | 0.00 | 1.2521 | - | 1.19e-06 |
| 2,100 | 0.08 | 1.0120 | - | 2.53e-05 |
| 4,100 | 0.15 | 0.8552 | - | 4.94e-05 |
| 6,100 | 0.22 | 0.7592 | - | 4.88e-05 |
| 8,100 | 0.29 | 0.7181 | - | 4.75e-05 |
| 10,100 | 0.37 | 0.6869 | - | 4.62e-05 |
| 12,100 | 0.44 | 0.6819 | - | 4.50e-05 |
| 14,100 | 0.51 | 0.6705 | - | 4.37e-05 |
| 16,000 | 0.58 | - | 0.64109 | - |