---
license: mit
language:
- en
pipeline_tag: text-generation
tags:
- chess
- puzzles
- chess-games
- stockfish
- fen
- best-move
- uci
- san
- text-generation-inference
datasets:
- ethanjtang/GAMBIT-stockfish18-selfplay
- ethanjtang/GAMBIT-lichess-puzzle-positions
---
# GAMBIT: <ins>G</ins>ener<ins>a</ins>lization or <ins>M</ins>emorization? <ins>B</ins>r<ins>i</ins>ttleness <ins>T</ins>esting for Chess-Trained Language Models
[Paper (arXiv)](https://arxiv.org/abs/2605.17565) <br>
[Code (GitHub)](https://github.com/ethanjtang/KinGPT) <br>
[Dataset: Lichess puzzle positions](https://huggingface.co/datasets/ethanjtang/GAMBIT-lichess-puzzle-positions) <br>
[Dataset: Stockfish 18 self-play](https://huggingface.co/datasets/ethanjtang/GAMBIT-stockfish18-selfplay) <br>
## Variants
### KinGPT-Woodpecker
KinGPT variant trained on 13,341,057 unique puzzle positions (FEN + best-move pairs).
Achieved `train loss 0.3590, val loss 0.3704` on the puzzle corpus after training for ~500B tokens.
### KinGPT-Beaver
KinGPT variant trained on 54,681 unique positions generated from 1,050 Stockfish 18 self-play games.
Achieved `train loss 0.0974, val loss 1.7554` on the self-play corpus after training for ~25B tokens. The large gap between train and val loss reflects overfitting caused by the small dataset size.
### KinGPT-Chimera
KinGPT variant trained on the combined dataset of 13,395,738 positions from the Woodpecker and Beaver corpora.
Achieved `train loss 0.3594, val loss 0.3710` on the combined corpus after training for ~500B tokens.
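
The figures reported for the three variants above can be compared with a few lines of Python. This is only an illustrative sketch: the numbers are copied from this card, while the dictionary layout and the helper names (`generalization_gap`, `combined_positions`) are assumptions made for the example and are not part of the KinGPT codebase.

```python
# Figures copied from the variant descriptions in this model card.
VARIANTS = {
    "Woodpecker": {"positions": 13_341_057, "train_loss": 0.3590, "val_loss": 0.3704},
    "Beaver": {"positions": 54_681, "train_loss": 0.0974, "val_loss": 1.7554},
    "Chimera": {"positions": 13_395_738, "train_loss": 0.3594, "val_loss": 0.3710},
}


def generalization_gap(stats):
    """Hypothetical helper: val loss minus train loss (larger means more overfitting)."""
    return stats["val_loss"] - stats["train_loss"]


def combined_positions():
    """Hypothetical helper: sum of the Woodpecker and Beaver position counts."""
    return VARIANTS["Woodpecker"]["positions"] + VARIANTS["Beaver"]["positions"]


for name, stats in VARIANTS.items():
    print(f"{name}: gap = {generalization_gap(stats):.4f}")

print(f"Woodpecker + Beaver positions: {combined_positions():,}")
```

The Beaver gap (about 1.66) is roughly two orders of magnitude larger than the Woodpecker and Chimera gaps (about 0.01 each), which matches the overfitting noted for the small self-play corpus. The summed position count equals the Chimera figure, which is consistent with the two corpora sharing no positions.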
## Citation
```bibtex
@misc{tang2026generalizationmemorizationbrittlenesstesting,
title={Generalization or Memorization? Brittleness Testing for Chess-Trained Language Models},
author={Ethan Tang},
year={2026},
eprint={2605.17565},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2605.17565},
}
```