license: mit
language:
  - en
pipeline_tag: text-generation
tags:
  - chess
  - puzzles
  - chess-games
  - stockfish
  - fen
  - best-move
  - uci
  - san
  - text-generation-inference
datasets:
  - ethanjtang/GAMBIT-stockfish18-selfplay
  - ethanjtang/GAMBIT-lichess-puzzle-positions

GAMBIT: Generalization or Memorization? Brittleness Testing for Chess-Trained Language Models

Variants

KinGPT-Woodpecker

KinGPT variant trained on 13,341,057 unique puzzle positions (FEN + best move pairs).

Achieved a train loss of 0.3590 and a validation loss of 0.3704 on the puzzle corpus after training for ~500B tokens.
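
For readers unfamiliar with the notation, the following is a minimal sketch of what a "FEN + best move" pair might look like. The record keys (`fen`, `best_move`), the helper names, and the use of UCI notation for the move are assumptions made for illustration; they are not the dataset's documented schema, and the move shown is a placeholder rather than engine output.

```python
# Standard starting position in Forsyth-Edwards Notation (FEN).
START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"


def parse_fen(fen: str) -> dict:
    """Split a FEN string into its six space-separated fields."""
    placement, side, castling, en_passant, halfmove, fullmove = fen.split()
    return {
        "placement": placement,          # piece placement, rank 8 down to rank 1
        "side_to_move": side,            # "w" or "b"
        "castling": castling,            # e.g. "KQkq", or "-" if none
        "en_passant": en_passant,        # target square, or "-" if none
        "halfmove_clock": int(halfmove),
        "fullmove_number": int(fullmove),
    }


def count_pieces(placement: str) -> int:
    """Count pieces in the placement field (letters are pieces, digits are empty squares)."""
    return sum(ch.isalpha() for ch in placement)


# Hypothetical training record: keys and move notation are assumptions.
example = {"fen": START_FEN, "best_move": "e2e4"}  # placeholder move in UCI form

fields = parse_fen(example["fen"])
print(fields["side_to_move"], count_pieces(fields["placement"]))
```

The actual field names and move notation should be checked against the dataset cards listed in the metadata before relying on this layout.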

KinGPT-Beaver

KinGPT variant trained on 54,681 unique positions generated from 1,050 Stockfish 18 self-play games.

Achieved a train loss of 0.0974 and a validation loss of 1.7554 on the self-play corpus after training for ~25B tokens. The large gap between the two reflects overfitting due to the small dataset size.

KinGPT-Chimera

KinGPT variant trained on the combined dataset of 13,395,738 positions from the Woodpecker (puzzle) and Beaver (self-play) corpora.

Achieved a train loss of 0.3594 and a validation loss of 0.3710 on the combined corpus after training for ~500B tokens.
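
The figures reported for the three variants can be cross-checked with a little arithmetic. The sketch below is only an illustration: it uses the numbers stated above, and the names `REPORTED` and `generalization_gap` are hypothetical. It computes the validation-minus-train gap for each variant and confirms that the Chimera position count equals the sum of the Woodpecker and Beaver counts.

```python
# Losses as reported in this model card.
REPORTED = {
    "Woodpecker": {"train": 0.3590, "val": 0.3704},
    "Beaver": {"train": 0.0974, "val": 1.7554},
    "Chimera": {"train": 0.3594, "val": 0.3710},
}


def generalization_gap(losses: dict) -> float:
    """Validation loss minus train loss; a large positive gap suggests overfitting."""
    return losses["val"] - losses["train"]


gaps = {name: round(generalization_gap(l), 4) for name, l in REPORTED.items()}
for name, gap in gaps.items():
    print(f"{name:<11} gap = {gap:.4f}")

# Position counts as reported in this model card.
woodpecker_positions = 13_341_057
beaver_positions = 54_681
chimera_positions = 13_395_738

# The combined corpus size matches the sum of the two source corpora.
print(woodpecker_positions + beaver_positions == chimera_positions)
```

Under these numbers the Beaver gap is roughly two orders of magnitude larger than the other two, which is consistent with the overfitting noted for that variant.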

Citation

@misc{tang2026generalizationmemorizationbrittlenesstesting,
      title={Generalization or Memorization? Brittleness Testing for Chess-Trained Language Models}, 
      author={Ethan Tang},
      year={2026},
      eprint={2605.17565},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2605.17565}, 
}