metadata
license: apache-2.0
language:
- en
library_name: pytorch
tags:
- physics
- next-frame-prediction
- gpt
- mup
- rigid-body-dynamics
- icml-2026
gpt-physics
A small GPT trained from scratch to predict 2D rigid body physics trajectories. Part of an ICML-2026 study on whether language models can learn physical dynamics from text-encoded simulation data.
Model details
- Architecture: 6-layer GPT, learned positional embeddings, tied LM head
- Tokenizer: digit-level PhysicsTokenizer (custom)
- Scaling: muP for hyperparameter transfer
- Training: curriculum learning over 5 difficulty stages
- Task: autoregressive next-frame prediction over 200-frame rigid-body scenes
- Domain: 2D rigid body dynamics simulated with Pymunk / Chipmunk2D
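To illustrate what "digit-level" tokenization means for text-encoded frames, here is a minimal sketch. It is not the project's PhysicsTokenizer: the character set, the frame string, and the function names `encode` and `decode` are assumptions made for illustration only.

```python
# Hypothetical digit-level tokenizer sketch. This is NOT the project's
# PhysicsTokenizer; the vocabulary below is an assumed character set.
VOCAB = list("0123456789.- \n")
STOI = {ch: i for i, ch in enumerate(VOCAB)}   # character -> token id
ITOS = {i: ch for ch, i in STOI.items()}       # token id -> character


def encode(text):
    """Map each character (digit, sign, separator) to its own token id."""
    return [STOI[ch] for ch in text]


def decode(ids):
    """Invert encode() by mapping token ids back to characters."""
    return "".join(ITOS[i] for i in ids)


# An invented frame fragment: two coordinates separated by a space.
frame = "12.50 -3.07\n"
ids = encode(frame)
print(len(ids), "tokens for", repr(frame))
```

With one token per character, every numeric value costs as many tokens as it has digits, which keeps the vocabulary tiny and lets the model emit numbers digit by digit.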
Files
- best_model.pt: best validation checkpoint (~69 MB)
- checkpoint_latest.pt: latest training step (~158 MB)
- checkpoint_epoch0_step500.pt: early checkpoint (~158 MB)
The state dicts contain raw transformer.* and lm_head.* keys for a stock 6-layer GPT. Load them with the model class defined in the project's src/scratch/gpt.py.
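Before wiring a checkpoint into the model class, it can help to confirm that its keys look as described. The sketch below assumes the file deserializes directly to a flat state dict; the helper names `load_checkpoint` and `summarize_prefixes` are hypothetical and not part of the project, and the constructor of the class in src/scratch/gpt.py is not shown because its signature is not documented here.

```python
from collections import Counter


def load_checkpoint(path):
    """Load a checkpoint onto the CPU (assumes a flat state dict on disk)."""
    import torch  # imported lazily so the key helper works without PyTorch

    # weights_only=True restricts unpickling to tensors and plain containers;
    # if a checkpoint stores other objects, this call raises and the file
    # needs to be inspected more carefully.
    return torch.load(path, map_location="cpu", weights_only=True)


def summarize_prefixes(state_dict):
    """Count entries per top-level key prefix, e.g. 'transformer', 'lm_head'."""
    return Counter(key.split(".", 1)[0] for key in state_dict)


# Usage sketch (requires PyTorch and a downloaded checkpoint):
#   state = load_checkpoint("best_model.pt")
#   print(summarize_prefixes(state))
#   model.load_state_dict(state)  # model built from src/scratch/gpt.py
```

If the summary shows prefixes other than transformer and lm_head, the file likely wraps the weights in a larger dictionary, and the relevant entry has to be extracted before calling load_state_dict.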
Training data
Trained on ~900K scenes across 24 "seen" scenario types (collisions, stacking, ramps, constraints, mini-games, complex). See physics-scenarios-packed and physics-scenarios-raw.
Intended use
Research on whether autoregressive LMs can internalize physical dynamics. Not intended for production physics simulation — use Pymunk for that.
Citation
ICML-2026 submission (in progress).