AlexWortega committed
Commit 4d75a2c · verified · 1 Parent(s): ad9080f

Upload gpt-physics

README.md ADDED
@@ -0,0 +1,46 @@
+ ---
+ license: apache-2.0
+ language:
+ - en
+ library_name: pytorch
+ tags:
+ - physics
+ - next-frame-prediction
+ - gpt
+ - mup
+ - rigid-body-dynamics
+ - icml-2026
+ ---
+
+ # gpt-physics
+
+ A small GPT trained from scratch to predict 2D rigid body physics trajectories. Part of an ICML-2026 study on whether language models can learn physical dynamics from text-encoded simulation data.
+
+ ## Model details
+
+ - **Architecture**: 6-layer GPT, learned positional embeddings, tied LM head
+ - **Tokenizer**: digit-level `PhysicsTokenizer` (custom)
+ - **Scaling**: muP for hyperparameter transfer
+ - **Training**: curriculum learning over 5 difficulty stages
+ - **Task**: autoregressive next-frame prediction over 200-frame rigid-body scenes
+ - **Domain**: 2D rigid body dynamics simulated with Pymunk / Chipmunk2D
+
+ ## Files
+
+ - `best_model.pt` — best validation checkpoint (~69 MB)
+ - `checkpoint_latest.pt` — latest training step (~158 MB)
+ - `checkpoint_epoch0_step500.pt` — early checkpoint (~158 MB)
+
+ State dicts contain raw `transformer.*` and `lm_head.*` keys for a stock 6-layer GPT — load with the project's `src/scratch/gpt.py` model class.
+
+ ## Training data
+
+ Trained on ~900K scenes across 24 "seen" scenario types (collisions, stacking, ramps, constraints, mini-games, complex). See [physics-scenarios-packed](https://huggingface.co/datasets/AlexWortega/physics-scenarios-packed) and [physics-scenarios-raw](https://huggingface.co/datasets/AlexWortega/physics-scenarios-raw).
+
+ ## Intended use
+
+ Research on whether autoregressive LMs can internalize physical dynamics. Not intended for production physics simulation — use Pymunk for that.
+
+ ## Citation
+
+ ICML-2026 submission (in progress).
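
Note on loading: the README states that the state dicts hold raw `transformer.*` and `lm_head.*` keys. The sketch below shows one way to inspect a checkpoint before passing it to the model class in `src/scratch/gpt.py`. It rests on assumptions: the helper names are hypothetical, and the README does not say whether `checkpoint_latest.pt` stores the weights at the top level or under a wrapper key, so the wrapper names tried here are guesses rather than confirmed repository conventions.

```python
from collections import Counter

# Key prefixes the README says appear in the state dicts.
MODEL_PREFIXES = ("transformer.", "lm_head.")
# Hypothetical wrapper keys; none of these is confirmed by the repository.
CANDIDATE_WRAPPERS = ("model_state_dict", "state_dict", "model")


def extract_model_state(checkpoint):
    """Return only the entries whose keys start with a model prefix.

    Looks at the top-level dict first, then at any nested dict stored
    under one of the candidate wrapper keys.
    """
    candidates = [checkpoint]
    candidates += [checkpoint[k] for k in CANDIDATE_WRAPPERS
                   if isinstance(checkpoint.get(k), dict)]
    for cand in candidates:
        state = {k: v for k, v in cand.items()
                 if isinstance(k, str) and k.startswith(MODEL_PREFIXES)}
        if state:
            return state
    raise KeyError("no transformer.* or lm_head.* keys found")


def summarize_prefixes(state):
    """Count entries per top-level prefix (e.g. 'transformer', 'lm_head')."""
    return Counter(key.split(".", 1)[0] for key in state)


def load_checkpoint(path):
    """Load a .pt file onto the CPU; torch is imported lazily."""
    import torch
    return torch.load(path, map_location="cpu")
```

With PyTorch installed, `extract_model_state(load_checkpoint("best_model.pt"))` would yield a dict suitable for `model.load_state_dict(...)`, where `model` is an instance of the class in `src/scratch/gpt.py`. Its constructor arguments are not documented in this commit.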
best_model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:986d77152b04a8cbbeab3062bffae51219b5d2df8c9f510a225f2225d05ea212
+ size 69389410
checkpoint_epoch0_step500.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b0f3e26125fc581e0366dc9842a5d00bcad86df315d2a8d84fa1aa0d35f01e9c
+ size 157848890
checkpoint_latest.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7ecb31e6d101d01dd412094d5bf479572332867361231ed35737c186db9b60ad
+ size 157847482
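
Note on verifying downloads: each pointer above records a SHA-256 `oid` and a byte `size`, so a downloaded checkpoint can be checked against them. The sketch below is illustrative only. The function names are hypothetical, and the pointer text is the `best_model.pt` entry from this commit with the diff `+` markers removed.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"version": fields["version"], "algo": algo,
            "digest": digest, "size": int(fields["size"])}


def verify_download(path, pointer, chunk_size=1 << 20):
    """Return True if the file's size and hash match the pointer."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as fh:
        # Read in chunks so large checkpoints are not held in memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer for best_model.pt, copied from the diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:986d77152b04a8cbbeab3062bffae51219b5d2df8c9f510a225f2225d05ea212
size 69389410
"""
```

For example, `verify_download("best_model.pt", parse_lfs_pointer(POINTER))` should return `True` for an intact download of that file.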