GPT-1900
Part of a collection of pre-1900 LLMs for physics reasoning. The RL models in the collection are physics-only; use the SFT model for general chat, and tune the temperature to roughly 0.6-0.7.
A 3.29B parameter language model trained exclusively on pre-1900 English text. GPT-1900 knows nothing of the 20th century — no relativity, no quantum mechanics, no world wars. It thinks like a Victorian-era scholar, grounded in the science, literature, and worldview of its time.
Trained on ~22B tokens from digitized books and newspapers published before 1900, sourced from HathiTrust, Internet Archive, the British Library, and historical American newspapers.
Custom GPT with RoPE, QK-norm, ReLU² activation, value embeddings (ResFormer), and per-layer residual/skip scalars. Built with the nanochat framework.
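To make two of the less common architectural ingredients concrete, here is a minimal PyTorch sketch of the ReLU² MLP activation and of QK-norm (normalizing queries and keys before the attention dot product, shown here as plain L2 normalization). This is an illustration of the ideas only, not the nanochat implementation; all module names and shapes below are made up for the example.

```python
import torch
import torch.nn.functional as F
from torch import nn


class ReluSquaredMLP(nn.Module):
    """Feed-forward block using the ReLU^2 activation: relu(x) ** 2."""

    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.up = nn.Linear(dim, hidden, bias=False)
        self.down = nn.Linear(hidden, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down(F.relu(self.up(x)).square())


def qk_norm_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    """Causal attention with QK-norm: queries and keys are normalized first."""
    q = F.normalize(q, dim=-1)  # unit-norm queries
    k = F.normalize(k, dim=-1)  # unit-norm keys
    return F.scaled_dot_product_attention(q, k, v, is_causal=True)


# Tiny smoke test with hypothetical shapes (batch=1, heads=17, seq=8, head_dim=128).
x = torch.randn(1, 8, 2176)
print(ReluSquaredMLP(2176, 4 * 2176)(x).shape)
q = k = v = torch.randn(1, 17, 8, 128)
print(qk_norm_attention(q, k, v).shape)
```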
| Parameter | Value |
|---|---|
| Parameters | 3.29B |
| Layers | 34 |
| Hidden dim | 2176 |
| Attention heads | 17 (query) / 17 (kv) |
| Head dim | 128 |
| Context length | 2048 tokens |
| Vocab size | 32,768 (BPE, GPT-4 style split pattern) |
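The hyperparameters above map directly onto the `GPTConfig` used in the loading script below. As a rough sketch only, assuming nanochat-style field names (`sequence_len`, `vocab_size`, `n_layer`, `n_head`, `n_kv_head`, `n_embd`), a hand-built config would look like this; in practice, load the exact values from `meta_010507.json` as shown in the next snippet.

```python
from nanochat.gpt import GPTConfig

# Hand-built config matching the table above. Field names are assumed, not confirmed;
# prefer loading meta_010507.json (next snippet) for the authoritative values.
config = GPTConfig(
    sequence_len=2048,   # context length
    vocab_size=32768,    # BPE vocabulary size
    n_layer=34,          # transformer layers
    n_head=17,           # query heads
    n_kv_head=17,        # key/value heads
    n_embd=2176,         # hidden dimension (17 heads x 128 head dim)
)
```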
```python
import json

import torch

from nanochat.gpt import GPT, GPTConfig
from nanochat.tokenizer import RustBPETokenizer

# Load the tokenizer and the model configuration stored alongside the checkpoint.
tokenizer = RustBPETokenizer.from_directory("tokenizer")
with open("meta_010507.json") as f:
    meta = json.load(f)
config = GPTConfig(**meta["model_config"])

# Build the model on the meta device (no memory allocated yet), then materialize it on the GPU
# and initialize the weights before overwriting them with the checkpoint.
with torch.device("meta"):
    model = GPT(config)
model.to_empty(device="cuda")
model.init_weights()

# Load the checkpoint, stripping the torch.compile prefix from parameter names.
state_dict = torch.load("model_010507.pt", map_location="cuda")
state_dict = {k.removeprefix("_orig_mod."): v for k, v in state_dict.items()}
model.load_state_dict(state_dict, strict=True, assign=True)
model.eval()

# Encode a prompt with the BOS token prepended and stream generated tokens.
bos = tokenizer.get_bos_token_id()
tokens = tokenizer.encode("The luminiferous aether", prepend=bos)
with torch.amp.autocast(device_type="cuda", dtype=torch.bfloat16):
    for token in model.generate(tokens, max_tokens=200, temperature=0.8):
        print(tokenizer.decode([token]), end="", flush=True)
```
Dependencies:

```
torch>=2.9
tiktoken
rustbpe
```