# LeWorld Memory Architecture 🧠⚡

A CPU-inspired hierarchical neural architecture where 3 Small LeWorld Models (SLMs) compete to find the most useful memory for 1 Big LeWorld Model (BLM) to predict the next world state.
## Architecture

| Component | Parameters | Role |
|---|---|---|
| Artificial Memory | 21K | Bit-level storage (64K words × 32 bits) + learned bit encoder/decoder |
| SLM-0 | 745K | State → memory address range |
| SLM-1 | 745K | State → memory address range |
| SLM-2 | 745K | State → memory address range |
| BLM | 11.2M | SLM selector [1,0,1] + next-state predictor + info requester |
| Total | 13.5M | |
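The memory component can be sketched as a plain bit array read by address range. This is a minimal illustration of the storage semantics only; the `bits` layout and the `read_range`/`write` helpers are assumptions for this sketch, not the repo's API, and the learned bit encoder/decoder is omitted.

```python
import numpy as np

class ArtificialMemory:
    """Bit-level storage: 64K words x 32 bits, addressed like RAM."""

    def __init__(self, num_words=65536, word_bits=32):
        # Each word is a row of 0/1 bits; a learned encoder/decoder
        # (not shown) would map these bits to and from embeddings.
        self.bits = np.zeros((num_words, word_bits), dtype=np.uint8)

    def read_range(self, start, end):
        # READ(addr_range): return the bit rows in [start, end).
        return self.bits[start:end]

    def write(self, addr, word):
        self.bits[addr] = word

mem = ArtificialMemory()
mem.write(3, np.ones(32, dtype=np.uint8))
chunk = mem.read_range(0, 8)   # 8 words x 32 bits
print(chunk.shape)             # (8, 32)
print(int(chunk[3].sum()))     # 32
```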
## Key Ideas

- CPU-Style Memory: actual bit-level storage (64K × 32-bit words), accessed by address ranges just like RAM
- Product-Key Addressing: SLMs output addresses by predicting a high byte (256 choices) + a low byte (256 choices) = 65K addresses with only 512 logits
- Binary SLM Routing: BLM selects which SLMs to trust via a Straight-Through Sigmoid: hard [1,0,1] in the forward pass, differentiable in the backward pass
- Active Information Request: BLM generates "what do I need next?" queries that modulate SLM memory search at the next timestep
- 3-Phase Training: pre-train → joint end-to-end → info-request refinement with a paired-branch reward
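The product-key trick above can be sketched in a few lines: 256 + 256 logits factorize a 65,536-entry address space. The variable names and the greedy (argmax) decode are assumptions for illustration; the actual model may sample or use a range rather than a single address.

```python
import numpy as np

rng = np.random.default_rng(0)

# An SLM address head emits only 512 logits:
# the first 256 score the high byte, the last 256 score the low byte.
logits = rng.standard_normal(512)
high_logits, low_logits = logits[:256], logits[256:]

# Greedy decode: pick one high byte and one low byte, then combine.
high = int(np.argmax(high_logits))   # 0..255
low = int(np.argmax(low_logits))     # 0..255
addr = (high << 8) | low             # 0..65535

print(high, low, addr)
```

The payoff is output size: a flat softmax over all addresses would need 65,536 logits, while the factorized head needs only 512.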
## Data Flow

```
          ┌───────────────────────────────┐
          │       ARTIFICIAL MEMORY       │
          │  [0][1][0][1]...[1][0][1][0]  │
          │    64K words × 32 bits each   │
          └──────────────┬────────────────┘
                         │ READ(addr_range)
       ┌─────────────────┼─────────────────┐
┌──────▼───────┐  ┌──────▼───────┐  ┌──────▼───────┐
│    SLM-0     │  │    SLM-1     │  │    SLM-2     │
│    (745K)    │  │    (745K)    │  │    (745K)    │
│  past_state  │  │  past_state  │  │  past_state  │
│  curr_state  │  │  curr_state  │  │  curr_state  │
│  character   │  │  character   │  │  character   │
│    → addr    │  │    → addr    │  │    → addr    │
└──────┬───────┘  └──────┬───────┘  └──────┬───────┘
       │                 │                 │
       └────────► BLM (11.2M) ◄───────────┘
                  mask = [1, 0, 1]
                         │
                         ├─► next_state prediction
                         └─► "what info do I need next?"
```
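The `mask = [1, 0, 1]` step in the diagram comes from the straight-through sigmoid mentioned under Key Ideas. A minimal PyTorch sketch of that estimator, assuming the standard `hard + soft - soft.detach()` formulation (the function name and example logits are hypothetical):

```python
import torch

def straight_through_mask(logits):
    """Hard binary mask in the forward pass, sigmoid gradient in the backward pass."""
    soft = torch.sigmoid(logits)
    hard = (soft > 0.5).float()
    # Forward value equals `hard`; gradients flow through `soft` only.
    return hard + soft - soft.detach()

# One routing decision over the 3 SLMs.
logits = torch.tensor([2.0, -1.0, 0.5], requires_grad=True)
mask = straight_through_mask(logits)
print(mask.tolist())              # [1.0, 0.0, 1.0]

# The mask is hard, yet the routing logits still receive gradients.
mask.sum().backward()
print(logits.grad is not None)    # True
```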
## Files

| File | Description |
|---|---|
| `leworld_architecture.py` | All model definitions: Memory, SLM, BLM, full system (~990 lines) |
| `leworld_training.py` | 3-phase training pipeline, data generation, evaluation (~820 lines) |
| `PLAN.md` | Complete design document with literature references |
## Quick Start

```python
from leworld_architecture import LeWorldSystem, MemoryConfig, SLMConfig, BLMConfig
from leworld_training import run_training, TrainingConfig

system = LeWorldSystem(MemoryConfig(), SLMConfig(), BLMConfig())
metrics = run_training(system, TrainingConfig())
```
## Literature Foundation

See `PLAN.md` for the design document and literature references.
## Verified Results (demo run)

- Phase 1: SLM loss 12.87 → 7.13, BLM loss 0.39 → 0.33
- Phase 2: routing becomes diverse; SLM usage: [0.72, 0.79, 0.67]
- Phase 3: info-request improves predictions by 19.5 loss units vs. baseline
- Final: MSE = 0.36, routing entropy = 0.70
- Per-step MSE: [0.64, 0.44, 0.31, 0.23, 0.19] (improves over time)
- Routing patterns: [1,0,1] → [0,1,1] → [1,1,1] → [1,1,0] → [0,1,0]