| # LeWorld Memory Architecture 🧠⚡ |
|
|
| A CPU-inspired hierarchical neural architecture in which **3 Small LeWorld Models (SLMs)** compete to find the most useful memory for **1 Big LeWorld Model (BLM)**, which uses it to predict the next world state. |
|
|
| ## Architecture |
|
|
| | Component | Parameters | Role | |
| |-----------|-----------|------| |
| | **Artificial Memory** | 21K | Bit-level storage (64K words × 32 bits) + learned bit encoder/decoder | |
| | **SLM-0** | 745K | State → memory address range | |
| | **SLM-1** | 745K | State → memory address range | |
| | **SLM-2** | 745K | State → memory address range | |
| | **BLM** | 11.2M | SLM selector `[1,0,1]` + next-state predictor + info requester | |
| | **Total** | **13.5M** | | |
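The storage row of the table can be pictured with a few lines of plain Python. This is a minimal sketch, not the repository's `Memory` class: the helper names (`make_memory`, `write_word`, `read_range`, `word_to_bits`, `bits_to_word`) are hypothetical, wrap-around at the last address is an assumption, and the learned bit encoder/decoder is left out entirely.

```python
NUM_WORDS = 64 * 1024            # 65,536 words, so an address fits in 16 bits
WORD_BITS = 32
WORD_MASK = (1 << WORD_BITS) - 1


def make_memory():
    """Allocate zeroed storage, one integer per 32-bit word."""
    return [0] * NUM_WORDS


def write_word(memory, addr, value):
    """Store a value at one address, truncated to 32 bits."""
    memory[addr] = value & WORD_MASK


def read_range(memory, start, length):
    """Read `length` consecutive words; wrapping past the end is an assumption."""
    return [memory[(start + i) % NUM_WORDS] for i in range(length)]


def word_to_bits(word):
    """Unpack a word into 32 bits, most significant bit first."""
    return [(word >> shift) & 1 for shift in range(WORD_BITS - 1, -1, -1)]


def bits_to_word(bits):
    """Pack a most-significant-first bit list back into an integer."""
    word = 0
    for bit in bits:
        word = (word << 1) | bit
    return word


memory = make_memory()
write_word(memory, 0x1234, 0b1010)
words = read_range(memory, 0x1233, 3)   # addresses 0x1233..0x1235
bits = word_to_bits(words[1])
print(words)       # [0, 10, 0]
print(bits[-4:])   # [1, 0, 1, 0]
```

In the real system an SLM would supply `start` and `length`, and the bit vectors would pass through the learned decoder before reaching the BLM.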
|
|
| ## Key Ideas |
|
|
| 1. **CPU-Style Memory**: Actual bit-level storage (64K × 32-bit words), accessed by address ranges, just like RAM |
| 2. **Product-Key Addressing**: Each SLM predicts an address as a high byte (256 choices) and a low byte (256 choices), covering 256 × 256 = 65,536 addresses with only 512 logits |
| 3. **Binary SLM Routing**: BLM selects which SLMs to trust via Straight-Through Sigmoid: a hard mask such as `[1,0,1]` in the forward pass, differentiable in the backward pass |
| 4. **Active Information Request**: BLM generates "what do I need next?" queries that modulate SLM memory search at the next timestep |
| 5. **3-Phase Training**: Pre-train → Joint end-to-end → Info-request refinement with paired-branch reward |
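The product-key idea in item 2 comes down to a shift and an OR. The sketch below is illustrative only: `product_key_address` and `top_k_addresses` are hypothetical names, and whether the SLMs take an argmax, sample, or keep several candidates is not stated here. The top-k variant scores a candidate by summing its two sub-key logits, in the spirit of Product Key Memory.

```python
import heapq

BYTE_CHOICES = 256


def _check(high_logits, low_logits):
    if len(high_logits) != BYTE_CHOICES or len(low_logits) != BYTE_CHOICES:
        raise ValueError("expected 256 logits for each byte")


def product_key_address(high_logits, low_logits):
    """Pick the best high byte and low byte, then join them into a 16-bit address."""
    _check(high_logits, low_logits)
    high = max(range(BYTE_CHOICES), key=high_logits.__getitem__)
    low = max(range(BYTE_CHOICES), key=low_logits.__getitem__)
    return (high << 8) | low


def top_k_addresses(high_logits, low_logits, k=2):
    """Best k addresses among the k x k combinations of the top sub-keys."""
    _check(high_logits, low_logits)
    top_high = heapq.nlargest(k, range(BYTE_CHOICES), key=high_logits.__getitem__)
    top_low = heapq.nlargest(k, range(BYTE_CHOICES), key=low_logits.__getitem__)
    scored = [
        (high_logits[h] + low_logits[l], (h << 8) | l)
        for h in top_high
        for l in top_low
    ]
    return [addr for _, addr in heapq.nlargest(k, scored)]


# Toy logits: 512 numbers in total address 65,536 locations.
high_logits = [0.0] * BYTE_CHOICES
low_logits = [0.0] * BYTE_CHOICES
high_logits[0x12], high_logits[0x13] = 5.0, 4.0
low_logits[0x34], low_logits[0x35] = 3.0, 1.0

addr = product_key_address(high_logits, low_logits)
print(hex(addr))                                                  # 0x1234
print([hex(a) for a in top_k_addresses(high_logits, low_logits)])  # ['0x1234', '0x1334']
```

Searching only the k × k grid of top sub-keys is what keeps the lookup cheap: the full 65,536-way score table is never materialized.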
|
|
| ## Data Flow |
|
|
| ``` |
|               ┌─────────────────────────────┐              |
|               │      ARTIFICIAL MEMORY      │              |
|               │ [0][1][0][1]...[1][0][1][0] │              |
|               │  64K words × 32 bits each   │              |
|               └──────────────┬──────────────┘              |
|                              │ READ(addr_range)            |
|         ┌────────────────────┼────────────────────┐        |
|  ┌──────▼──────┐      ┌──────▼──────┐      ┌──────▼──────┐ |
|  │    SLM-0    │      │    SLM-1    │      │    SLM-2    │ |
|  │   (745K)    │      │   (745K)    │      │   (745K)    │ |
|  │ past_state  │      │ past_state  │      │ past_state  │ |
|  │ curr_state  │      │ curr_state  │      │ curr_state  │ |
|  │ character.  │      │ character.  │      │ character.  │ |
|  │   → addr    │      │   → addr    │      │   → addr    │ |
|  └──────┬──────┘      └──────┬──────┘      └──────┬──────┘ |
|         │                    │                    │        |
|         └─────────────► BLM (11.2M) ◄─────────────┘        |
|                      mask = [1, 0, 1]                      |
|                      → next_state prediction               |
|                      → "what info do I need next?"         |
| ``` |
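The `mask = [1, 0, 1]` step in the diagram relies on a straight-through estimator. Here is a dependency-free sketch of the idea, with the backward pass written out by hand. The function names and the 0.5 threshold are assumptions for illustration, and the actual BLM presumably relies on autograd rather than explicit gradients.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def st_sigmoid_forward(logits, threshold=0.5):
    """Forward pass: hard 0/1 mask from thresholded sigmoid probabilities."""
    probs = [sigmoid(x) for x in logits]
    mask = [1 if p > threshold else 0 for p in probs]
    return mask, probs


def st_sigmoid_backward(probs, grad_mask):
    """Backward pass: treat the mask as if it were the soft sigmoid.

    The true derivative of a threshold is zero almost everywhere, so the
    straight-through trick substitutes d(sigmoid)/d(logit) = p * (1 - p).
    """
    return [g * p * (1.0 - p) for g, p in zip(grad_mask, probs)]


logits = [2.0, -1.0, 0.5]                 # one routing logit per SLM
mask, probs = st_sigmoid_forward(logits)
grad_logits = st_sigmoid_backward(probs, grad_mask=[1.0, 1.0, 1.0])

print(mask)   # [1, 0, 1]
```

Every logit receives a nonzero gradient, including the one whose SLM was masked out, so a currently ignored SLM can still be selected later in training. In autograd frameworks the same effect is commonly written as `hard + soft - soft.detach()`.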
|
|
| ## Files |
|
|
| | File | Description | |
| |------|-------------| |
| | `leworld_architecture.py` | All model definitions: Memory, SLM, BLM, full system (~990 lines) | |
| | `leworld_training.py` | 3-phase training pipeline, data generation, evaluation (~820 lines) | |
| | `PLAN.md` | Complete design document with literature references | |
|
|
| ## Quick Start |
|
|
| ```python |
| from leworld_architecture import LeWorldSystem, MemoryConfig, SLMConfig, BLMConfig |
| from leworld_training import run_training, TrainingConfig |
| |
| # Build system |
| system = LeWorldSystem(MemoryConfig(), SLMConfig(), BLMConfig()) |
| |
| # Train (3 phases: pre-train → joint → refine) |
| metrics = run_training(system, TrainingConfig()) |
| ``` |
|
|
| ## Literature Foundation |
|
|
| | Paper | What we borrowed | |
| |-------|-----------------| |
| | [Gumbel-Softmax](https://arxiv.org/abs/1611.01144) | Straight-Through sigmoid for binary routing | |
| | [Switch Transformers](https://arxiv.org/abs/2101.03961) | Gate-value scaling, load balance loss | |
| | [Product Key Memory](https://arxiv.org/abs/1907.05242) | Address decomposition into sub-keys | |
| | [LM2](https://arxiv.org/abs/2502.06049) | LSTM-style memory gates | |
| | [NAMM](https://arxiv.org/abs/2410.13166) | Binary memory eviction | |
| | [ProactAgent](https://arxiv.org/abs/2604.20572) | Paired-branch reward for retrieval decisions | |
| | [Mamba](https://arxiv.org/abs/2312.00752) | Explicit state maintenance | |
|
|
| ## Verified Results (demo run) |
|
|
| ``` |
| Phase 1: SLM loss 12.87 → 7.13, BLM loss 0.39 → 0.33 |
| Phase 2: Routing becomes diverse, SLM usage: [0.72, 0.79, 0.67] |
| Phase 3: Info-request improves predictions by 19.5 loss units vs baseline |
| |
| Final: MSE=0.36, Routing entropy=0.70 |
| Per-step MSE: [0.64, 0.44, 0.31, 0.23, 0.19] (improves over time) |
| Routing patterns: [1,0,1] → [0,1,1] → [1,1,1] → [1,1,0] → [0,1,0] |
| ``` |
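The Phase 3 line compares predictions made with the info request against a baseline without it. One plausible reading of a paired-branch reward is sketched below, assuming the reward is simply the baseline error minus the error with the request active. The exact definition used in `leworld_training.py` is not given here, and the function names and toy numbers are illustrative only.

```python
def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def paired_branch_reward(target, pred_with_request, pred_without_request):
    """Positive when the info request lowered the prediction error.

    Both branches see the same state and target; only the presence of the
    request differs, so the difference isolates its contribution.
    """
    return mse(pred_without_request, target) - mse(pred_with_request, target)


# Toy example (made-up numbers, not results from the demo run).
target = [1.0, 2.0]
reward = paired_branch_reward(
    target,
    pred_with_request=[1.0, 2.0],      # perfect prediction, MSE 0.0
    pred_without_request=[0.0, 2.0],   # one element off by 1, MSE 0.5
)
print(reward)   # 0.5
```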
|
|