---
license: apache-2.0
datasets:
- karpathy/fineweb-edu-100B-gpt2-token-shards
language:
- en
- ja
- es
metrics:
- accuracy
base_model:
- Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled
new_version: Drjkedwards/Recursive-Transformer-Model
library_name: transformers
---
# Model Card for Recursive Transformer Model (RTM) / ERS PyTorch Implementation
This is the official PyTorch implementation of the **Recursive Transformer Model (RTM)**, a novel architecture that augments standard Transformer-based systems with **recursive memory reconsideration**, **temporal decay mechanisms**, and **Persistent Memory Logic Loops (PMLL)**. It addresses "nostalgic incorrectness" (the tendency of stateless AI to retain outdated or contradictory beliefs) by maintaining coherent, self-correcting state across inference sessions. The production-grade reference implementation is the **Enhanced Reconsideration System (ERS)** library, which includes PyTorch components for embeddings, lattice-based tensor routing, multi-petal attention, and knowledge-graph integration.
The Kaggle-hosted PyTorch model provides the core RTM/ERS runtime (including `PMLLLattice`, `MemoryBlock`, temporal decay, consensus, and contradiction detection) for integration with any LLM/transformer stack. It is **not** a standalone pretrained language model but a stateful memory layer/framework.
---
## Model Details
### Model Description
The Recursive Transformer Model (RTM) extends the classic Transformer architecture with:
- **Adaptive temporal decay** on memory confidence.
- **Multi-dimensional consensus** via embedding-space geometry and knowledge graphs.
- **Vector-based contradiction detection** with integrated rewrite capabilities.
- **Persistent Memory Logic Loops (PMLL)**: a lattice-based DAG for compressed, low-rank tensor routing and recursive passes over memory slots.
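The vector-based contradiction detection listed above can be sketched in a few lines: two memories whose embeddings are highly similar (above a threshold τ_sim) but whose claims point in opposite directions are candidates for rewrite. This is an illustrative sketch only; the function name, the `polarity` flag, and the default threshold are assumptions, not the ERS API.

```python
# Illustrative sketch of vector-based contradiction detection.
# `contradiction_score`, `polarity_*`, and `tau_sim=0.8` are hypothetical;
# ERS ships its own implementation.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def contradiction_score(emb_a, emb_b, polarity_a, polarity_b, tau_sim=0.8):
    """High topical similarity + opposite polarity suggests a contradiction."""
    sim = cosine(emb_a, emb_b)
    if sim >= tau_sim and polarity_a != polarity_b:
        return sim  # closer topics => stronger conflict signal
    return 0.0
```

A score above zero would flag the memory pair for the reconsideration (and possibly rewrite) stage.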
Key innovations solve the stateless limitation of standard transformers by enabling iterative, multi-pass reconsideration of beliefs during inference. The Enhanced Reconsideration System (ERS) is the complete, production-ready Python/PyTorch reference implementation.
- **Developed by:** Dr. Josef “Q.” Edwards (Josef Kurk Edwards / josefedwards / drQedwards), University of Colorado Boulder
- **Funded by [optional]:** U.S. Department of Defense (funder identifier 100000005)
- **Shared by [optional]:** Josef Edwards (via Kaggle and GitHub)
- **Model type:** Recursive Transformer extension / stateful memory framework (PMLL + ERS)
- **Language(s) (NLP):** Language-agnostic (works with any text/embedding-based input; primarily demonstrated on English factual/knowledge-base tasks)
- **License:** MIT (see ERS repository)
- **Finetuned from model [optional]:** Not finetuned; augments any base Transformer (integrates with sentence-transformers, LangChain, etc.)
### Model Sources [optional]
- **Repository:** [Kaggle Model](https://www.kaggle.com/models/josefedwards/recursive-transformer-model/pyTorch) • [GitHub ERS (primary implementation)](https://github.com/drqedwards/ERS) • [GitHub PMLL_archive](https://github.com/drqedwards/PMLL_archive)
- **Paper:** Edwards, J. K. (2025). *The Recursive Transformer Model: Architecture, Theory, and Implementation with Persistent Memory Logic Loops*. TechRxiv. DOI: 10.36227/techrxiv.176118936.69886233/v1 (October 23, 2025)
- **Demo [optional]:** See ERS README quick-start example (async memory reconsideration loop)
## Uses
### Direct Use
Use as a drop-in memory layer for any Transformer/LLM pipeline:
- Add factual or conversational memories.
- Run recursive reconsideration loops (temporal decay → consensus → contradiction detection → optional rewrite).
- Persist state across sessions via JSON + safetensors.
Ideal for agents, chatbots, or knowledge-intensive applications that require long-term coherence.
### Downstream Use [optional]
- Integrate with LangChain agents or any LLM stack via Graphiti/Mem0 knowledge graphs.
- Extend base models (e.g., Llama, Mistral) with stateful recursive passes.
- Use in production AI systems needing self-correction and belief updating.
### Out-of-Scope Use
- Not intended as a standalone generative LLM.
- Not suitable for real-time low-latency inference without hardware acceleration (multiple recursive passes add compute).
- Avoid use in safety-critical systems without additional ethical/guardrail layers (rewrites can be LLM-guided).
## Bias, Risks, and Limitations
- **Technical limitations:** Recursive loops increase inference-time compute; performance depends on embedding quality and KG backend (Neo4j recommended for Graphiti).
- **Sociotechnical risks:** Automated memory rewrites could propagate or amplify biases present in the underlying LLM or knowledge graph. Contradiction detection relies on embedding geometry and may miss subtle nuances.
- **Nostalgic incorrectness mitigation:** The core goal is to *reduce* outdated beliefs, but incorrect source data or poor consensus thresholds can still lead to erroneous updates.
### Recommendations
Users should:
- Monitor rewrite logs and confidence deltas.
- Use high-quality, verified knowledge graphs.
- Apply domain-specific safety policies before committing rewrites.
- Test with synthetic contradictory memory scenarios to validate behavior.
## How to Get Started with the Model
```python
# Via Kaggle (PyTorch model) or directly from the ERS GitHub repository.
# Install dependencies (from the ERS README):
#   pip install torch sentence-transformers safetensors mem0-ai graphiti-core langchain langchain-community
import asyncio

from ERS import EnhancedReconsiderationSystem, MemoryBlock, ERSPromise  # or load from Kaggle PyTorch weights

async def main():
    ers = EnhancedReconsiderationSystem()  # loads saved state if present
    await ers.add_memory("Paris is the capital of France")
    await ers.add_memory("Paris is the largest city in France")  # deliberately contradictory
    await ers.reconsider_deferred()
    await ers.recursive_loop_check()  # performs RTM-style multi-pass reconsideration
    await ers.close()

asyncio.run(main())
```
Full usage and configuration in the [ERS GitHub README](https://github.com/drqedwards/ERS). The Kaggle PyTorch model loads the core `PMLLLattice` and related tensors.
## Training Details
### Training Data
None (this is an architectural extension/framework, not a pretrained LLM). It operates on top of any Transformer embeddings (e.g., via `sentence-transformers`). Memory content is user-provided or agent-generated.
### Training Procedure
#### Preprocessing [optional]
Memory blocks are created with embeddings (via sentence-transformers), timestamps, confidence scores, and SHA-256 hashes. Optional KG indexing via Graphiti/Mem0.
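The memory-block shape described above (embedding, timestamp, confidence, SHA-256 content hash) can be sketched as a small dataclass. Field names here are assumptions for illustration, not the exact ERS `MemoryBlock` definition.

```python
# Illustrative memory-block structure; not the ERS MemoryBlock API.
import hashlib
import time
from dataclasses import dataclass, field

@dataclass
class MemoryBlockSketch:
    text: str
    embedding: list            # e.g. produced by sentence-transformers
    confidence: float = 1.0    # decays over time (see temporal decay)
    timestamp: float = field(default_factory=time.time)

    @property
    def content_hash(self) -> str:
        """SHA-256 over the raw text, for deduplication and tamper detection."""
        return hashlib.sha256(self.text.encode("utf-8")).hexdigest()
```

The hash gives each memory a stable identity even as its confidence and graph links evolve across reconsideration passes.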
#### Training Hyperparameters
- **Training regime:** Not applicable (no end-to-end training). Runtime inference uses PyTorch (fp32/bf16 supported via torch).
- Configuration options (RTM integration): `passes: 2`, `early_stop_cosine_delta: 0.002`, `max_rewrites_per_slot: 1`, `decay_alpha: 0.95`, adaptive λ decay rates, similarity threshold τ_sim, etc. (fully configurable in ERS).
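The `passes` and `early_stop_cosine_delta` options above imply a bounded multi-pass loop with convergence-based early stopping. The sketch below is one plausible reading of that loop, not the ERS implementation; `reconsider` stands in for whatever per-pass update ERS applies.

```python
# Sketch of bounded multi-pass reconsideration with cosine early stopping.
# `reconsider` is a placeholder for the actual per-pass update.
import numpy as np

def recursive_passes(emb: np.ndarray, reconsider,
                     passes: int = 2,
                     early_stop_cosine_delta: float = 0.002) -> np.ndarray:
    for _ in range(passes):
        new_emb = reconsider(emb)
        cos = np.dot(emb, new_emb) / (np.linalg.norm(emb) * np.linalg.norm(new_emb))
        emb = new_emb
        if 1.0 - cos < early_stop_cosine_delta:
            break  # converged: this pass barely moved the representation
    return emb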
#### Speeds, Sizes, Times [optional]
Real-time performance demonstrated in ERS (production-grade). Exact throughput depends on hardware, number of recursive passes, and KG backend. Lattice uses low-rank compression for scalability.
## Evaluation
### Testing Data, Factors & Metrics
No public benchmark datasets or quantitative results published in the preprint. Evaluation is qualitative/conceptual via synthetic contradictory memory scenarios (e.g., Paris facts example) and convergence metrics (confidence delta, rewrite count, cosine similarity shifts).
#### Factors
- Memory age, source quality, domain volatility, embedding similarity.
#### Metrics
- Nostalgic Incorrectness (NI) metric defined in paper.
- Consensus score, contradiction score, confidence update delta.
### Results
[More Information Needed] — Paper focuses on theoretical framework and architectural feasibility rather than large-scale empirical benchmarks. ERS demonstrates real-time recursive reconsideration.
#### Summary
The model successfully maintains coherent state and resolves contradictions in controlled memory scenarios.
## Model Examination [optional]
Interpretability is built in: per-pass logs record embedding shifts, confidence changes, rewrite proposals, and KG updates. Visualization of memory-graph evolution is a planned roadmap feature.
## Environmental Impact
Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
- **Hardware Type:** [More Information Needed] (tested on standard CPU/GPU with PyTorch)
- **Hours used:** [More Information Needed]
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]
## Technical Specifications [optional]
### Model Architecture and Objective
- Base: Transformer stack with augmented embedding layer and reconsideration head.
- Key equations: temporal decay \( \text{conf}_i(t) = \text{conf}_i(0) \cdot e^{-\lambda_i (t - t_i)} \cdot \dots \), consensus scoring, integrated confidence update, PMLL lattice (DAG with quantization and low-rank compression).
- Objective: Stateful, self-correcting memory across sessions.
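The exponential-decay factor of the temporal-decay equation above can be computed directly (the equation's remaining factors, elided in the card, are omitted here too). This is a numeric sketch of that one term, not the full integrated confidence update.

```python
# Numeric sketch of the temporal-decay term:
#   conf_i(t) = conf_i(0) * exp(-lambda_i * (t - t_i)) * ...
# Only the exponential factor is modeled here.
import math

def decayed_confidence(conf0: float, lam: float, t: float, t_i: float) -> float:
    """Confidence after elapsed time (t - t_i) at per-memory decay rate lam."""
    return conf0 * math.exp(-lam * (t - t_i))
```

A higher λ_i makes a memory's confidence fade faster, which is how volatile-domain facts can be scheduled for earlier reconsideration than stable ones.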
### Compute Infrastructure
#### Hardware
Standard PyTorch-compatible (CPU/GPU).
#### Software
Python 3.8+, PyTorch, sentence-transformers, safetensors, mem0-ai, graphiti-core, LangChain.
## Citation [optional]
**BibTeX:**
```bibtex
@article{edwards2025recursive,
author = {Edwards, Josef Kurk},
title = {The Recursive Transformer Model: Architecture, Theory, and Implementation with Persistent Memory Logic Loops},
journal = {TechRxiv},
year = {2025},
month = {October},
doi = {10.36227/techrxiv.176118936.69886233/v1},
url = {https://www.techrxiv.org/users/856117/articles/1345789}
}
```
**APA:**
Edwards, J. K. (2025). The Recursive Transformer Model: Architecture, Theory, and Implementation with Persistent Memory Logic Loops. TechRxiv. https://doi.org/10.36227/techrxiv.176118936.69886233/v1
## Glossary [optional]
- **PMLL**: Persistent Memory Logic Loop — lattice-based memory compression and routing.
- **ERS**: Enhanced Reconsideration System — production Python/PyTorch library.
- **Nostalgic Incorrectness**: Retention of outdated/conflicting beliefs in stateless models.
## More Information [optional]
- Full paper and math: TechRxiv preprint.
- Live implementation: [ERS GitHub](https://github.com/drqedwards/ERS).
- Related work: Hybrid TRM-RTM model, PMLL P=NP proof paper (separate preprint).
## Model Card Authors [optional]
Compiled by Dr. Q (Josef Edwards) from public sources.
## Model Card Contact
Josef Edwards (Kaggle: josefedwards, GitHub: drqedwards, Email: joed6834@colorado.edu) or open an issue on the ERS repository.