---
language:
- en
license: apache-2.0
library_name: transformers
tags:
- rubirlm
- causal-lm
- base-model
- text-generation
- 1b
- moe
datasets:
- HuggingFaceFW/fineweb
- HuggingFaceH4/ultrachat_200k
pipeline_tag: text-generation
---

# RubiRLM-1B-Base

**RubiRLM-1B-Base** is a **1B-parameter base language model** released by **DevHunterAI**.

- **Model size:** 1B parameters
- **Training datasets:** FineWeb, UltraChat-200k
- **Model type:** base / pretrained language model

**Important:** This release is a **base model**. It can be used for prompt-based generation and experimental chat-style interaction, but it is **not an instruction-tuned chat assistant**.

## Architecture

![RubiRLM 1B Architecture](architecture.png)

**RubiRLM 1B** uses a recursive language modeling architecture with recurrent state flow, Mixture-of-Experts routing, and conditional block execution.

## Key Features

- **1B parameters**
- **Recursive Language Model (RLM)** architecture
- **10 recursive blocks**
- **d_model = 1024**
- **16 attention heads**
- **max sequence length = 2048**
- **6 recursive reasoning steps**
- **Mixture-of-Experts: 32 experts, top-1 routing**
- **Layer-skip router for conditional execution**
- **Packed execution support**
- **Tied token embedding and LM head**

## Training Data

This model was trained on a mixture of:

- **FineWeb**
- **UltraChat-200k**

## Intended Usage

This model is intended for:

- base language modeling research
- continued pretraining
- experimental prompt-based generation
- architecture experimentation with recursive and MoE-based language models

## Not Intended As

This release should **not** be treated as:

- a fully aligned assistant
- a safety-tuned production chatbot
- an instruction-following model with guaranteed conversational quality

## Loading

Because this repository includes custom model code, loading may require `trust_remote_code=True` depending on your workflow.
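A minimal loading sketch is shown below. The repo id `DevHunterAI/RubiRLM-1B-Base` is an assumption inferred from the model name and may differ from the actual published path; adjust it to the repository you are loading from.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

def load_rubirlm(repo_id: str = "DevHunterAI/RubiRLM-1B-Base"):
    """Load the tokenizer and model; repo_id is assumed, not confirmed."""
    # trust_remote_code=True allows transformers to execute the custom
    # model code shipped with the repository (RubiRLM.py and friends).
    tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)
    return tokenizer, model
```

Once loaded, the pair can be used for ordinary prompt-based generation via `model.generate(...)`; keep in mind that, as a base model, it continues text rather than following instructions.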
## Files

- `pytorch_model.bin`: exported RubiRLM weights
- `training_checkpoint.pt`: original training checkpoint
- `config.json`: Hugging Face-facing config
- `rubirlm_config.json`: full RubiRLM architecture config
- `RubiRLM.py`: model implementation
- `xqs_moe.py`, `xqs_stack.py`, `x_quantum_sparse_ops.py`, `rubi_train_stack.py`: supporting code

## Notes

The exported weights were produced from the final training checkpoint and packaged for Hugging Face publication.
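For readers exploring the architecture, the top-1 Mixture-of-Experts routing listed under Key Features can be sketched in a few lines. This is a toy NumPy illustration with random weights and single-matrix "experts", not the actual implementation (which lives in `xqs_moe.py`); only the dimensions (d_model = 1024, 32 experts) come from the model card.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, n_tokens = 1024, 32, 4

# Router: a single linear projection from d_model to n_experts.
W_router = rng.standard_normal((d_model, n_experts)) * 0.02
# Toy experts: each is one linear layer (real MoE experts are small MLPs).
experts = [rng.standard_normal((d_model, d_model)) * 0.02 for _ in range(n_experts)]

x = rng.standard_normal((n_tokens, d_model))
logits = x @ W_router                      # (n_tokens, n_experts) router scores
chosen = logits.argmax(axis=-1)            # top-1: exactly one expert per token

# Softmax gate; each token's output is scaled by its chosen expert's gate.
gates = np.exp(logits - logits.max(axis=-1, keepdims=True))
gates = gates / gates.sum(axis=-1, keepdims=True)

out = np.empty_like(x)
for e in range(n_experts):
    mask = chosen == e                     # tokens routed to expert e
    if mask.any():
        out[mask] = (x[mask] @ experts[e]) * gates[mask, e][:, None]
```

Because routing is top-1, each token activates only one of the 32 experts per MoE layer, so the active parameter count per token is far below the full 1B total.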