# Ouroboros-1MContext-Gemma-270m

A 1M-token context extension of Google's Gemma 3 270M for episodic memory research.

## Architecture

| Parameter | Value |
|---|---|
| Model type | `Gemma3ForCausalLM` |
| Hidden size | 640 |
| Intermediate size | 2048 |
| Num layers | 18 |
| Num attention heads | 4 (GQA, 1 KV head) |
| Head dim | 256 |
| Vocab size | 262,144 |
| Max position embeddings | 1,048,576 (1M context) |
| Total parameters | 268,098,176 |
| VRAM (bfloat16) | 0.54 GB |
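The bfloat16 figure is the weight footprint alone, at two bytes per parameter; a quick arithmetic check (KV cache and activations at long context are extra):

```python
# Weights-only footprint in bfloat16: 2 bytes per parameter.
total_params = 268_098_176
print(f"{total_params * 2 / 1e9:.2f} GB")  # 0.54 GB
```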

## LoRA Adapter Configuration (for episodic memory)

| Parameter | Value |
|---|---|
| Rank | 4 |
| Alpha | 8 |
| Target modules | `gate_proj`, `up_proj`, `down_proj` |
| Params per adapter | 580,608 |
| Params per LoRA pair (one layer, one module) | 10,752 |
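As a minimal sketch, an equivalent adapter can be built with the `peft` library (the library choice and `task_type` are assumptions here; this does not reproduce the paper's per-memory adapter machinery):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained(
    "MonumentalSystems/Ouroboros-1MContext-Gemma-270m", torch_dtype="bfloat16"
)

# Rank-4 LoRA on the MLP projections, matching the table above.
# Each (layer, module) pair adds r * (d_in + d_out) params, e.g.
# gate_proj: 4 * (640 + 2048) = 10,752; x3 modules x18 layers = 580,608.
config = LoraConfig(
    r=4,
    lora_alpha=8,
    target_modules=["gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # trainable params: 580,608
```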

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load weights in bfloat16 (~0.54 GB).
model = AutoModelForCausalLM.from_pretrained(
    "MonumentalSystems/Ouroboros-1MContext-Gemma-270m", torch_dtype="bfloat16"
)
tokenizer = AutoTokenizer.from_pretrained("MonumentalSystems/Ouroboros-1MContext-Gemma-270m")
```
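A short smoke test (the prompt and generation settings are arbitrary):

```python
inputs = tokenizer("Episodic memory is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```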

## Citation

Used in "Continuous Memory: Zero-Forgetting Episodic Memory via Per-Memory LoRA Adapters" (ICLR 2026 submission).

W&B runs: `41uzevnn`, `oa8xe89e`, `2v01e4e9` (project: `symbiogenesis`)

## License

Apache 2.0 (following Gemma 3 license)
