Doc-to-LoRA: Learning to Instantly Internalize Contexts
Paper • 2602.15902
Sakana AI Doc-to-LoRA hypernetwork trained on the Qwen2.5-Omni thinker (dense, 7.6B parameters).

This is an early checkpoint (5K of 80K steps). At this stage the hypernetwork generates LoRA structure that modifies model behavior, but it does not yet encode specific facts from documents. Full training (80K Phase 1 steps + 20K Phase 2 steps) is needed for factual encoding.
```python
import torch

from ctx_to_lora.modeling.hypernet import ModulatedPretrainedModel

# Load the raw checkpoint state dict (weights_only=False: the file
# contains pickled objects beyond plain tensors).
state_dict = torch.load("checkpoint-5000/pytorch_model.bin", weights_only=False)
model = ModulatedPretrainedModel.from_state_dict(
    state_dict, train=False, use_sequence_packing=False
)

# Generate LoRA weights from the document and apply them to the base model.
model.internalize("Your document text here...")

# Now generate -- the model has internalized the document via LoRA.
```
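For intuition, here is a minimal sketch of what "generating LoRA weights" means: the hypernetwork emits low-rank factors `A` and `B` that add a delta to a frozen linear layer, `y = W x + (alpha/r) * B(A x)`. This is an illustrative toy, not the `ctx_to_lora` API; the dimensions, scaling, and variable names are assumptions.

```python
# Illustrative sketch (not the ctx_to_lora API): a hypernetwork-style
# low-rank update applied to a frozen linear layer.
import torch

torch.manual_seed(0)
d_in, d_out, r, alpha = 16, 16, 4, 8   # toy sizes; real ranks/dims differ
W = torch.randn(d_out, d_in)           # frozen base weight
A = torch.randn(r, d_in) * 0.01        # "generated" down-projection factor
B = torch.randn(d_out, r) * 0.01       # "generated" up-projection factor

x = torch.randn(d_in)
base = W @ x
adapted = base + (alpha / r) * (B @ (A @ x))  # LoRA-modified output
```

Because the delta is rank-`r`, the hypernetwork only has to predict `r * (d_in + d_out)` numbers per layer instead of a full `d_out * d_in` weight matrix.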
This is one component of the HyperMod memory system for Claudia.