# Doc-to-LoRA — NIAH Proof of Concept

A 127M-parameter Perceiver hypernetwork trained on top of Qwen/Qwen3.5-2B. It reads a document once, emits LoRA weight deltas, and lets the base LLM answer questions about that document without it ever appearing in the context window.
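The mechanism above can be sketched in a few lines. This is a minimal illustration, not this repo's implementation: the matrix shapes are assumed Qwen-like and the LoRA factors are random placeholders standing in for the hypernetwork's outputs (the real layout lives in `inference_example.py`).

```python
import torch

# Hedged sketch: how rank-8 LoRA deltas (rank=8, alpha=8.0, as in this repo)
# would modify a down_proj weight. A and B are random placeholders for the
# factors the hypernetwork predicts from the document.
rank, alpha = 8, 8.0
d_out, d_in = 2048, 5504          # hypothetical Qwen-like down_proj shape
W = torch.zeros(d_out, d_in)      # stand-in for the frozen base weight
A = torch.randn(rank, d_in) * 0.01   # hypernet-predicted "down" factor
B = torch.randn(d_out, rank) * 0.01  # hypernet-predicted "up" factor

# Merged weight: W' = W + (alpha / rank) * B @ A
W_adapted = W + (alpha / rank) * (B @ A)
print(W_adapted.shape)  # torch.Size([2048, 5504])
```

The base model then runs with `W_adapted` in place of `W`; no document tokens occupy the context window.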

Based on Doc-to-LoRA (Charakorn et al., 2026). Training uses KL context distillation (as in Cartridges) and token-init.
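The KL context distillation objective can be sketched as follows. This is a toy illustration under assumed shapes, with random logits standing in for real model outputs: the adapted model (document absent) is trained to match the next-token distribution of a teacher pass that does see the document.

```python
import torch
import torch.nn.functional as F

vocab, seq = 151_936, 16  # assumed Qwen-sized vocab; toy sequence length
teacher_logits = torch.randn(seq, vocab)  # base model WITH the document in context
student_logits = torch.randn(seq, vocab)  # LoRA-adapted model WITHOUT the document

# KL(teacher || student), averaged over positions
loss = F.kl_div(
    F.log_softmax(student_logits, dim=-1),
    F.log_softmax(teacher_logits, dim=-1),
    log_target=True,
    reduction="batchmean",
)
print(loss.item() >= 0)  # KL divergence is non-negative
```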

![Training curves](curves.png)

## Results

| Metric | Value |
|---|---|
| Base model | Qwen/Qwen3.5-2B |
| Perceiver params | 127M |
| LoRA rank / alpha | 8 / 8.0 |
| Target module | `down_proj` |
| Training steps | 1,400 |
| Final CE loss | 1.3218 |
| Exact-match accuracy (NIAH) | 0.0% |
| Training context length | 32–256 tokens |

## Files

| File | Description |
|---|---|
| `hypernet.pt` | Perceiver weights plus the full config needed to rebuild the class |
| `inference_example.py` | Self-contained inference script (download and run) |
| `training_config.json` | Training hyperparameters |
| `curves.png` | Loss and accuracy curves |

## Quick start

```bash
pip install "transformers>=5.2.0" huggingface_hub torch
```

(The version specifier is quoted so the shell does not treat `>=` as a redirect.)

```python
import torch
from huggingface_hub import hf_hub_download

# Download the checkpoint and load it onto the GPU. weights_only=False is
# required because the checkpoint stores the config alongside the weights.
ckpt = torch.load(
    hf_hub_download("farpluto/doc-to-lora-niah-qwen3.5-2B", "hypernet.pt"),
    map_location="cuda",
    weights_only=False,
)
# See inference_example.py for the complete working script.
```

## Qwen3 note

Chain-of-thought reasoning is suppressed by appending `/no_think` to every query, and any residual `<think>` tokens are stripped from the generated output. Both techniques are harmless no-ops on non-Qwen3 models.
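The two techniques above can be sketched with a couple of small helpers. These function names are hypothetical (the actual logic is in `inference_example.py`); the regex assumes `<think>...</think>` spans may contain newlines.

```python
import re

def prepare_query(q: str) -> str:
    # Append Qwen3's soft switch to suppress chain-of-thought reasoning.
    # On non-Qwen3 models the suffix is just inert prompt text.
    return q + " /no_think"

def strip_think(text: str) -> str:
    # Remove any residual <think>...</think> spans (including empty ones)
    # that Qwen3 may still emit even in no-think mode.
    return re.sub(r"<think>.*?</think>\s*", "", text, flags=re.DOTALL).strip()

print(prepare_query("Where is the needle?"))
# Where is the needle? /no_think
print(strip_think("<think>\n\n</think>\n\nThe needle is in line 42."))
# The needle is in line 42.
```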

