BOB12311/tracelock-dream-ae

	---
	library_name: pytorch
	license: apache-2.0
	pipeline_tag: feature-extraction
	tags:
	- tracelock
	- dream
	- diffusion-language-model
	- activation-autoencoder
	- pytorch
	---

	# TraceLock Dream Activation Autoencoder

	This repository contains the projection autoencoder checkpoint used to reproduce TraceLock on Dream, as presented in the paper [The Path Matters: Learning a Token-Commitment Policy for Diffusion Language Models](https://huggingface.co/papers/2605.24697).

	Code: [https://github.com/BobSun98/TraceLock](https://github.com/BobSun98/TraceLock)

	TraceLock is a token-level acceptance policy for Dream-style masked diffusion generation. Dream proposes candidate tokens during the denoising loop, and TraceLock decides which positions should be locked now versus kept masked for later refinement.
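
The lock-or-keep-masked decision described above can be pictured with a small sketch. This is not the TraceLock policy itself: the fixed threshold stands in for the learned policy, and `MASK`, `lock_step`, and the per-position scores are hypothetical names chosen for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical marker for a position that is still masked.
MASK = None


def lock_step(
    sequence: List[Optional[str]],
    candidates: Dict[int, str],
    scores: Dict[int, float],
    threshold: float = 0.5,
) -> List[Optional[str]]:
    """Lock candidate tokens whose acceptance score clears the threshold.

    Assumption: a simple threshold replaces the learned acceptance policy.
    Positions that are not locked stay masked for a later denoising step.
    """
    updated = list(sequence)  # copy so the input sequence is not mutated
    for pos, token in candidates.items():
        if updated[pos] is MASK and scores.get(pos, 0.0) >= threshold:
            updated[pos] = token
    return updated


# Two masked positions, each with a proposed token and an acceptance score.
seq = ["The", MASK, MASK, "."]
cands = {1: "path", 2: "matters"}
accept_scores = {1: 0.9, 2: 0.3}

result = lock_step(seq, cands, accept_scores)
```

Here position 1 is locked and position 2 remains masked, so a later step can propose a different token for it.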

	## What This Checkpoint Is

	`best_val_loss.pt` is an activation autoencoder for Dream hidden states. It compresses the last three Dream hidden-state snapshots and two hidden-state deltas into compact features consumed by the TraceLock policy model.
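
One plausible reading of "three snapshots and two deltas" is that each delta is the difference between consecutive snapshots. The sketch below illustrates that reading on toy vectors; the definition of a delta and the helper name `consecutive_deltas` are assumptions, not taken from the TraceLock code.

```python
from typing import List, Sequence

Vector = List[float]


def consecutive_deltas(snapshots: Sequence[Vector]) -> List[Vector]:
    """Return element-wise differences between neighbouring snapshots.

    Assumption: a delta is (later snapshot - earlier snapshot).
    N snapshots yield N - 1 deltas, so three snapshots give two deltas.
    """
    return [
        [later - earlier for earlier, later in zip(prev, curr)]
        for prev, curr in zip(snapshots, snapshots[1:])
    ]


# Toy 2-dimensional "hidden states" for one position at three denoising steps.
snapshots = [[0.0, 1.0], [0.5, 1.0], [1.5, 0.0]]
deltas = consecutive_deltas(snapshots)
```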

	This checkpoint is not a text generation model and does not contain Dream model weights. Users still need to download Dream from its original repository:

	```text
	Dream-org/Dream-v0-Instruct-7B
	```

	## How It Is Used

	After downloading this repository into a TraceLock workspace, the expected local path is:

	```text
	$TRACELOCK_HOME/checkpoints/dream-ae-v1/best_val_loss.pt
	```
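
A minimal sketch of resolving that path and fetching the checkpoint into it, assuming `huggingface_hub` is installed and that `BOB12311/tracelock-dream-ae` is the repository id. The helper names are hypothetical, and the TraceLock repository may provide its own download instructions.

```python
import os
from pathlib import Path
from typing import Optional


def expected_checkpoint_path(home: Optional[str] = None) -> Path:
    """Build the checkpoint path under TRACELOCK_HOME (or an explicit root)."""
    root = home if home is not None else os.environ["TRACELOCK_HOME"]
    return Path(root) / "checkpoints" / "dream-ae-v1" / "best_val_loss.pt"


def download_checkpoint(home: Optional[str] = None) -> Path:
    """Download best_val_loss.pt into the expected directory.

    Assumption: the repo id below matches this model card's repository.
    """
    from huggingface_hub import hf_hub_download  # imported lazily on purpose

    target = expected_checkpoint_path(home)
    hf_hub_download(
        repo_id="BOB12311/tracelock-dream-ae",
        filename="best_val_loss.pt",
        local_dir=target.parent,
    )
    return target
```

Calling `download_checkpoint()` with `TRACELOCK_HOME` set should leave the file at the path shown above; `expected_checkpoint_path` alone can be used to check for an existing copy first.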

	TraceLock uses this checkpoint in two places:

	1. `generate_training_traces.sh`: projects Dream activations while building training traces.
	2. `train.sh` / evaluation: reconstructs the same projection stack expected by the TraceLock policy.

	## Architecture

	The released checkpoint was trained with:

	```json
	{
	  "d_model": 3584,
	  "d_hidden_bottleneck": 256,
	  "d_delta_bottleneck": 32,
	  "dropout": 0.1
	}
	```
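
To give a sense of scale, the arithmetic below works out the per-position feature width these settings would imply. It assumes each of the three snapshots is projected to `d_hidden_bottleneck`, each of the two deltas to `d_delta_bottleneck`, and that the results are concatenated; the concatenation is an assumption, not something this card states.

```python
config = {
    "d_model": 3584,
    "d_hidden_bottleneck": 256,
    "d_delta_bottleneck": 32,
    "dropout": 0.1,
}

N_HIDDEN = 3  # hidden-state snapshots per position
N_DELTA = 2   # hidden-state deltas per position

# Width of the uncompressed inputs: five vectors of size d_model.
raw_dim = (N_HIDDEN + N_DELTA) * config["d_model"]  # 5 * 3584 = 17920

# Width after projection, assuming the bottleneck outputs are concatenated.
compact_dim = (
    N_HIDDEN * config["d_hidden_bottleneck"]  # 3 * 256 = 768
    + N_DELTA * config["d_delta_bottleneck"]  # 2 * 32 = 64
)  # 832 in total

compression_ratio = raw_dim / compact_dim
print(f"{raw_dim} -> {compact_dim} features per position "
      f"(about {compression_ratio:.1f}x smaller)")
```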

	The exported projection state contains:

	- hidden-state normalization
	- delta-state normalization
	- hidden-state projection encoder
	- delta-state projection encoder
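
One way to see these four components in a downloaded checkpoint is to group its state keys by prefix. The sketch below assumes the file loads as a flat mapping of dotted parameter names; the actual layout and key names are not documented here, so the keys in `example_state` are purely hypothetical. Depending on the PyTorch version and the checkpoint contents, `torch.load` may need additional arguments.

```python
from collections import Counter
from typing import Any, Mapping


def load_checkpoint(path: str) -> Any:
    """Load the checkpoint on CPU (requires PyTorch; imported lazily)."""
    import torch

    return torch.load(path, map_location="cpu")


def group_by_prefix(state: Mapping[str, Any]) -> Counter:
    """Count entries per top-level prefix, e.g. 'encoder.0.weight' -> 'encoder'."""
    return Counter(key.split(".", 1)[0] for key in state)


# Hypothetical key names, used only to show the grouping.
example_state = {
    "hidden_norm.mean": 0,
    "hidden_norm.std": 0,
    "delta_norm.mean": 0,
    "hidden_encoder.0.weight": 0,
    "delta_encoder.0.weight": 0,
}
groups = group_by_prefix(example_state)
```

If the loaded object is nested (for example, a dict that wraps the state under another key), apply `group_by_prefix` to the inner mapping instead.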

	## Files

	- `best_val_loss.pt`: projection autoencoder checkpoint.
	- `config.json`: training/configuration metadata for this autoencoder run.
	- `data_stats.json`: basic sample count and batch metadata from the run.
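
A small check that a local copy of this repository is complete might look like the following. It assumes all three files sit in the same directory; the helper name `missing_files` is hypothetical.

```python
import tempfile
from pathlib import Path
from typing import List

EXPECTED_FILES = ("best_val_loss.pt", "config.json", "data_stats.json")


def missing_files(directory: str) -> List[str]:
    """Return the expected file names that are absent from the directory."""
    root = Path(directory)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


# Demo on a throwaway directory that lacks data_stats.json.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "best_val_loss.pt").write_bytes(b"")
    (Path(tmp) / "config.json").write_text("{}")
    demo_missing = missing_files(tmp)
```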

	## Citation

	If you use this checkpoint, please cite the following paper:

	```bibtex
	@misc{sun2026pathmatters,
	  title={The Path Matters: Learning a Token-Commitment Policy for Diffusion Language Models},
	  author={Bohang Sun and Max Zhu and Francesco Caso and Jindong Gu and Junchi Yu and Philip Torr and Pietro Liò and Jialin Yu},
	  year={2026},
	  eprint={2605.24697},
	  archivePrefix={arXiv},
	  primaryClass={cs.CL},
	  url={https://arxiv.org/abs/2605.24697}
	}
	```