CoT Oracle Paper Ablations And Baselines
Collection
All models used for my LessWrong post (8 items). The latest Adam oracle is generally recommended, or the checkpoint confusingly labelled "no DPO".
This repo is a ceselder re-upload of Adam Karvonen's `checkpoints_latentqa_cls_past_lens_addition_Qwen3-8B` activation-oracle adapter, so the collection can carry a detailed model card.
Key details recovered from the flattened overview table:

- Upstream checkpoint: `adamkarvonen/checkpoints_latentqa_cls_past_lens_addition_Qwen3-8B`
- Base model: `Qwen/Qwen3-8B`
- Hook layers (from `ao_config.json`): [9], [18], [27], corresponding to 25/50/75% depth
- Remaining values from the source table: 1, 4, 2, 1e-5, 16, 1, ~60M. Of these, 1e-5 is presumably the learning rate and ~60M the trainable adapter parameter count; the others are not labelled in the source.

This checkpoint's own `ao_config.json` lists the following dataset loaders:
- `past_lens` single-activation loader: num_train=100000, max_k_activations=1, directions=[past, future], max_length=512
- `past_lens` multi-activation loader: num_train=100000, max_k_activations=50, directions=[past, future], max_length=512
- `latentqa`: num_train=100000, max_window_size=3, position_types=[all, window]
- classification datasets, each with a single-token loader (max_window_size=1) and a multi-token loader (max_window_size=50): `geometry_of_truth`, `relations`, `sst2`, `md_gender`, `snli`, `ag_news`, `ner`, `tense`, `language_identification`, `singular_plural` (listed with num_train=0 in the checkpoint config)

Adam's local `count_training_data.py` comments document the closest default mixture as:
- Total: 1,027,328 samples, 66,469,521 tokens
- past_lens: 584,488 samples, 42,003,254 tokens
- classification (all): 378,000 samples, 18,407,443 tokens
- latentqa (all): 64,840 samples, 6,058,824 tokens

Sources: the checkpoint's `ao_config.json` plus Adam's local training-data counting script in the Activation Oracles reference code.
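As a small sanity check on the hook-layer depths quoted above, here is a minimal sketch. It assumes Qwen3-8B's published architecture of 36 transformer layers; the layer indices come from this checkpoint's `ao_config.json`.

```python
# Sanity-check sketch: hook layers [9], [18], [27] against 25/50/75% depth.
# Assumes Qwen3-8B has 36 transformer layers (its published config).
NUM_LAYERS = 36
hook_layers = [9, 18, 27]

# Fractional depth of each hooked layer within the network.
depths = [layer / NUM_LAYERS for layer in hook_layers]
print(depths)  # -> [0.25, 0.5, 0.75]
```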
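The per-dataset counts can also be cross-checked against the documented totals; this sketch simply sums the `count_training_data.py` figures quoted above.

```python
# Sketch: verify that the per-dataset (samples, tokens) counts reported
# by count_training_data.py sum to the documented mixture totals.
mixture = {
    "past_lens": (584_488, 42_003_254),
    "classification (all)": (378_000, 18_407_443),
    "latentqa (all)": (64_840, 6_058_824),
}

total_samples = sum(samples for samples, _ in mixture.values())
total_tokens = sum(tokens for _, tokens in mixture.values())
print(total_samples, total_tokens)  # -> 1027328 66469521
```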