Adam Qwen3-8B AO Reupload: LatentQA + Classification + Past Lens

This repo is ceselder's re-upload of Adam Karvonen's checkpoints_latentqa_cls_past_lens_addition_Qwen3-8B activation-oracle adapter, mirrored so the collection can carry a detailed model card.

What This Checkpoint Is

  • Source repo: adamkarvonen/checkpoints_latentqa_cls_past_lens_addition_Qwen3-8B
  • Base model: Qwen/Qwen3-8B
  • Adapter format: PEFT LoRA
  • Hook layer: 1
  • Activation readout layer options in ao_config.json: [9], [18], [27], corresponding to roughly 25/50/75% of model depth
  • Seed: 42
  • Learning rate: 1e-5
  • Train batch size: 16
  • Epochs: 1
  • Paper shorthand token label: ~60M
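The readout-layer options above line up with simple depth fractions of the base model. A small sketch (the 36-decoder-layer count comes from Qwen3-8B's own config, not from this repo):

```python
# Qwen3-8B has 36 decoder layers (per the base model's config.json).
NUM_LAYERS = 36

# ao_config.json offers readout layers at roughly 25/50/75% depth.
readout_layers = [int(NUM_LAYERS * frac) for frac in (0.25, 0.50, 0.75)]
print(readout_layers)  # → [9, 18, 27]
```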

Best-Available Training Mixture

This checkpoint's own ao_config.json lists the following dataset loaders:

  • past_lens single-activation loader: num_train=100000, max_k_activations=1, directions=[past,future], max_length=512
  • past_lens multi-activation loader: num_train=100000, max_k_activations=50, directions=[past,future], max_length=512
  • latentqa: num_train=100000, max_window_size=3, position_types=[all,window]
  • classification families, each with a single-token loader (max_window_size=1) and a multi-token loader (max_window_size=50):
    • geometry_of_truth
    • relations
    • sst2
    • md_gender
    • snli
    • ag_news
    • ner
    • tense
    • language_identification
    • singular_plural (listed with num_train=0 in the checkpoint config)

Documented Aggregate Stats

Comments in Adam's local count_training_data.py document the closest default mixture as:

  • total samples: 1,027,328
  • total tokens: 66,469,521
  • past_lens: 584,488 samples, 42,003,254 tokens
  • classification (all): 378,000 samples, 18,407,443 tokens
  • latentqa (all): 64,840 samples, 6,058,824 tokens
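The per-source rows sum exactly to the stated totals, which is easy to sanity-check:

```python
# Per-source (samples, tokens) counts from count_training_data.py comments.
sources = {
    "past_lens":      (584_488, 42_003_254),
    "classification": (378_000, 18_407_443),
    "latentqa":       (64_840,   6_058_824),
}

total_samples = sum(s for s, _ in sources.values())
total_tokens = sum(t for _, t in sources.values())
print(total_samples, total_tokens)  # → 1027328 66469521
```

The ~66M token total is also consistent with the "~60M" shorthand label above.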

Notes

  • This model card is based on the source repo's ao_config.json plus Adam's local training-data counting script in the Activation Oracles reference code.
  • This repo is a mirror of the adapter weights and config, with only the README replaced.
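For completeness, a minimal loading sketch. The repo id below is this mirror and Qwen/Qwen3-8B is the documented base; the transformers/peft calls are standard, but the activation-oracle hook/readout wiring lives in Adam's reference code and is not reproduced here.

```python
BASE_MODEL = "Qwen/Qwen3-8B"
ADAPTER_REPO = "ceselder/adam-reupload-qwen3-8b-latentqa-cls-past-lens"

def load_oracle_adapter():
    """Load the base model and attach the LoRA adapter (heavy download)."""
    # Deferred imports so the constants above are usable even without
    # transformers/peft installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
    model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)
    model = PeftModel.from_pretrained(model, ADAPTER_REPO)
    return tokenizer, model
```

This only produces the base model with the LoRA applied; querying it as an activation oracle still requires the hook-layer and readout machinery from the reference code.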